March Machine Learning Mania 2016
Kaggle link
Anton Smerdov
April 2016
Timeline
Specifics
• We are predicting the future
• The competition has been held three years in a row
• Randomness has a significant influence
• Any external data may be used
• Little information about tournament games (~2000 rows)
• Feature engineering dominates
Data
• Historical data since 1985: who played whom, when and where
  ~145K rows for the regular season
  ~2K rows for tournaments
• Detailed data since 2003, adding per-game statistics: shots, rebounds, etc.
  ~71K rows for the regular season
  ~850 rows for tournaments
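Elo rating system
• Each team is assigned an initial rating, for example 1500.
• For each team, the expected score of the game is computed (1 for a win, 0 for a loss).
• The rating is then updated from the difference between the actual and the expected score.
• K is the K-factor: the smaller K is, the more conservative the system.
• Corrections can be introduced for playing at home and for the score margin.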
Wiki
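The Elo steps above can be sketched as follows. This is a minimal illustration that assumes the standard formulation described on Wikipedia: expected score E_A = 1 / (1 + 10^((R_B - R_A) / 400)) and update R_A' = R_A + K (S_A - E_A). The function names, the scale of 400, K = 20 and the `winner_home_bonus` parameter are assumptions made for the example, not values taken from the slides.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of team A against team B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_ratings(winner, loser, k=20.0, winner_home_bonus=0.0):
    """Return the new (winner, loser) ratings after one game.

    winner_home_bonus is a hypothetical home-court correction: rating points
    added to the winner only when computing the expectation (use a negative
    value if the loser played at home, 0 for a neutral court).
    """
    exp_win = expected_score(winner + winner_home_bonus, loser)
    shift = k * (1.0 - exp_win)  # actual score of the winner is 1
    return winner + shift, loser - shift


# Toy season: every team starts at 1500, games are (winner, loser) pairs.
ratings = {}
for w, l in [("A", "B"), ("A", "C"), ("B", "C")]:
    r_w, r_l = ratings.get(w, 1500.0), ratings.get(l, 1500.0)
    ratings[w], ratings[l] = update_ratings(r_w, r_l)
```

A smaller `k` moves ratings less per game, which is the "more conservative" behaviour mentioned above. A score-margin correction is not shown here.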
Other rating systems
• Glicko: an improved version of Elo.
• Chessmetrics: simpler, but more sensitive to "rising stars".
• TrueSkill: a rating system from Microsoft.
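Head-to-head history
• Suppose we are given the history of games between two teams.
Introduce for each team a "weight" determined by how recent its wins are:
$w_1 = 0.5 + \sum_n a_1(n)\,\gamma^n, \qquad w_2 = 0.5 + \sum_n a_2(n)\,\gamma^n$
where $a_1(n)$ is the number of wins of the first team over the second $n$ years ago, and $\gamma$ is the decay coefficient.
A prediction can then be made:
$p_1 = \frac{w_1}{w_1 + w_2}, \qquad p_2 = \frac{w_2}{w_1 + w_2}$
For example, if the teams played once this year and never met otherwise, then $w_1 = 1.5$ and $w_2 = 0.5$ for the winner and the loser, so the probability of the winner winning again is estimated at 0.75.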
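The weighting scheme above can be sketched in a few lines. The function names and the value of the decay coefficient (`gamma=0.8`) are assumptions for illustration; the slides do not state which value was used.

```python
def history_weight(wins_by_years_ago, gamma=0.8):
    """Weight of a team: 0.5 plus its wins discounted by age.

    wins_by_years_ago maps n (years ago, 0 = this year) to the number of
    wins over the opponent in that year.
    """
    return 0.5 + sum(count * gamma ** n for n, count in wins_by_years_ago.items())


def history_prediction(wins_1, wins_2, gamma=0.8):
    """Return (p1, p2), the win probabilities implied by the two weights."""
    w1 = history_weight(wins_1, gamma)
    w2 = history_weight(wins_2, gamma)
    return w1 / (w1 + w2), w2 / (w1 + w2)


# The example from the slide: one game this year, no other meetings.
p1, p2 = history_prediction({0: 1}, {})
```

With a single game this year the result does not depend on `gamma`, because the only term is discounted by `gamma ** 0`.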
How to build the dataset
• Raw data cannot be fed to the algorithm as is.

Train (one row per game):
w_team   l_team   w_team features   l_team features   target

Train (each game in both orders):
w_team   l_team   w_team features   l_team features   target = 1
l_team   w_team   l_team features   w_team features   target = 0

Train (with delta features):
w_team features   l_team features   delta features     target = 1
l_team features   w_team features   -delta features    target = 0

Test:
team_1   team_2   delta features     prediction p1
team_2   team_1   -delta features    prediction p2

p1 + p2 is not always equal to 1, for example with xgboost. The predictions can then be renormalized:
$p_1 \leftarrow \frac{p_1}{p_1 + p_2}, \qquad p_2 \leftarrow \frac{p_2}{p_1 + p_2}$
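The row layout and the renormalization above can be sketched as follows. This is a minimal illustration on plain Python lists; the function names are hypothetical and the model that produces the raw p1 and p2 (xgboost in the slides) is left out.

```python
def make_training_rows(games):
    """Turn each game into two mirrored training rows.

    games is a list of (winner_features, loser_features) pairs, where both
    elements are lists of numbers of equal length.
    """
    rows = []
    for w_feat, l_feat in games:
        delta = [w - l for w, l in zip(w_feat, l_feat)]
        # Winner listed first: target 1.
        rows.append((w_feat + l_feat + delta, 1))
        # Same game with the teams swapped and the deltas negated: target 0.
        rows.append((l_feat + w_feat + [-d for d in delta], 0))
    return rows


def normalize_pair(p1, p2):
    """Rescale two raw predictions for the same game so they sum to 1."""
    total = p1 + p2
    return p1 / total, p2 / total


rows = make_training_rows([([3.0, 1.0], [1.0, 2.0])])
n1, n2 = normalize_pair(0.6, 0.5)
```

Mirroring every game keeps the training set balanced between the two classes, and at test time the two orderings of a pair give the raw predictions that `normalize_pair` rescales.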
Regression idea
• Suppose team 1 beat team 2 by a margin of Δ points; then the target variables are +Δ and -Δ respectively. Alternatively, 1 + 0.03·Δ and 0 - 0.03·Δ can be used.
• No information is lost about how much stronger one team turned out to be than the other.
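The two target encodings above can be written down directly. This is a small sketch; the function name and its `scale` parameter are assumptions for illustration, with 0.03 being the coefficient quoted on the slide.

```python
def regression_targets(winner_score, loser_score, scale=None):
    """Return the targets for the (winner first, loser first) rows of one game.

    With scale=None the targets are the signed margin (+delta, -delta).
    With a scale such as 0.03 they are (1 + scale*delta, 0 - scale*delta),
    i.e. the class label shifted by the scaled margin.
    """
    delta = winner_score - loser_score
    if scale is None:
        return delta, -delta
    return 1 + scale * delta, 0 - scale * delta


plain = regression_targets(80, 70)
shifted = regression_targets(80, 70, scale=0.03)
```

For an 80:70 game this gives targets of +10 and -10, or roughly 1.3 and -0.3 in the shifted form.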
Features
History results
• Shooting, rebounding and similar statistics over the last year (moving average)
• Wins, losses and win rate over the last year
• Average score difference per game over the last year
• How many days ago the previous game was played
• How many seasons the team has taken part in the tournament, whether the game is at home, etc.
Stats
Teams achievements
• For example, sum the number of tournament games played over the last N years
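A few of the features listed above can be computed as in the sketch below. The input layout (a list of `(date, points_scored, points_allowed)` tuples per team), the function name and the 365-day window are assumptions made for the example, not the author's actual pipeline.

```python
from datetime import date


def team_features(games, today, window_days=365):
    """Compute simple history features for one team as of `today`.

    games is a list of (game_date, points_scored, points_allowed) tuples.
    Only games strictly before `today` are used, so nothing leaks from the
    game being predicted.
    """
    past = [g for g in games if g[0] < today]
    recent = [g for g in past if (today - g[0]).days <= window_days]
    wins = sum(1 for _, scored, allowed in recent if scored > allowed)
    n = len(recent)
    return {
        "wins": wins,
        "losses": n - wins,
        "winrate": wins / n if n else None,
        "avg_margin": sum(s - a for _, s, a in recent) / n if n else None,
        "days_since_prev": (today - max(g[0] for g in past)).days if past else None,
    }


games_a = [
    (date(2016, 1, 10), 80, 70),
    (date(2016, 2, 1), 60, 65),
    (date(2014, 3, 1), 90, 50),  # older than the window, ignored for averages
]
features_a = team_features(games_a, date(2016, 3, 1))
```

The same function would be called once per team and per game date, and the two resulting feature sets can then be differenced to obtain the delta features.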
Additional data
• Massey ordinals: team rankings in various systems (132 unique) since 2003
• Kenpom data: additional data for all teams since 2002
• Teams coaches: information about the coach of each team
• Tourney seeds: data on the teams' seed numbers
• Geography data: where the games were played
Idea of different datasets
[Diagram: feature sources grouped by data availability (2003+ or 1985+): Massey ordinals, Kenpom data, Teams coaches, Geography data, Teams Achievements, Coaches Achievements, Stats 1, Stats 2, Tourney seeds, Elo, Glicko, History. They feed Dataset 1 (2003+), Dataset 2 (1985+), Tourney Dataset 1 (2003+) and Tourney Dataset 2 (1985+).]
[Diagram: model pipeline with stages Level 0, Level 1, Level 2, Level 3 and Final. Blocks: the feature sources and four datasets above; XGB Level 1; XGB Regression; Elo, Glicko and History; XGB Level 2; Logistic Regression; Elo predict, Glicko predict and History predict; Net Prophets Entry (marked "+?"); x4 blending; Prediction.]
Most important features
• Geographic data
• Elo
• Δ (score)
• Prediction from head-to-head history
• Ratings from some of the systems
Analysis of participants' predictions
Ideas for the future
1. Add new information (data on betting odds, players)
2. Use other algorithms: NeuralNets, KNN
3. Instead of logloss, one can optimize the expected prize winnings or the place on the leaderboard
   • Analyze the predictions of other participants
4. Devise a method for simulating the tournament
   • It would help obtain more data
   • It would be useful for analyzing other participants' predictions
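For reference on item 3, logloss can be computed as in the minimal sketch below. The function name and the clipping constant `eps` are assumptions for the example; the definition itself is the usual binary log loss.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary log loss; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


coin_flip = log_loss([1, 0], [0.5, 0.5])
confident_miss = log_loss([0], [0.99])
cautious_miss = log_loss([0], [0.6])
```

A confident wrong prediction is penalized much more heavily than a cautious one, which is why a submission tuned for expected prize money or leaderboard place may differ from one tuned for logloss alone.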
