Hello everyone! The time has come to replenish our arsenal of algorithms.
Today we will thoroughly analyze one of the most popular and most widely applied machine learning algorithms: gradient boosting. Where the roots of boosting come from and what actually happens under the hood of the algorithm is the subject of our colorful journey into the world of boosting, below the cut.
UPD: the course is now in English under the mlcourse.ai brand, with articles on Medium and materials on Kaggle (Dataset) and GitHub.
A video of the lecture based on this article was recorded as part of the second run of the open course (September to November 2017).
Outline of this article:
- Introduction and history of boosting
- The GBM algorithm
- Loss functions
- Summary of GBM theory
- Homework
- Useful links
1. Introduction and history of boosting
Almost everyone involved in data analysis has heard of boosting at least once. The algorithm belongs to the "gentleman's set" of models worth trying on the next task. Xgboost is often associated with the standard recipe for winning ML competitions and has even spawned a meme about "stacking xgboosts". Boosting is also an important part of most search engines. For general background, let's look at how boosting appeared and developed.
History of boosting
It all began with the question of whether one strong model can be obtained from a large number of relatively weak and simple models. By weak models we do not mean only small, simple models such as decision trees, as opposed to "stronger" models such as neural networks. In our case a weak model is any machine learning algorithm whose accuracy may be only slightly better than random guessing.
An affirmative mathematical answer to this question was found fairly quickly, which was an important theoretical result in itself (a rarity in ML). However, it took several years before workable algorithms such as Adaboost appeared. Their general approach was to greedily build a linear combination of simple models (base algorithms) by reweighting the input data. Each subsequent model (usually a decision tree) was built so as to give more weight and priority to observations that had previously been predicted incorrectly.
More on Adaboost
Following the example of most other machine learning courses, we ought to study Adaboost, the predecessor of gradient boosting, carefully before moving on. However, since Adaboost turned out, after its merger with GBM, to be just a particular variation of it, we decided to go straight to the most interesting part.
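The algorithm itself has a very clear visual interpretation and a clear intuition behind the weighting of observations. Consider a toy classification example in which, at each Adaboost iteration, we split the data with a tree of depth 1 (a so-called "stump"). In the first two iterations we see the following picture: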
The size of a point corresponds to the weight it received for being predicted incorrectly. We can see these weights grow at each iteration: stumps alone cannot cope with such a task. However, a weighted vote of the previously built stumps gives the separation we are after:
A more detailed example of Adaboost at work shows how, over a series of iterations, the weights of the points grow steadily, especially at the boundary between the classes:
Adaboost worked well, but because there was little justification for why the algorithm worked, all sorts of speculation grew up around it: some considered it a super-algorithm and a magic bullet, while others were skeptical and shared the opinion that it was a barely applicable approach prone to severe overfitting. This was especially true of its applicability to data with strong outliers, on which Adaboost proved unstable. Fortunately, the professors of the Stanford statistics department, who had already brought the world Lasso, Elastic Net and Random Forest, got down to business, and in 1999 Jerome Friedman came up with a generalization of boosting algorithms: gradient boosting, also known as Gradient Boosting (Machine), also known as GBM. With this work Friedman laid the statistical foundation for creating many algorithms at once, providing a general approach to boosting as optimization in function space.
Stanford legacy
In general, the team of the Stanford statistics department was involved in CART, the bootstrap and many other things, and entered their names in future statistics textbooks well in advance. By and large, a significant part of our everyday toolkit appeared there, and who knows what else will appear. Or perhaps it has already appeared but has not yet become widespread (glinternet, for example).
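There are not many videos of Friedman himself. However, there is a very interesting interview with him about the creation of CART and, more generally, about how statistical problems were solved more than 40 years ago (problems that today would be attributed to data analysis and data science):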
From the series on the instructive history of data analysis, there is also a lecture by Hastie: a retrospective of data analysis from one of the participants who created the methods we use every day:
In essence, there was a transition from engineering and algorithmic research in building algorithms (so characteristic of ML) to a full-fledged methodology for building and studying such algorithms. From the point of view of the mathematical machinery, at first glance not much changed: we still add (boost) weak algorithms, building up our ensemble with gradual improvements on those parts of the data where previous models "underperformed". But the next simple model is now built not merely on reweighted observations, but so as to best approximate the overall gradient of the objective function. At the conceptual level this gave a great deal of room for imagination and extensions.
History of GBM
Gradient boosting did not take its place in the "gentleman's set" right away: it took more than 10 years after its appearance. First, the basic GBM acquired many extensions for different statistical tasks: GLMboost and GAMboost as a strengthening of existing GAM models, CoxBoost for survival curves, RankBoost and LambdaMART for ranking. Second, many implementations of the same GBM appeared under different names and on different platforms: Stochastic GBM, GBDT (Gradient Boosted Decision Trees), GBRT (Gradient Boosted Regression Trees), MART (Multiple Additive Regression Trees), GBM as Generalised Boosting Machines, and others. On top of that, the machine learning communities were very fragmented and worked on everything at once, so tracing the successes of boosting is quite difficult.
At the same time, boosting began to be used actively for ranking search engine results. That problem was written out in terms of a loss function that penalizes errors in the order of the results, so it became convenient to simply plug it into GBM. AltaVista was one of the first companies to introduce boosting into ranking, and Yahoo, Yandex, Bing and others soon followed. Moreover, once implemented, boosting became the main algorithm inside production engines for years to come, rather than yet another interchangeable research technique living in a couple of scientific papers.
ML competitions, especially Kaggle, played a major role in popularizing boosting. Researchers had long lacked a common platform with enough participants and tasks, where people could compete in an open contest for the state of the art with their algorithms and approaches. The gloomy German geniuses who had grown yet another miracle algorithm in the garage could no longer blame closed data, while truly breakthrough libraries, on the contrary, got an excellent platform for development. This is exactly what happened with boosting, which took root on Kaggle almost immediately (look for GBM in interviews with winners from 2011 onwards), and xgboost as a library quickly gained popularity after its appearance. At the same time, xgboost is not a new and unique algorithm, but simply an extremely efficient implementation of the classic GBM with additional heuristics.
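And so here we are in 2017, using an algorithm that has travelled a very typical path for ML: from a mathematical problem and algorithmic craft, through the emergence of a proper algorithm and a proper methodology, to successful practical applications and mass use years after its appearance.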
2. The GBM algorithm
ML problem statement
We will solve the problem of function approximation in the general setting of supervised learning. We have a set of pairs of features $x$ and target variables $y$, $\{(x_i, y_i)\}_{i=1,\ldots,n}$, from which we want to restore a dependence of the form $y = f(x)$. We restore it with an approximation $\hat{f}(x)$, and to understand which approximation is better we also have a loss function $L(y, f)$, which we will minimize:
$$y \approx \hat{f}(x), \quad \hat{f}(x) = \underset{f(x)}{\arg\min} \ L(y, f(x))$$
So far we make no assumptions about the type of the dependence $f(x)$, about the model of our approximation $\hat{f}(x)$, or about the distribution of the target variable $y$. We only require that the function $L(y, f)$ be differentiable. Since the problem has to be solved not on all the data in the world but only on the data at our disposal, we rewrite everything in terms of expectations. That is, we look for the approximation $\hat{f}(x)$ that minimizes the loss function on average over the data we have:
$$\hat{f}(x) = \underset{f(x)}{\arg\min} \ \mathbb{E}_{x,y}[L(y, f(x))]$$
Unfortunately, there are not just many functions $f(x)$ in the world: their function space is infinite-dimensional. So, to solve the problem at all, machine learning usually restricts the search space to some parameterized family of functions $f(x, \theta), \ \theta \in \mathbb{R}^d$. This simplifies the problem considerably, because it reduces to a perfectly solvable optimization over parameter values:
$$\hat{f}(x) = f(x, \hat{\theta}), \quad \hat{\theta} = \underset{\theta}{\arg\min} \ \mathbb{E}_{x,y}[L(y, f(x, \theta))]$$
Analytic one-line solutions for the optimal parameters $\hat{\theta}$ exist quite rarely, so the parameters are usually approximated iteratively. To begin with, we write down the empirical loss function $L_{\theta}(\hat{\theta})$, which shows how well we have estimated them on the data we have. We also write our approximation $\hat{\theta}$ after $M$ iterations as a sum (both for clarity and to start getting used to boosting):
$$\hat{\theta} = \sum_{i=1}^{M} \hat{\theta}_i, \quad L_{\theta}(\hat{\theta}) = \sum_{i=1}^{n} L(y_i, f(x_i, \hat{\theta}))$$
All that remains is to pick a suitable iterative algorithm to minimize $L_{\theta}(\hat{\theta})$. The simplest and most frequently used option is gradient descent. For it we need to write out the gradient $\nabla L_{\theta}(\hat{\theta})$ and add our iterative estimates $\hat{\theta}_i$ along it (with a minus sign, since we want to decrease the error, not increase it). That is all: we only need to initialize the first approximation $\hat{\theta}_0$ somehow and choose how many iterations $M$ to run the procedure for. In our memory-inefficient form of storing the approximation $\hat{\theta}$, the naive algorithm looks as follows:
- Initialize the parameter approximation: $\hat{\theta} = \hat{\theta}_0$
- For each iteration $t = 1, \ldots, M$ repeat:
  - Compute the gradient of the loss function at the current approximation $\hat{\theta}$: $\nabla L_{\theta}(\hat{\theta}) = \left[\frac{\partial L(y, f(x, \theta))}{\partial \theta}\right]_{\theta = \hat{\theta}}$
  - Set the current iterative approximation $\hat{\theta}_t$ from the computed gradient: $\hat{\theta}_t \leftarrow -\nabla L_{\theta}(\hat{\theta})$
  - Update the parameter approximation: $\hat{\theta} \leftarrow \hat{\theta} + \hat{\theta}_t = \sum_{i=0}^{t} \hat{\theta}_i$
- Save the final approximation: $\hat{\theta} = \sum_{i=0}^{M} \hat{\theta}_i$
- Use the function found, $\hat{f}(x) = f(x, \hat{\theta})$, as intended
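The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the linear model `f(x, theta) = theta[0] + theta[1] * x`, the mean squared error, the fixed `step` multiplier on the antigradient and the synthetic data are all choices made here, not part of the article.

```python
import numpy as np

def loss_gradient(theta, x, y):
    """Gradient of the mean squared error of f(x, theta) = theta[0] + theta[1] * x."""
    residual = y - (theta[0] + theta[1] * x)
    return np.array([-2.0 * residual.mean(), -2.0 * (residual * x).mean()])

def fit_by_increments(x, y, n_iter=500, step=0.1):
    theta_hat = np.zeros(2)              # initial approximation theta_0
    increments = [theta_hat.copy()]      # memory-hungry storage of every theta_i
    for _ in range(n_iter):
        # Increment along the antigradient (the step size is an assumption).
        theta_t = -step * loss_gradient(theta_hat, x, y)
        theta_hat = theta_hat + theta_t  # theta_hat <- theta_hat + theta_t
        increments.append(theta_t)
    return theta_hat, increments

# Hypothetical data generated from theta = (1, 2) plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, 200)

theta_hat, increments = fit_by_increments(x, y)
print(theta_hat)                         # close to the generating parameters
print(np.sum(increments, axis=0))        # the same vector, as a sum of increments
```

Keeping every increment is wasteful, but it makes the "approximation as a sum" view explicit, which is exactly the form boosting uses.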
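Functional gradient descent
Let's expand our minds: imagine for a moment that we can perform the optimization in function space and iteratively search for the approximations $\hat{f}(x)$ as functions themselves. We write our approximation as a sum of incremental improvements, each of which is a function. For convenience, we start the sum right away from the initial approximation $\hat{f}_0(x)$:
$$\hat{f}(x) = \sum_{i=0}^{M} \hat{f}_i(x)$$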
No magic has happened yet. We have simply decided to look for our approximation $\hat{f}(x)$ not as one big model with lots of parameters (such as a neural network), but as a sum of functions, pretending that we are moving in function space.
To solve this problem we still need to restrict the search to some family of functions $\hat{f}(x) = h(x, \theta)$. But, first, a sum of models can be more complex than any single model from that family (the sum of two depth-1 stumps can no longer be represented by one stump). Second, the overall problem is still posed in function space. Let's also take into account that at each step we need to choose an optimal coefficient $\rho \in \mathbb{R}$ for the function. For step $t$ the problem is:
$$\hat{f}(x) = \sum_{i=0}^{t-1} \hat{f}_i(x), \quad (\rho_t, \theta_t) = \underset{\rho, \theta}{\arg\min} \ \mathbb{E}_{x,y}[L(y, \hat{f}(x) + \rho \cdot h(x, \theta))], \quad \hat{f}_t(x) = \rho_t \cdot h(x, \theta_t)$$
And now it is time for magic. We have written out all our tasks in general form, as if we could train any model $h(x, \theta)$ with respect to any loss function $L(y, f(x, \theta))$. In practice this is extremely difficult, so a simple way was devised to reduce the problem to something solvable.
Knowing the expression for the gradient of the loss function, we can compute its values on our data. So let's train the models so that their predictions are as correlated as possible with this gradient (taken with a minus sign). In other words, we solve a least-squares regression problem, trying to fit the predictions to these residuals. For classification, for regression and for ranking alike, under the hood we always minimize the squared difference between the pseudo-residuals $r$ and our predictions. For step $t$ the final problem looks as follows:
$$\hat{f}(x) = \sum_{i=0}^{t-1} \hat{f}_i(x),$$
$$r_{it} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = \hat{f}(x)}, \quad \text{for } i = 1, \ldots, n,$$
$$\theta_t = \underset{\theta}{\arg\min} \ \sum_{i=1}^{n} (r_{it} - h(x_i, \theta))^2,$$
$$\rho_t = \underset{\rho}{\arg\min} \ \sum_{i=1}^{n} L(y_i, \hat{f}(x_i) + \rho \cdot h(x_i, \theta_t))$$
Friedman's classic GBM algorithm
We now have everything needed to finally write out the GBM algorithm proposed by Jerome Friedman in 1999. We are still solving the general supervised learning problem. As input to the algorithm we need to gather several components:
- a dataset $\{(x_i, y_i)\}_{i=1,\ldots,n}$;
- the number of iterations $M$;
- a choice of loss function $L(y, f)$ with its gradient written out;
- a choice of the family of base algorithm functions $h(x, \theta)$, together with the procedure for training them;
- additional hyperparameters of $h(x, \theta)$, for example the tree depth for decision trees.
The only thing left unattended is the initial approximation $f_0(x)$. For simplicity, a constant value $\gamma$ is used as the initialization. Like the optimal coefficient $\rho$, it is found by binary search or by another line search algorithm with respect to the original loss function (not its gradient). So, the GBM algorithm:
- Initialize GBM with a constant value $\hat{f}(x) = \hat{f}_0, \ \hat{f}_0 = \gamma, \ \gamma \in \mathbb{R}$: $\hat{f}_0 = \underset{\gamma}{\arg\min} \ \sum_{i=1}^{n} L(y_i, \gamma)$
- For each iteration $t = 1, \ldots, M$ repeat:
  - Compute the pseudo-residuals $r_t$: $r_{it} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = \hat{f}(x)}, \quad \text{for } i = 1, \ldots, n$
  - Build a new base algorithm $h_t(x)$ as a regression on the pseudo-residuals $\{(x_i, r_{it})\}_{i=1,\ldots,n}$
  - Find the optimal coefficient $\rho_t$ for $h_t(x)$ with respect to the original loss function: $\rho_t = \underset{\rho}{\arg\min} \ \sum_{i=1}^{n} L(y_i, \hat{f}(x_i) + \rho \cdot h_t(x_i))$
  - Save $\hat{f}_t(x) = \rho_t \cdot h_t(x)$
  - Update the current approximation: $\hat{f}(x) \leftarrow \hat{f}(x) + \hat{f}_t(x) = \sum_{i=0}^{t} \hat{f}_i(x)$
- Assemble the final GBM model: $\hat{f}(x) = \sum_{i=0}^{M} \hat{f}_i(x)$
- Conquer Kaggle and the world with the trained model (working out how to make predictions with it is left to you)
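The algorithm above can be written down almost line for line. The sketch below is an illustration, not Friedman's reference implementation: the base learner is a hand-written one-feature stump (`Stump`), the line search is a crude grid search (`line_search`), and the data, grids and function names are assumptions introduced here.

```python
import numpy as np

class Stump:
    """Depth-1 regression tree on one feature: a stand-in for h(x, theta)."""

    def fit(self, x, r):
        order = np.argsort(x)
        xs, rs = x[order], r[order]
        n = len(rs)
        left_n = np.arange(1, n)
        left_sum = np.cumsum(rs)[:-1]
        right_sum = rs.sum() - left_sum
        # Minimising squared error is equivalent to maximising this score.
        score = left_sum ** 2 / left_n + right_sum ** 2 / (n - left_n)
        score[xs[1:] == xs[:-1]] = -np.inf   # no threshold between equal x
        i = int(np.argmax(score))
        self.threshold = 0.5 * (xs[i] + xs[i + 1])
        self.left_value = left_sum[i] / left_n[i]
        self.right_value = right_sum[i] / (n - left_n[i])
        return self

    def predict(self, x):
        return np.where(x <= self.threshold, self.left_value, self.right_value)

def line_search(objective, grid):
    """Crude line search: the grid value with the smallest objective."""
    return grid[int(np.argmin([objective(v) for v in grid]))]

def gbm_fit(x, y, loss, neg_gradient, n_iter):
    f0 = line_search(lambda g: loss(y, g).sum(),
                     np.linspace(y.min(), y.max(), 201))
    f_hat = np.full(len(y), f0)              # constant initialisation gamma
    ensemble = []
    for _ in range(n_iter):
        r = neg_gradient(y, f_hat)           # pseudo-residuals r_t
        h = Stump().fit(x, r)                # regression on (x_i, r_it)
        h_x = h.predict(x)
        rho = line_search(lambda p: loss(y, f_hat + p * h_x).sum(),
                          np.linspace(0.0, 2.0, 201))
        f_hat = f_hat + rho * h_x            # f_hat <- f_hat + rho_t * h_t
        ensemble.append((rho, h))
    return f0, ensemble

def gbm_predict(x, f0, ensemble):
    return f0 + sum(rho * h.predict(x) for rho, h in ensemble)

# Hypothetical usage with the squared error and its negative gradient.
rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, 300)
y = np.sin(x) + rng.normal(0.0, 0.3, 300)
f0, ensemble = gbm_fit(x, y,
                       loss=lambda y, f: (y - f) ** 2,
                       neg_gradient=lambda y, f: 2.0 * (y - f),
                       n_iter=20)
print(np.mean((y - f0) ** 2), np.mean((y - gbm_predict(x, f0, ensemble)) ** 2))
```

Only `loss` and `neg_gradient` depend on the task, which is the point of the methodology: swapping them changes what is optimized without touching the loop.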
A step-by-step example of GBM
Let's try to understand how GBM works using a toy example. We will use it to restore a noisy function $y = \cos(x) + \epsilon, \ \epsilon \sim \mathcal{N}(0, \frac{1}{5}), \ x \in [-5, 5]$.
This is a regression problem with a real-valued target, so we will use the squared error loss function. We generate 300 pairs of observations ourselves and approximate them with decision trees of depth 2. Let's gather everything we need to use GBM:
- toy data $\{(x_i, y_i)\}_{i=1,\ldots,300}$ ✓
- number of iterations $M = 3$ ✓;
- squared error loss function $L(y, f) = (y - f)^2$ ✓
  up to a constant factor, the negative gradient of the $L_2$ loss is simply the residual $r = (y - f)$ ✓;
- decision trees as the base algorithms $h(x)$ ✓;
- decision tree hyperparameters: the tree depth is 2 ✓.
With the squared error, both the initialization $\gamma$ and the coefficients $\rho_t$ are simple. Namely, we initialize GBM with the mean value $\gamma = \frac{1}{n} \cdot \sum_{i=1}^{n} y_i$, and all $\rho_t$ are equal to 1.
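A minimal sketch of this toy setup follows. It assumes that $\frac{1}{5}$ is the noise variance, uses a hand-written depth-limited tree (`fit_tree`, `predict_tree`) instead of a library one, and draws its own random sample, so the numbers will not match the article's plots.

```python
import numpy as np

def fit_tree(x, r, depth):
    """Least-squares regression tree on one feature, as nested tuples."""
    if depth == 0 or np.unique(x).size < 2:
        return float(r.mean())                      # leaf: mean residual
    order = np.argsort(x)
    xs, rs = x[order], r[order]
    n = len(rs)
    left_n = np.arange(1, n)
    left_sum = np.cumsum(rs)[:-1]
    score = left_sum ** 2 / left_n + (rs.sum() - left_sum) ** 2 / (n - left_n)
    score[xs[1:] == xs[:-1]] = -np.inf              # cannot split equal x
    i = int(np.argmax(score))
    threshold = 0.5 * (xs[i] + xs[i + 1])
    left = x <= xs[i]
    return (threshold,
            fit_tree(x[left], r[left], depth - 1),
            fit_tree(x[~left], r[~left], depth - 1))

def predict_tree(tree, x):
    if not isinstance(tree, tuple):
        return np.full(len(x), tree)
    threshold, left, right = tree
    return np.where(x <= threshold,
                    predict_tree(left, x), predict_tree(right, x))

rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, 300)
# Assumption: 1/5 is read as the variance of the Gaussian noise.
y = np.cos(x) + rng.normal(0.0, np.sqrt(0.2), 300)

f_hat = np.full(len(y), y.mean())           # gamma = mean of y
errors = [np.mean((y - f_hat) ** 2)]
for t in range(3):                          # M = 3
    r = y - f_hat                           # pseudo-residuals for squared error
    tree = fit_tree(x, r, depth=2)          # depth-2 tree on the residuals
    f_hat = f_hat + predict_tree(tree, x)   # rho_t = 1
    errors.append(np.mean((y - f_hat) ** 2))
print(errors)                               # training error after each step
```

Each tree is fitted to what the current ensemble still gets wrong, so the training error shrinks with every iteration.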
Let's run GBM and draw two kinds of graphs: the current approximation $\hat{f}(x)$ (blue graph) and each tree $\hat{f}_t(x)$ built on its pseudo-residuals (green graph). The number of the graph corresponds to the iteration number:
Note that by the second iteration our trees have reproduced the main shape of the function. At the first iteration, however, the algorithm built only the "left branch" of the function ($x \in [-5, -4]$). This happened simply because our trees did not have enough depth to build a symmetric branch right away, and the error on the left branch was larger.
For a further look at how GBM works, see the visualization on the Brilliantly wrong blog:
http://arogozhnikov.imtqy.com/2016/06/24/gradient_boosting_explained.html
3. Loss functions
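What changes if we want to solve something other than ordinary regression? The answer depends on the target variable $y$ and on the choice of the loss function $L(y, f)$.
We consider the two most common cases: $y \in \mathbb{R}$ and $y \in \{-1, 1\}$.
Regression loss functions
For $y \in \mathbb{R}$ the choice depends on which property of the conditional distribution $(y|x)$ we want to restore. The most common options are: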
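- $L(y, f) = (y - f)^2$, a.k.a. $L_2$ loss, a.k.a. Gaussian loss. It targets the conditional mean.
- $L(y, f) = |y - f|$, a.k.a. $L_1$ loss, a.k.a. Laplacian loss. It targets the conditional median.
- $L(y, f) = \begin{cases} \alpha \cdot (y - f), & \text{if } y > f \\ (1 - \alpha) \cdot (f - y), & \text{if } y \leq f \end{cases}$, a.k.a. $L_q$ loss, a.k.a. Quantile loss. It targets a conditional quantile other than the median of $L_1$, for example the 75% quantile with $\alpha = 0.75$.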
Let's see how $L_q$ works in practice by restoring the 75% quantile. We gather what we need:
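- toy data $\{(x_i, y_i)\}_{i=1,\ldots,300}$ ✓
- number of iterations $M = 3$ ✓;
- the quantile loss function as the loss ✓;
- the gradient of $L_{0.75}(y, f)$ is a function weighted by $\alpha = 0.75$; the pseudo-residuals are: $r_i = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = \hat{f}(x)} = \alpha I(y_i > \hat{f}(x_i)) - (1 - \alpha) I(y_i \leq \hat{f}(x_i)), \quad \text{for } i = 1, \ldots, 300$ ✓;
- decision trees as the base algorithms $h(x)$ ✓;
- decision tree hyperparameters: the tree depth is 2 ✓;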
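As the initial approximation we take the required quantile of $y$, and the coefficients $\rho_t$ are found by line search. The result is: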
It is unusual to see that we are in effect training something quite different from ordinary residuals: at each iteration $r_i$ takes only two possible values. Nevertheless, the result of GBM is quite similar to our original function.
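The two-valued pseudo-residuals are easy to check numerically. The sketch below assumes a hypothetical standard normal sample and a grid search for the best constant; only the loss and residual formulas come from the text above.

```python
import numpy as np

def quantile_loss(y, f, alpha):
    """alpha * (y - f) when y > f, (1 - alpha) * (f - y) otherwise."""
    diff = y - f
    return np.where(diff > 0, alpha * diff, (alpha - 1.0) * diff)

def quantile_pseudo_residuals(y, f_hat, alpha):
    """Negative gradient of the quantile loss: only two possible values."""
    return np.where(y > f_hat, alpha, -(1.0 - alpha))

rng = np.random.default_rng(0)
y = rng.normal(0.0, 1.0, 1000)            # hypothetical sample
alpha = 0.75

# The best constant under this loss sits at the empirical alpha-quantile.
grid = np.linspace(-3.0, 3.0, 601)
losses = [quantile_loss(y, g, alpha).sum() for g in grid]
best_constant = grid[int(np.argmin(losses))]

residuals = quantile_pseudo_residuals(y, best_constant, alpha)
print(best_constant, np.quantile(y, alpha), np.unique(residuals))
```

So a base learner trained on these residuals only ever sees the labels `alpha` and `-(1 - alpha)`, which is closer to a classification target than to a regression one.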
If we leave the algorithm to train on this toy example, we get nearly the same result as with the quadratic loss function, shifted by $\approx 0.135$. But if we were looking for quantiles above 90%, computational difficulties could arise. Namely, if the share of points above the required quantile is too small (as with imbalanced classes), the model will not be able to learn well. Such nuances are worth bearing in mind when solving non-standard tasks.
A bit more on regression loss functions
Many loss functions have been developed for regression problems, including ones with additional robustness properties. One such example is the Huber loss function. Its essence is that for small deviations it works like $L_2$, and beyond a given threshold it starts to work like $L_1$. This reduces the contribution of outliers, and of the quadratically large errors they cause, to the overall shape of the function, without focusing on small inaccuracies and deviations.
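For reference, here is a sketch of the Huber loss in its common textbook form. The threshold name `delta` and the factor of one half in the quadratic branch are conventions assumed here; the article does not spell out a formula.

```python
import numpy as np

def huber_loss(y, f, delta):
    """Quadratic for |y - f| <= delta, linear beyond the threshold."""
    diff = np.abs(y - f)
    return np.where(diff <= delta,
                    0.5 * diff ** 2,
                    delta * (diff - 0.5 * delta))

def huber_neg_gradient(y, f, delta):
    """Pseudo-residuals: plain residuals, clipped to [-delta, delta]."""
    return np.clip(y - f, -delta, delta)

y = np.array([0.5, 3.0, -3.0])
print(huber_loss(y, 0.0, delta=1.0))          # small error vs. two outliers
print(huber_neg_gradient(y, 0.0, delta=1.0))  # outliers contribute at most delta
```

The clipping in the gradient is what limits the pull of an outlier on the next base learner, whereas under $L_2$ the pull grows with the size of the error.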
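We can see how this loss function works on the following toy example. As a basis we take data generated from the function $y = \frac{\sin(x)}{x}$, to which special noise is added: a mixture of a Gaussian and a Bernoulli distribution, the latter acting as a one-sided outlier generator. The loss functions themselves are shown in plots A-D, and the corresponding GBMs in plots F-H (plot E shows the original function).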
In this example splines were used as the base algorithm, for visual clarity. After all, we have already said that you can boost more than just trees.
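In the example, because of the artificially created noise, the difference between $L_2$, $L_1$ and the Huber loss is quite noticeable. With a proper choice of the Huber loss parameter we can even obtain the best approximation of the function among our options. The difference between conditional quantiles is also clearly visible in this example (10%, 50% and 90% in our case).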
Unfortunately, the Huber loss function is not implemented in all modern libraries (at the time of writing it is implemented in h2o, but not yet in xgboost). The same applies to other interesting loss functions, including exotic ones such as conditional expectiles. In general, though, it is very useful to know that such options exist and can be used.
Classification loss functions
Now let's look at binary classification, $y \in \{-1, 1\}$. We have already seen that GBM can optimize even loss functions that are barely differentiable. In principle we could try to solve this case without much thought, as a regression problem with some $L_2$ loss, but that would not be quite right (though it is possible).
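Because the distribution of the target variable is fundamentally different, we will predict and optimize not the class labels themselves but their log-likelihood. To do this, we reformulate the loss functions over the product of the prediction and the true label, $y \cdot f$ (it is no accident that we chose labels of different signs). The best-known variants of such classification loss functions are: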
- $L(y, f) = \log(1 + \exp(-2yf))$, a.k.a. Logistic loss, a.k.a. Bernoulli loss. An interesting property is that it penalizes even correctly predicted class labels. This is not a bug. On the contrary, by optimizing this loss function we can keep "pushing apart" the classes and improving the classifier even when all observations are already predicted correctly. It is the most standard and most commonly used loss function in binary classification.
- $L(y, f) = \exp(-yf)$, a.k.a. Adaboost loss. It so happens that the classic Adaboost algorithm is equivalent to GBM with this loss function. Conceptually it is very similar to Logistic loss, but it has a harsher exponential penalty for classification errors and is used less often.
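Both losses depend only on the margin $y \cdot f$, which makes them easy to compare side by side. The margins below are arbitrary illustrative values, not data from the article.

```python
import numpy as np

def logistic_loss(y, f):
    return np.log1p(np.exp(-2.0 * y * f))

def exponential_loss(y, f):
    return np.exp(-y * f)

# Margins y * f for a positive label: wrong, undecided, correct.
margins = np.array([-2.0, 0.0, 2.0])
logistic_values = logistic_loss(1.0, margins)
exponential_values = exponential_loss(1.0, margins)

print(logistic_values)      # still positive at margin +2
print(exponential_values)   # grows much faster for the negative margin
```

The first line shows the "penalty for correct predictions" property: the loss at a positive margin is small but not zero, so there is always a gradient pushing the classes further apart.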
Let's generate new toy data for the classification problem. We take our noisy cosine as the basis and use the sign function to obtain the classes of the target variable. Our new data look as follows (jitter noise is added for clarity):
We will use Logistic loss to see what it is that we are actually boosting. As before, we gather what we will be solving the problem with:
- toy data $\{(x_i, y_i)\}_{i=1,\ldots,300}, \ y_i \in \{-1, 1\}$ ✓
- number of iterations $M = 3$ ✓;
- Logistic loss as the loss function; its pseudo-residuals (the negative gradient) are computed as $r_i = \frac{2 \cdot y_i}{1 + \exp(2 \cdot y_i \cdot \hat{f}(x_i))}, \quad \text{for } i = 1, \ldots, 300$ ✓;
- decision trees as the base algorithms $h(x)$ ✓;
- decision tree hyperparameters: the tree depth is 2 ✓.
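This time the initialization of the algorithm is a little more complicated. First, our classes are imbalanced and split in a ratio of roughly 63% to 37%. Second, we do not rely on an analytic formula for the initialization under our loss function. So we will find $\hat{f}_0 = \gamma$ by search: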
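The optimal initial approximation was found in the region of -0.273. One could have guessed that it would be negative (it pays to predict everyone as the most popular class), but, as already mentioned, we did not use a formula for the exact value. And now, at last, let's run GBM and see what is actually happening under its hood: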
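The search for the initial constant can be sketched as a one-dimensional minimization. The labels below are hypothetical: 110 positives and 190 negatives were chosen here to mimic a roughly 37% / 63% split, and `fit_constant` is a plain ternary search written for this illustration.

```python
import numpy as np

def logistic_loss(y, f):
    return np.log1p(np.exp(-2.0 * y * f))

def fit_constant(y, lo=-5.0, hi=5.0, n_steps=100):
    """Ternary search for the constant gamma minimising the convex total loss."""
    for _ in range(n_steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if logistic_loss(y, a).sum() < logistic_loss(y, b).sum():
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

# Hypothetical labels with the negative class in the majority.
y = np.concatenate([np.ones(110), -np.ones(190)])
gamma = fit_constant(y)
print(gamma)
```

As a sanity check for this particular loss, setting the derivative of the total loss with respect to $\gamma$ to zero gives $\gamma = \frac{1}{2} \ln \frac{n_{+}}{n_{-}}$, where $n_{+}$ and $n_{-}$ are the class counts. With the counts assumed above, the search lands on the same value, about -0.273.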
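The algorithm worked successfully and restored the separation of our classes. We can see how the "lower" regions separate off, where the trees are more confident in correctly predicting the negative class, and how two steps form where the classes are mixed. The pseudo-residuals show that we have quite a lot of correctly classified observations and a certain number of observations with large errors caused by the noise in the data. This is what GBM actually predicts in a classification problem: a regression on the pseudo-residuals of the logistic loss function.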
Weights
Sometimes we want a more specific loss function for our task. For example, in forecasting financial series we may want to give more weight to large movements of the time series, and in a churn prediction task it is more useful to predict the churn of clients with a high LTV (lifetime value: how much money a client will bring us in the future).
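The true way of the statistical warrior is to come up with your own loss function, write out its derivative (and, for more efficient training, its Hessian), and carefully check whether the function satisfies the required properties. But there is a high probability of making a mistake somewhere, running into computational difficulties, and generally spending an unacceptably long time on research.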
Instead, a very simple tool was invented that is rarely remembered in practice: weighting observations and assigning weight functions. The simplest example of such weighting is setting weights to balance classes. In general, if we know that some subset of the data, both in the input variables $x$ and in the target variable $y$, is more important for our model, we simply assign it a larger weight $w(x, y)$. The main thing is to satisfy the general requirements for reasonable weights:
$$w_i \in \mathbb{R}, \quad w_i \geq 0 \quad \text{for } i = 1, \ldots, n, \quad \sum_{i=1}^{n} w_i > 0$$
Weights can significantly reduce the time spent adjusting the loss function itself to the problem being solved, and they encourage experiments with the target properties of the models. How exactly to set these weights is entirely a creative task. From the point of view of the GBM algorithm and of optimization, we simply add scalar weights and close our eyes to their nature:
$$L_w(y, f) = w \cdot L(y, f), \quad r_{it} = -w_i \cdot \left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = \hat{f}(x)}, \quad \text{for } i = 1, \ldots, n$$
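Clearly, for arbitrary weights we do not know any nice statistical properties of our model. In general, by tying the weights to the values of $y$ we can shoot ourselves in the foot. For example, using weights proportional to $|y|$ in the $L_1$ loss function is not equivalent to the $L_2$ loss, because the gradient will not take into account the values of the predictions $\hat{f}(x)$ themselves.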
We discuss all this in order to understand our capabilities better. Let's come up with a very exotic example of weights for our toy data. We define a strongly asymmetric weight function as follows:
$$w(x) = \begin{cases} 0.1, & \text{if } x \leq 0 \\ 0.1 + |\cos(x)|, & \text{if } x > 0 \end{cases}$$
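In code, this weight function and the weighted pseudo-residuals for the logistic loss might look as follows. The sample points are arbitrary, and the function names are introduced here for illustration only.

```python
import numpy as np

def weight(x):
    """Asymmetric weights: 0.1 for x <= 0, 0.1 + |cos(x)| for x > 0."""
    return np.where(x <= 0, 0.1, 0.1 + np.abs(np.cos(x)))

def logistic_neg_gradient(y, f_hat):
    return 2.0 * y / (1.0 + np.exp(2.0 * y * f_hat))

def weighted_pseudo_residuals(x, y, f_hat):
    """The usual pseudo-residuals, scaled by the weight w(x_i)."""
    return weight(x) * logistic_neg_gradient(y, f_hat)

# Arbitrary illustrative points and labels.
x = np.array([-3.0, -1.0, 0.0, np.pi, 4.0])
y = np.array([-1.0, 1.0, 1.0, -1.0, -1.0])

w = weight(x)
assert np.all(w >= 0) and w.sum() > 0     # requirements on the weights
residuals = weighted_pseudo_residuals(x, y, np.zeros_like(x))
print(w)
print(residuals)
```

Observations with $x \leq 0$ reach the base learner with residuals scaled down to a tenth, so the trees have little incentive to spend splits on them.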
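With these weights we expect to see two properties: less detail for negative values of $x$, and a shape of the function that is more similar to the original cosine. All the other GBM settings are taken from our previous classification example, including the line search for the optimal coefficients. Let's see what we get: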
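The result is as expected. First, we can see how much the pseudo-residuals differ: at the first iteration they almost repeat the original cosine. Second, the left part of the function's graph was largely ignored in favor of the right part, which has larger weights. Third, the function obtained at the third iteration received a lot of detail and became more similar to the original cosine (and also began to overfit slightly).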
Weights are a powerful tool that, at our own peril and risk, lets us control the properties of our model to a large extent. If you want to optimize your own loss function, first try to solve a simpler problem, adding observation weights as you see fit.
4. Summary of GBM theory
Today we covered the basic theory of gradient boosting. GBM is not one specific algorithm but a general methodology for building ensembles of models. Moreover, the methodology is quite flexible and extensible: you can train a large number of models with various loss functions and, at the same time, attach a variety of weight functions to them.
As practice and the experience of machine learning competitions show, in standard tasks (everything except images, audio and very sparse data) GBM is often the most effective algorithm, not counting stacking and high-level ensembles, where GBM is almost always an integral part. There are also adaptations of GBM for reinforcement learning (Minecraft, ICML 2016), and the Viola-Jones algorithm, still used in computer vision, is based on Adaboost.
In this article we deliberately left out questions of regularization, stochasticity and the related hyperparameters of the algorithm. It is no accident that we chose a small number of iterations, $M = 3$, everywhere. Had we chosen 30 trees instead of 3 and run GBM as described above, the result would not have been so predictable:
What to do in such situations, how the GBM regularization parameters relate to one another, and the hyperparameters of the base algorithms will be covered in the next article. In it we will also look at the modern packages xgboost, lightgbm and h2o and practice tuning them properly. In the meantime, we suggest playing with GBM settings in another very cool interactive demo:
http://arogozhnikov.imtqy.com/2016/07/05/gradient_boosting_playground.html
5. Homework
The actual homework will be announced during the next session of the course; you can follow it in the VK group and in the course repository.
As practice, complete this task: you need to beat a simple baseline in a Kaggle Inclass competition on predicting flight delays.
6. Useful links
Thanks to yorko (Yury Kashnitsky) for valuable comments and to bauchgefuehl (Anastasia Manokhina) for help with editing.