
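A month ago, Lenta launched a contest in which the very same Sorting Hat from Harry Potter assigned participants who granted access to their social network account to one of four houses. The contest was done well: names that sound different went to different houses, and similar English and Russian first and last names were sorted in similar ways. I do not know whether the assignment depended only on first and last names, or whether the number of friends or other factors were somehow taken into account, but the contest suggested the idea for this article: train a classifier from scratch that can assign users to the different houses.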
In this article we will build a simple ML model that assigns people to Harry Potter houses based on their first and last names, walking through a small research process that follows the CRISP methodology. Namely, we will:
- State the problem.
- Explore possible approaches to solving it and formulate the data requirements (Solution methods and data).
- Collect the necessary data (Solution methods and data).
- Study the collected dataset (Exploratory Research).
- Extract features from the raw data (Feature Engineering).
- Train a machine learning model (Model evaluation).
- Compare the results, assess the quality of the solutions, and repeat steps 2-6 if necessary.
- Package the solution into a service that can be used (Production).

This task may seem simple, so we impose additional limits on the whole process (less than 2 hours) and on this article (less than 15 minutes of reading time).
If you are already immersed in the beautiful and wonderful world of Data Science and kaggle nonstop, or (God forbid) compare the size of your Hadoop with colleagues at meetings, this article will probably seem simple and uninteresting. Moreover, the quality of the final models is not the main value of this article. You have been warned. Let's go.
A github repository with the code used in this article is also available for curious readers. If you find an error, please open a PR.
A problem without clear success criteria can be worked on forever, so let us decide right away that we want a solution that returns "Gryffindor", "Ravenclaw", "Hufflepuff", or "Slytherin" in response to an input string.
In effect, we want a black box:
" " => [?] => Griffindor
The original black hat sorted young wizards into houses according to their character and personal qualities. Since data about character and personality is not available under the conditions of the task, we will use the participant's first and last name, keeping in mind that we must then sort the book's characters into the houses they belong to in the books. Potter fans would certainly be upset if our solution sent Harry to Hufflepuff or Ravenclaw (but it should send Harry to Gryffindor and Slytherin with equal probability, to convey the spirit of the book).
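Since we are talking about probabilities, let us formalize the problem in stricter mathematical terms. From a Data Science point of view we are solving a classification problem: assigning an object (a string consisting of a first and last name) to a class (in essence a label, which can be a number or 4 variables that each take a yes/no value). We understand that, at least in Harry's case, it is correct to give 2 answers, Gryffindor and Slytherin, so instead of predicting the one specific house that the hat picks, it is better to predict the probability that a person will be assigned to each house. Our solution therefore takes the form of a function from a name to these probabilities.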
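One way to write this function down is sketched below. The notation ($f$, $s$, $p_h$) is ours and is not taken from the article; the house identifiers follow the spelling used in the data files.

```latex
f(s) = \bigl(p_{\text{griffindor}}(s),\; p_{\text{hufflpuff}}(s),\; p_{\text{ravenclaw}}(s),\; p_{\text{slitherin}}(s)\bigr),
\qquad
p_h(s) = P(\text{house} = h \mid s) \in [0, 1]
```

If each $p_h$ comes from its own binary model, the four values need not sum to 1, which is what allows Harry to score high for two houses at once.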
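Metrics and quality assessment
The task and the goal are formulated, and now we will think about how to solve it, but that is not all. To start the research we need to introduce quality metrics, in other words, decide how we will compare 2 different solutions with each other.
In everyday life everything is nice and simple: we intuitively understand that a spam detector should let a minimum of spam into the inbox and let through a maximum of the mail we need.
In reality everything is more complicated, as confirmed by the large number of articles explaining which metrics are used and how. Practice is the best way to understand this, but the topic is so large that we promise to write a separate post about it and to make an open table so that everyone can play with it and see in practice how the metrics differ.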
For us, the everyday "pick the best one" will be ROC AUC. It is exactly what we want from a metric in this case: the fewer false positives and the more accurate the actual predictions, the larger the ROC AUC.
An ideal model has a ROC AUC of 1, and an ideal random model, one that assigns classes completely at random, has 0.5.
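To make the two reference values concrete, here is a minimal, dependency-free sketch that computes ROC AUC as the share of (positive, negative) pairs that the scores rank correctly. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not code from the article's repository.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


labels = [1, 1, 0, 0, 1, 0]
perfect = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]  # every positive outranks every negative
constant = [0.5] * len(labels)            # a model that cannot tell the classes apart

print(roc_auc(labels, perfect))   # 1.0
print(roc_auc(labels, constant))  # 0.5
```

The pairwise form is quadratic in the number of samples, which is fine for a dataset of a few hundred names; library implementations use sorting instead.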
Algorithms
Our black box must take into account the distribution of the book's characters, accept other first and last names as input, and return a result. Various machine learning algorithms can be used to solve a classification problem: neural networks, factorization machines, linear regression, or SVM, for example.
Contrary to popular belief, Data Science is not limited to neural networks, and to promote that idea, neural networks are left in this article as an exercise for the curious reader. Even those who have never taken a data analysis course (especially the subjectively best one, from ODS) and have only read n news items about machine learning or AI, which are now published even in magazines for amateur anglers, have probably come across the names of the common algorithm families: bagging, boosting, support vector machines (SVM), linear regression. These are what we will use to solve our problem.
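To be more precise, we will compare:
- Linear regression (for classification, in practice logistic regression)
- Boosting (XGBoost, LightGBM)
- Decision trees, listed separately (Extra Trees; strictly speaking this ensemble is closer to bagging than to boosting)
- Bagging (Random Forest)
- SVM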
The problem of assigning each Hogwarts student to one of the houses could be solved by predicting the matching house directly, but strictly speaking it reduces to deciding, for each class separately, whether the student belongs to it. So within this article we set ourselves the goal of obtaining 4 models, one per house.
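The reduction to four yes/no problems can be sketched as follows. The column names `is_griffindor`, `is_hufflpuff`, `is_ravenclaw`, and `is_slitherin` match the target columns that appear later in the article; the helper name `one_vs_rest_targets` and the two sample rows are assumptions made for illustration.

```python
# House identifiers as spelled in the article's data files.
HOUSES = ["griffindor", "hufflpuff", "ravenclaw", "slitherin"]


def one_vs_rest_targets(house):
    """Turn a single house label into four binary targets, one per house."""
    if house not in HOUSES:
        raise ValueError("unknown house: {}".format(house))
    return {"is_" + h: int(h == house) for h in HOUSES}


rows = [("Harry Potter", "griffindor"), ("Severus Snape", "slitherin")]
for name, house in rows:
    print(name, one_vs_rest_targets(house))
```

Each binary column then becomes the target of its own model, so a name can receive a high score from more than one of them.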
Data
Finding a suitable dataset for training, and more importantly one that is legal to use for the intended purpose, is one of the hardest and most time-consuming tasks in Data Science. For our task we take the data from the wiki about the world of Harry Potter. For example, the wiki has a page listing all the characters who studied in Gryffindor. It is important that in this case we use the data for non-commercial purposes, so we do not violate the site's license.

For those who think "Data Scientists are such cool people, I will go and become one, let them teach me", we remind you that there is such a step as cleaning and preparing the data. The downloaded data has to be moderated by hand to remove, for example, "Seventh Prefect of Gryffindor", and semi-automatically to remove "Unknown girl from Gryffindor". In real work, a proportionally large part of the task is always tied to preparing, cleaning, and restoring missing values in the dataset.
A bit of ctrl+c and ctrl+v, and we end up with 4 text files containing character names in 2 languages: English and Russian.
Studying the collected data (EDA, Exploratory Data Analysis)
At this stage we have 4 files containing the names of the students of each house. Let us look at them in more detail:
$ ls ../input
griffindor.txt hufflpuff.txt ravenclaw.txt slitherin.txt
Each file contains one student's first name and surname (if any) per line:
$ wc -l ../input/*.txt
 250 ../input/griffindor.txt
 167 ../input/hufflpuff.txt
 180 ../input/ravenclaw.txt
 254 ../input/slitherin.txt
 851 total
The collected data has the following format:
$ cat ../input/griffindor.txt | head -3 && cat ../input/griffindor.txt | tail -3
Charlie Stainforth
Melanie Stanmore
Stewart
Our whole idea rests on the assumption that first and last names share something that our black box (or black hat) can learn to tell apart.
We could feed the strings to an algorithm as they are, but the result would be poor, because basic models cannot work out on their own how "Draco" differs from "Harry". So we need to extract features from the first and last names.
Data preparation (Feature Engineering)
Features (from the English "feature", a property) are the distinguishing properties of an object: the number of times a person changed jobs over the past year, the number of fingers on the left hand, a car's engine displacement, whether a car's mileage exceeds 100,000 km. A great many classifications of features have been invented; there is no single system here and there cannot be one, so we simply give examples of what a feature can be:
- A rational number
- A category (under 12, 12-18, or 18+)
- A binary value (whether the first loan was repaid or not)
- A date, a color, a share, and so on
The search for (or construction of) features, Feature Engineering in English, is very often singled out as a separate stage of research or of a data analyst's work. In practice, common sense, experience, and hypothesis testing are what help in the process. Guessing the right features straight away is a matter of a practiced hand combined with fundamental knowledge and luck. Sometimes there is a bit of shamanism in it, but the general approach is very simple: do what comes to mind, then check whether adding the new feature improved the solution. For example, for our task we could take the number of hissing sounds in a name as a feature.
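In the first version of our model (first, because a real Data Science study, like a masterpiece, can never be finished) we will use the following features for the first name and the surname:
- The first and last letter of the word: vowel or consonant
- The number of doubled vowels and consonants
- The number of vowels, consonants, voiceless, and voiced letters
- Name length, surname length
- ...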
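A few of the features in this list can be sketched without any phonetics library. This is a simplified stand-in for Latin letters only, not the article's implementation: the function name `word_features`, the feature-name prefix argument, and the choice to treat "y" as a vowel are all assumptions.

```python
VOWELS = set("aeiouy")  # assumption: 'y' is treated as a vowel for simplicity


def word_features(word, prefix):
    """Return simple letter-based features for one Latin-script word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    is_vowel = [ch in VOWELS for ch in letters]
    pairs = list(zip(letters, letters[1:]))  # adjacent letter pairs
    return {
        prefix + "_length": len(letters),
        prefix + "_starts_with_vowel": bool(letters) and is_vowel[0],
        prefix + "_ends_with_vowel": bool(letters) and is_vowel[-1],
        prefix + "_vowels_count": sum(is_vowel),
        prefix + "_consonant_count": len(letters) - sum(is_vowel),
        prefix + "_double_vowels_count": sum(1 for a, b in pairs if a == b and a in VOWELS),
        prefix + "_double_consonant_count": sum(1 for a, b in pairs if a == b and a not in VOWELS),
    }


features = {}
features.update(word_features("Harry", "name"))
features.update(word_features("Potter", "surname"))
print(features)
```

Voiced and voiceless counts are left out here because they need a per-letter phonetic classification, which is what the extended phonetics class provides in the article.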
To do this, we take an existing repository as a basis and add a class so that it can also be used for Latin letters. This lets us determine how each letter sounds.
>> from Phonetic import RussianLetter, EnglishLetter
>> RussianLetter('').classify()
{'consonant': True, 'deaf': False, 'hard': False, 'mark': False, 'paired': False, 'shock': False, 'soft': False, 'sonorus': True, 'vowel': False}
>> EnglishLetter('d').classify()
{'consonant': True, 'deaf': False, 'hard': True, 'mark': False, 'paired': False, 'shock': False, 'soft': False, 'sonorus': True, 'vowel': False}
Now we can define simple functions for computing statistics, for example:
# Letter is assumed to be the letter class provided by the Phonetic module above
def starts_with_letter(word, letter_type='vowel'):
    """
    Check whether the word starts with a letter of the given type.
    :param word: string to inspect
    :param letter_type: 'vowel' or 'consonant'
    :return: Boolean
    """
    if len(word) == 0:
        return False
    return Letter(word[0]).classify()[letter_type]

def count_letter_type(word):
    """
    Count how many letters of each phonetic class the word contains.
    :param word: string to inspect
    :return: :obj:`dict` of :obj:`str` => :int:count
    """
    count = {
        'consonant': 0, 'deaf': 0, 'hard': 0, 'mark': 0, 'paired': 0,
        'shock': 0, 'soft': 0, 'sonorus': 0, 'vowel': 0
    }
    for letter in word:
        classes = Letter(letter).classify()
        for key in count.keys():
            if classes[key]:
                count[key] += 1
    return count
Using these functions, we can already get the first features:
from feature_engineering import *
>> print(" («»): ", len(""))
 («»): 5
>> print(" («») : ", starts_with_letter('', 'vowel'))
 («») : False
>> print(" («») : ", starts_with_letter('', 'consonant'))
 («») : True
>> count_Harry = count_letter_type("")
>> print (" («»): ", count_Harry['paired'])
 («»): 1
Strictly speaking, with these functions we can obtain a vector representation of a string, that is, a mapping from a string to a vector of numbers.
Now we can present our data as a dataset that can be fed to a machine learning algorithm:
>> from data_loaders import load_processed_data
>> hogwarts_df = load_processed_data()
>> hogwarts_df.head()

In total, we end up with the following features for each student:
>> hogwarts_df[hogwarts_df.columns].dtypes
Resulting features:
name                            object
surname                         object
is_english                      bool
name_starts_with_vowel          bool
name_starts_with_consonant      bool
name_ends_with_vowel            bool
name_ends_with_consonant        bool
name_length                     int64
name_vowels_count               int64
name_double_vowels_count        int64
name_consonant_count            int64
name_double_consonant_count     int64
name_paired_count               int64
name_deaf_count                 int64
name_sonorus_count              int64
surname_starts_with_vowel       bool
surname_starts_with_consonant   bool
surname_ends_with_vowel         bool
surname_ends_with_consonant     bool
surname_length                  int64
surname_vowels_count            int64
surname_double_vowels_count     int64
surname_consonant_count         int64
surname_double_consonant_count  int64
surname_paired_count            int64
surname_deaf_count              int64
surname_sonorus_count           int64
is_griffindor                   int64
is_hufflpuff                    int64
is_ravenclaw                    int64
is_slitherin                    int64
dtype: object
The last 4 columns are the targets: they record which house the student was sorted into.
Training the algorithms
In a nutshell, algorithms learn the same way people do: they make mistakes and learn from them. To understand how badly they were wrong, algorithms use error functions (loss functions).
As a rule, the training process is very simple and consists of a few steps:
- Make a prediction.
- Evaluate the error.
- Adjust the model parameters.
- Repeat steps 1-3 until the goal is reached, the process is stopped, or the data runs out.
- Evaluate the quality of the resulting model.
In reality, of course, everything is a bit more complicated. For example, there is the phenomenon of overfitting: an algorithm can literally memorize which features correspond to which answer and thereby get worse results on objects that are unlike the ones it was trained on. Various techniques and hacks exist to avoid this.
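The training steps listed above can be sketched with a one-feature logistic regression trained by gradient descent. This is a toy illustration under our own assumptions (the function names, learning rate, stopping threshold, and data are invented); it is not how the tree-based models used later are trained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(targets, preds):
    """Mean logistic loss; eps keeps log() away from zero."""
    eps = 1e-12
    total = sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(targets, preds))
    return -total / len(targets)


def train(xs, ys, learning_rate=0.5, max_epochs=500, target_loss=0.1):
    w, b = 0.0, 0.0
    history = []
    for _ in range(max_epochs):
        preds = [sigmoid(w * x + b) for x in xs]   # step 1: make a prediction
        loss = log_loss(ys, preds)                 # step 2: evaluate the error
        history.append(loss)
        if loss <= target_loss:                    # step 4: stop once the goal is reached
            break
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        grad_b = sum(p - y for p, y in zip(preds, ys)) / len(xs)
        w -= learning_rate * grad_w                # step 3: adjust the parameters
        b -= learning_rate * grad_b
    return w, b, history


xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b, history = train(xs, ys)
print("first loss: %.3f, last loss: %.3f" % (history[0], history[-1]))
```

The final step of the list, evaluating quality, should use data the loop never saw; measuring it on `xs` itself would hide exactly the overfitting described above.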
As mentioned above, we will solve 4 problems, one per house. So let us prepare the data for Slytherin.
During training the algorithm constantly compares its results with the real data, and part of the dataset is set aside for validation for this purpose. It is also considered good practice to evaluate the algorithm on separate data that it has not seen at all. So we now split the sample 70/30 and train the first algorithm:
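# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier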
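The idea behind a 70/30 split can be shown with a dependency-free stand-in. This sketch only mimics the shuffling and slicing; the function name `simple_train_test_split`, the seed, and the use of row indices instead of real feature rows are assumptions, and scikit-learn's own `train_test_split` should be preferred in practice.

```python
import random


def simple_train_test_split(rows, test_size=0.3, seed=42):
    """Shuffle a copy of the rows and cut off the first test_size share as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(851))  # one index per name; 851 is the line total reported by wc
train_rows, test_rows = simple_train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Because the classes are imbalanced (each house holds well under half of the names), a stratified split that preserves the share of positives in both parts is usually the safer choice.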
Done. Now, if we feed data into this model, it returns a result. This is fun, so the first thing we check is whether the model recognizes a Slytherin in Harry. To do that, we first prepare functions for getting the algorithm's prediction:
from data_loaders import parse_line_to_hogwarts_df
import pandas as pd


def get_single_student_features(name):
    """
    Build the feature row for a single student.
    :param name: string
    :return: pd.DataFrame
    """
    featurized_person_df = parse_line_to_hogwarts_df(name)
    person_df = pd.DataFrame(
        featurized_person_df,
        columns=[
            'name', 'surname', 'is_english',
            'name_starts_with_vowel', 'name_starts_with_consonant',
            'name_ends_with_vowel', 'name_ends_with_consonant',
            'name_length', 'name_vowels_count', 'name_double_vowels_count',
            'name_consonant_count', 'name_double_consonant_count',
            'name_paired_count', 'name_deaf_count', 'name_sonorus_count',
            'surname_starts_with_vowel', 'surname_starts_with_consonant',
            'surname_ends_with_vowel', 'surname_ends_with_consonant',
            'surname_length', 'surname_vowels_count', 'surname_double_vowels_count',
            'surname_consonant_count', 'surname_double_consonant_count',
            'surname_paired_count', 'surname_deaf_count', 'surname_sonorus_count',
        ],
        index=[0]
    )
    # The model was trained on numeric features only, so drop the raw strings
    featurized_person = person_df.drop(['name', 'surname'], axis=1)
    return featurized_person


def get_predictions_vector(model, person):
    """
    :param model: trained classifier that exposes predict_proba
    :param person: string
    :return: class probabilities for this person
    """
    encoded_person = get_single_student_features(person)
    return model.predict_proba(encoded_person)[0]
Next, we set up a small test dataset to look at the algorithm's results.
def score_testing_dataset(model):
    """
    Print the predicted probability of the positive class for each test name.
    :param model: trained classifier
    """
    testing_dataset = [
        " ", "Kirill Malev", " ", "Harry Potter", " ", " ", "Severus Snape",
        " ", "Tom Riddle", " ", "Salazar Slytherin"]
    for name in testing_dataset:
        print("{} -> {}".format(name, get_predictions_vector(model, name)[1]))


score_testing_dataset(rfc_model)
 -> 0.5
Kirill Malev -> 0.5
 -> 0.0
Harry Potter -> 0.0
 -> 0.75
 -> 0.9
Severus Snape -> 0.5
 -> 0.2
Tom Riddle -> 0.5
 -> 0.2
Salazar Slytherin -> 0.3
The results turned out to be dubious. According to this model, even the founder of the house would not get into his own house. So we need a strict quality assessment: let us look at the metrics we chose at the beginning.
from sklearn.metrics import accuracy_score, roc_auc_score, classification_report

predictions = rfc_model.predict(X_test)
print("Classification report: ")
print(classification_report(y_test, predictions))
print("Accuracy for Random Forest Model: %.2f" % (accuracy_score(y_test, predictions) * 100))
# Note: roc_auc_score receives hard 0/1 predictions here; it is normally given
# scores such as rfc_model.predict_proba(X_test)[:, 1], which rank the samples
print("ROC AUC from first Random Forest Model: %.2f" % (roc_auc_score(y_test, predictions)))
Classification report:
             precision    recall  f1-score   support

          0       0.66      0.88      0.75       168
          1       0.38      0.15      0.21        89

avg / total       0.56      0.62      0.56       257

Accuracy for Random Forest Model: 62.26
ROC AUC from first Random Forest Model: 0.51
It is no surprise that the results looked so dubious: a ROC AUC of about 0.51 says the model predicts only slightly better than a coin toss.
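One thing worth checking with a score like this is what was passed to the metric. The sketch below, on invented toy data, shows how ROC AUC computed from hard 0/1 predictions can be noticeably lower than ROC AUC computed from the underlying probabilities. The helper `roc_auc` is a small pairwise implementation written for this example, and the 0.5 threshold is an assumption.

```python
def roc_auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 0, 0, 1, 1, 1, 1]
proba = [0.10, 0.20, 0.30, 0.45, 0.35, 0.40, 0.48, 0.90]  # toy positive-class probabilities
hard = [int(p >= 0.5) for p in proba]                      # what a predict()-style call returns

auc_from_proba = roc_auc(y_true, proba)
auc_from_hard = roc_auc(y_true, hard)
print(auc_from_proba, auc_from_hard)  # 0.875 0.625
```

Thresholding throws away the ranking among samples on the same side of 0.5, so most pairs become ties. Whether this explains the article's numbers cannot be determined from the text; it is only a common cause of AUC values that cluster near 0.5.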
Testing the results. Quality metrics
Above, using one example, we looked at how to train 1 algorithm that supports the sklearn interfaces. The rest are trained in exactly the same way, so all that remains is to train all the algorithms and pick the best one in each case.

There is nothing complicated here: for each algorithm we train 1 model with the standard settings, and we also train a whole set of models, iterating over the options that affect the algorithm's quality. This stage is called Model Tuning or Hyperparameter Optimization, and its essence is very simple: we pick the set of settings that gives the best result.
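The essence of this stage, trying every combination of settings and keeping the best one, can be sketched as a plain grid search. The names `tune` and `toy_validation_score` are hypothetical, and the scoring function is an invented stand-in for "train a model with these settings and measure ROC AUC on validation data".

```python
from itertools import product


def tune(evaluate, grid):
    """Evaluate every combination in the grid and return the best one with its score."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    # Hypothetical score surface that peaks at max_depth=6, learning_rate=0.05
    return 1.0 - 0.05 * abs(params["max_depth"] - 6) - abs(params["learning_rate"] - 0.05)


grid = {"max_depth": [3, 6, 9], "learning_rate": [0.01, 0.05, 0.1]}
best_params, best_score = tune(toy_validation_score, grid)
print(best_params, best_score)
```

The number of combinations grows multiplicatively with each added option, which is why real tuning runs usually score on cross-validation folds and often sample the grid rather than exhaust it.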
from model_training import train_classifiers
from data_loaders import load_processed_data
import warnings

warnings.filterwarnings('ignore')
 -> 0.09437856871661066
Kirill Malev -> 0.20820536334902712
 -> 0.07550095601699099
Harry Potter -> 0.07683794773639624
 -> 0.9414529336862744
 -> 0.9293671807790949
Severus Snape -> 0.6576783576162999
 -> 0.18577792617672767
Tom Riddle -> 0.8351835484058869
 -> 0.25930925139546795
Salazar Slytherin -> 0.24008788903854789
The numbers in this version subjectively look better than before, but they are still not good enough for the inner perfectionist. So we go one level deeper and return to the product meaning of our task: we need to predict the most likely house that the Sorting Hat would assign the hero to. This means we need to train a model for each of the houses.

>> from model_training import train_all_models
Results: long output (four report blocks, in output order)

Block 1
Algorithm       Accuracy (default)  ROC AUC (default)  Accuracy (tuned)  ROC AUC (tuned)
SVM             73.93               0.53               72.37             0.50
KNN             70.04               0.58               69.65             0.58
XGBoost         70.43               0.54               68.09             0.56
Random Forest   73.93               0.62               74.32             0.54
Extra Trees     69.26               0.57               73.54             0.55
LGBM            70.82               0.62               74.71             0.53
RGF             70.43               0.58               71.60             0.60
FRGF            68.87               0.59               69.26             0.59

Block 2
Algorithm       Accuracy (default)  ROC AUC (default)  Accuracy (tuned)  ROC AUC (tuned)
SVM             70.43               0.50               71.60             0.50
KNN             63.04               0.49               65.76             0.50
XGBoost         69.65               0.54               68.09             0.50
Random Forest   66.15               0.51               70.43             0.50
Extra Trees     64.20               0.49               70.82             0.51
LGBM            67.70               0.56               70.82             0.50
RGF             66.54               0.52               65.76             0.53
FRGF            65.76               0.53               69.65             0.52

Block 3
Algorithm       Accuracy (default)  ROC AUC (default)  Accuracy (tuned)  ROC AUC (tuned)
SVM             74.32               0.50               74.71             0.51
KNN             69.26               0.48               73.15             0.49
XGBoost         72.76               0.49               74.32             0.50
Random Forest   73.93               0.52               74.32             0.50
Extra Trees     73.93               0.52               73.93             0.50
LGBM            73.54               0.52               74.32             0.50
RGF             73.54               0.51               73.93             0.50
FRGF            73.93               0.53               73.93             0.50

Block 4
Algorithm       Accuracy (default)  ROC AUC (default)  Accuracy (tuned)  ROC AUC (tuned)
SVM             80.54               0.50               80.93             0.52
KNN             78.60               0.50               80.16             0.51
XGBoost         80.54               0.50               77.04             0.52
Random Forest   77.43               0.49               80.54             0.50
Extra Trees     76.26               0.48               78.60             0.50
LGBM            75.49               0.51               80.54             0.50
RGF             78.99               0.52               75.88             0.55
FRGF            76.65               0.50

Multinomial regression:
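from sklearn.linear_model import LogisticRegression

# multi_class is deprecated in recent scikit-learn releases, where lbfgs is already multinomial by default
clf = LogisticRegression(random_state=0, solver='lbfgs', multi_class='multinomial')
hogwarts_df = load_processed_data_multi()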
 -> [0.3602361 0.16166944 0.16771712 0.31037733]
Kirill Malev -> [0.47473072 0.16051924 0.13511385 0.22963619]
 -> [0.38697926 0.19330242 0.17451052 0.2452078 ]
Harry Potter -> [0.40245098 0.16410043 0.16023278 0.27321581]
 -> [0.13197025 0.16438855 0.17739254 0.52624866]
 -> [0.17170203 0.1205678 0.14341742 0.56431275]
Severus Snape -> [0.15558044 0.21589378 0.17370406 0.45482172]
 -> [0.39301231 0.07397324 0.1212741 0.41174035]
Tom Riddle -> [0.26623969 0.14194379 0.1728505 0.41896601]
 -> [0.24843037 0.21632736 0.21532696 0.3199153 ]
Salazar Slytherin -> [0.09359144 0.26735897 0.2742305 0.36481909]
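confusion_matrix:
# sklearn expects confusion_matrix(y_true, y_pred); with the arguments in this
# order, rows correspond to predicted classes and columns to actual classes
confusion_matrix(clf.predict(X_data), y)
array([[144,  68,  64,  78],
       [  8,   9,   8,   6],
       [ 22,  18,  31,  20],
       [ 77,  73,  78, 151]])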
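The matrix above can be read with a few lines of arithmetic. The sketch assumes the orientation implied by the call shown in the article (predictions passed first, so rows are predicted classes and columns are actual classes); which row corresponds to which house is not stated in the text, so the classes are left as indices.

```python
# Values copied from the confusion matrix above.
matrix = [
    [144, 68, 64, 78],
    [8, 9, 8, 6],
    [22, 18, 31, 20],
    [77, 73, 78, 151],
]
n_classes = len(matrix)

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(n_classes))  # diagonal: prediction matches label
accuracy = correct / total

# Rows are predictions, so a row sum is "how often this class was predicted"
precision = [matrix[i][i] / sum(matrix[i]) for i in range(n_classes)]
# Columns are actual labels, so a column sum is "how many samples truly belong here"
recall = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(n_classes)]

print("accuracy: %.3f" % accuracy)
for i in range(n_classes):
    print("class %d: precision %.3f, recall %.3f" % (i, precision[i], recall[i]))
```

The second and third rows are small, which means the multinomial model rarely predicts those two classes at all and mostly splits the names between the other two.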
def get_predctions_vector(models, person):
    predictions = [get_predictions_vector(model, person)[1] for model in models]
    return {
        'slitherin': predictions[0],
        'griffindor': predictions[1],
        'ravenclaw': predictions[2],
        'hufflpuff': predictions[3]
    }


def score_testing_dataset(models):
    testing_dataset = [
        " ", "Kirill Malev", " ", "Harry Potter", " ", " ", "Severus Snape",
        " ", "Tom Riddle", " ", "Salazar Slytherin"]
    data = []
    for name in testing_dataset:
        predictions = get_predctions_vector(models, name)
        predictions['name'] = name
        data.append(predictions)
    scoring_df = pd.DataFrame(data, columns=['name', 'slitherin', 'griffindor', 'hufflpuff', 'ravenclaw'])
    return scoring_df
    name               slitherin  griffindor  hufflpuff  ravenclaw
0                      0.349084   0.266909    0.110311   0.091045
1   Kirill Malev       0.289914   0.376122    0.384986   0.103056
2                      0.338258   0.400841    0.016668   0.124825
3   Harry Potter       0.245377   0.357934    0.026287   0.154592
4                      0.917423   0.126997    0.176640   0.096570
5                      0.969693   0.106384    0.150146   0.082195
6   Severus Snape      0.663732   0.259189    0.290252   0.074148
7                      0.268466   0.579401    0.007900   0.083195
8   Tom Riddle         0.639731   0.541184    0.084395   0.156245
9                      0.653595   0.147506    0.172940   0.137134
10  Salazar Slytherin  0.647399   0.169964    0.095450   0.26126

, . ROC AUC , 0.5. , :
, , , , XGBoost CV , .
éèŠïŒ , 70% . , 4 .
from model_training import train_production_models
from xgboost import XGBClassifier

best_models = []
for i in range(0, 4):
    best_models.append(XGBClassifier(
        base_score=0.5, booster='gbtree', colsample_bylevel=1,
        colsample_bytree=0.7, gamma=0, learning_rate=0.05, max_delta_step=0,
        max_depth=6, min_child_weight=11, missing=-999, n_estimators=1000,
        n_jobs=1, nthread=4, objective='binary:logistic', random_state=0,
        reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=1337, silent=1,
        subsample=0.8))

slitherin_model, griffindor_model, ravenclaw_model, hufflpuff_model = \
    train_production_models(best_models)
top_models = slitherin_model, griffindor_model, ravenclaw_model, hufflpuff_model
score_testing_dataset(top_models)
    name               slitherin  griffindor  hufflpuff  ravenclaw
0                      0.273713   0.372337    0.065923   0.279577
1   Kirill Malev       0.401603   0.761467    0.111068   0.023902
2                      0.031540   0.616535    0.196342   0.217829
3   Harry Potter       0.183760   0.422733    0.119393   0.173184
4                      0.945895   0.021788    0.209820   0.019449
5                      0.950932   0.088979    0.084131   0.012575
6   Severus Snape      0.634035   0.088230    0.249871   0.036682
7                      0.426440   0.431351    0.028444   0.083636
8   Tom Riddle         0.816804   0.136530    0.069564   0.035500
9                      0.409634   0.213925    0.028631   0.252723
10  Salazar Slytherin  0.824590   0.067910    0.111147   0.085710
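Turning the four per-house scores into a single answer is a matter of taking the largest one. The sketch below uses the "Harry Potter" row from the table above; the helper names `most_likely_house` and `normalize` are assumptions, and the normalization step is optional and shown only because the four independent models do not produce scores that sum to 1.

```python
def most_likely_house(scores):
    """Return the house whose model gave the highest score."""
    return max(scores, key=scores.get)


def normalize(scores):
    """Rescale the scores so that they sum to 1 (useful for display only)."""
    total = sum(scores.values())
    return {house: value / total for house, value in scores.items()}


# Row 3 ("Harry Potter") from the table above
harry = {
    "slitherin": 0.183760,
    "griffindor": 0.422733,
    "hufflpuff": 0.119393,
    "ravenclaw": 0.173184,
}

print(most_likely_house(harry))
print(normalize(harry))
```

Keeping the full score dictionary rather than only the winner preserves the information that a name sits close to two houses at once.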
, , .
, , . .
import pickle

# Save each production model; the with-block closes the file even if dumping fails
for model, path in [
        (slitherin_model, "../output/slitherin.xgbm"),
        (griffindor_model, "../output/griffindor.xgbm"),
        (ravenclaw_model, "../output/ravenclaw.xgbm"),
        (hufflpuff_model, "../output/hufflpuff.xgbm")]:
    with open(path, "wb") as model_file:
        pickle.dump(model, model_file)
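A quick round trip is a cheap way to confirm that a saved model can be loaded back before it is shipped to a service. The sketch below uses a plain dictionary as a stand-in for a trained model and a temporary directory instead of `../output`; the helper names `save_model` and `load_model` are assumptions.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    with open(path, "wb") as model_file:
        pickle.dump(model, model_file)


def load_model(path):
    with open(path, "rb") as model_file:
        return pickle.load(model_file)


# Stand-in object; a real run would pass the trained classifier instead
stand_in_model = {"house": "slitherin", "weights": [0.1, 0.2]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "slitherin.xgbm")
    save_model(stand_in_model, path)
    restored = load_model(path)

print(restored == stand_in_model)
```

Pickle files are tied to the library versions that produced them and should only be loaded from trusted sources, so pinning the same package versions in the serving image is a sensible precaution.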
, . , , , .
, , . , . , Data Scientist â -.
:
, docker-, python-. , flask.
from __future__ import print_function
Dockerfile:
FROM datmo/python-base:cpu-py35
:
docker build -t talking_hat . && docker rm talking_hat && docker run --name talking_hat -p 5000:5000 talking_hat
â . , Apache Benchmark . , . â .
$ ab -p data.json -T application/json -c 50 -n 10000 http://0.0.0.0:5000/predict
ab
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 0.0.0.0 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        Werkzeug/0.14.1
Server Hostname:        0.0.0.0
Server Port:            5000

Document Path:          /predict
Document Length:        141 bytes

Concurrency Level:      50
Time taken for tests:   238.552 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      2880000 bytes
Total body sent:        1800000
HTML transferred:       1410000 bytes
Requests per second:    41.92 [#/sec] (mean)
Time per request:       1192.758 [ms] (mean)
Time per request:       23.855 [ms] (mean, across all concurrent requests)
Transfer rate:          11.79 [Kbytes/sec] received
                        7.37 kb/s sent
                        19.16 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       3
Processing:   199 1191 352.5   1128    3352
Waiting:      198 1190 352.5   1127    3351
Total:        202 1191 352.5   1128    3352

Percentage of the requests served within a certain time (ms)
  50%   1128
  66%   1277
  75%   1378
  80%   1451
  90%   1668
  95%   1860
  98%   2096
  99%   2260
 100%   3352 (longest request)
, :
def prod_predict_classes_for_name(full_name):
    <...>
    predictions = get_predctions_vector([
        app.slitherin_model,
        app.griffindor_model,
        app.ravenclaw_model,
        app.hufflpuff_model
    ], person_df.drop(['name', 'surname'], axis=1))
    return {
        'slitherin': float(predictions[0][1]),
        'griffindor': float(predictions[1][1]),
        'ravenclaw': float(predictions[2][1]),
        'hufflpuff': float(predictions[3][1])
    }


def create_app():
    <...>
    with app.app_context():
        app.slitherin_model = pickle.load(open("models/slitherin.xgbm", "rb"))
        app.griffindor_model = pickle.load(open("models/griffindor.xgbm", "rb"))
        app.ravenclaw_model = pickle.load(open("models/ravenclaw.xgbm", "rb"))
        app.hufflpuff_model = pickle.load(open("models/hufflpuff.xgbm", "rb"))
    return app
:
$ docker build -t talking_hat . && docker rm talking_hat && docker run --name talking_hat -p 5000:5000 talking_hat
$ ab -p data.json -T application/json -c 50 -n 10000 http://0.0.0.0:5000/predict
ab
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 0.0.0.0 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        Werkzeug/0.14.1
Server Hostname:        0.0.0.0
Server Port:            5000

Document Path:          /predict
Document Length:        141 bytes

Concurrency Level:      50
Time taken for tests:   219.812 seconds
Complete requests:      10000
Failed requests:        3
   (Connect: 0, Receive: 0, Length: 3, Exceptions: 0)
Total transferred:      2879997 bytes
Total body sent:        1800000
HTML transferred:       1409997 bytes
Requests per second:    45.49 [#/sec] (mean)
Time per request:       1099.062 [ms] (mean)
Time per request:       21.981 [ms] (mean, across all concurrent requests)
Transfer rate:          12.79 [Kbytes/sec] received
                        8.00 kb/s sent
                        20.79 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       2
Processing:   235 1098 335.2   1035    3464
Waiting:      235 1097 335.2   1034    3462
Total:        238 1098 335.2   1035    3464

Percentage of the requests served within a certain time (ms)
  50%   1035
  66%   1176
  75%   1278
  80%   1349
  90%   1541
  95%   1736
  98%   1967
  99%   2141
 100%   3464 (longest request)
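The headline numbers in the two ApacheBench reports follow directly from the request count, the total time, and the concurrency level, so the difference between the runs can be checked by hand. The function names below are our own; the inputs are copied from the two reports.

```python
def requests_per_second(completed_requests, seconds):
    return completed_requests / seconds


def mean_time_per_request_ms(concurrency, rps):
    """Mean latency seen by one client when `concurrency` requests are in flight."""
    return concurrency / rps * 1000.0


first_run = requests_per_second(10000, 238.552)    # first report
second_run = requests_per_second(10000, 219.812)   # second report
speedup_percent = (second_run / first_run - 1.0) * 100.0

print("first run:  %.2f req/s, %.1f ms per request" % (first_run, mean_time_per_request_ms(50, first_run)))
print("second run: %.2f req/s, %.1f ms per request" % (second_run, mean_time_per_request_ms(50, second_run)))
print("throughput change: %.1f%%" % speedup_percent)
```

The second run is roughly 8.5% faster in throughput, but it also reports 3 failed requests with a length mismatch, so a single pair of runs on one machine is weak evidence; repeating the benchmark a few times would make the comparison more trustworthy.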
ã§ãã . , .
Conclusion
, . - .
, :
- feature engineering- ( ), , Soundex .
- PyTorch . , , .
- flask Quart , , .
- - -, .
, , . , !
This article would not have been published without the Open Data Science community, which brings together a large number of Russian-speaking specialists in data analysis.