Hello everyone!
My name is Alexey Burnakov. I am a data scientist at Align Technology. In this article I describe the approaches to feature selection that we practice in our data analysis experiments.
At our company, statisticians and machine learning engineers analyze large volumes of clinical information related to patient treatment. In a nutshell, the point of this article comes down to extracting the valuable knowledge contained in a small fraction of the noisy, redundant gigabytes of data available to us.
The article is intended for statisticians, machine learning engineers, and other specialists interested in discovering dependencies in data sets. The material may also interest a wider audience that is not indifferent to data mining. It does not cover feature engineering, in particular methods such as principal component analysis.
It is well established that a large set of input variables tends to contain redundant information and can lead to overfitting of machine learning models, unless the model has built-in regularization. The informative feature selection stage (hereafter IPR) is therefore often a necessary step in preprocessing the data for an experiment.
- The first part of the article reviews several feature selection methods and considers the theoretical aspects of the process. This section is mostly a systematization of our views.
- The second part uses an artificial data set as an example: we try to select the informative features and compare the results.
- The third part covers the theory and practice of applying information-theoretic measures to the problem. The method presented there is novel and still needs to be checked extensively on a variety of data sets.
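The experiments in this article do not claim to be scientific, because no analytical justification is given for the optimality of any particular method; readers are referred to the primary sources for a more detailed and mathematically rigorous treatment. A further disclaimer: the value of a given method changes from one data set to another, which is exactly what makes the task intellectually appealing.
At the end of the article the experimental results are summarized and a link to the complete R code on Git is provided.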
I thank everyone who read this material before publication, in particular Vlad Shcherbinin and Alexey Seleznev.
1) Methods and approaches for selecting informative features.
Let's turn to the wiki for the general approach to classifying IPR methods.
Algorithms for selecting informative features fall into three groups: wrappers, filters, and embedded methods. (I leave these terms without an exact translation, because a translation would sound ambiguous to the Russian-speaking community. Author's note.)
Wrapper algorithms build subsets by searching the space of possible input variables and evaluate each resulting subset by training a complete model on the available data. Wrapper algorithms are very expensive and carry a risk of overfitting the model. (If no validation sample is used. Author's note.)
Filter algorithms resemble wrappers in that they also search over subsets of the inputs, but instead of running a complete model they estimate the importance of a subset for the output variable with a simpler (filter) algorithm.
Embedded algorithms evaluate the importance of input features with a heuristic that is built into the training procedure.
Examples.
A wrapper IPR algorithm is any combination of methods that searches for a subset of input variables on the selected data, then trains a model on it (a random forest, for example) and evaluates its error by cross-validation. In other words, the whole machine is trained at every iteration (and is in fact already usable in practice).
A filter IPR algorithm is an enumeration of input variables complemented by a statistical test of the relationship between the selected variables and the output. If the inputs and the output are categorical, a chi-squared test of independence between an input (or a combined set of inputs) and the output yields a p-value and, as a result, a binary conclusion about whether the selected set of attributes is significant. Other examples of filter algorithms:
- linear correlation between an input and the output;
- a statistical test of the difference in means, for a categorical input and a continuous output;
- the F-test (analysis of variance).
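As a rough illustration of a filter-type fitness function, the sketch below scores two categorical inputs by the p-value of a chi-squared independence test. The data is synthetic, and the names `signal`, `noise`, and `filter_p_value` are made up for this example; only base R (`table`, `chisq.test`) is assumed.

```r
# Synthetic categorical data: `signal` determines the output most of the time,
# `noise` is unrelated to it. All names here are illustrative.
set.seed(1)
n <- 2000
signal <- sample(c("a", "b", "c"), n, replace = TRUE)
noise  <- sample(c("u", "v", "w"), n, replace = TRUE)
flip   <- runif(n) < 0.2
output <- ifelse(flip, sample(c("a", "b", "c"), n, replace = TRUE), signal)

# Filter fitness: p-value of the chi-squared test of independence
filter_p_value <- function(input, output) {
  chisq.test(table(input, output))$p.value
}

p_signal <- filter_p_value(signal, output)
p_noise  <- filter_p_value(noise, output)
print(c(signal = p_signal, noise = p_noise))
```

A small p-value for `signal` leads to the binary conclusion "significant"; no model is trained at any point, which is what makes filters cheap.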
An embedded IPR algorithm is, for example, the p-values of linear regression coefficients. Here the p-value allows a binary conclusion about whether a coefficient differs significantly from zero. If all inputs of the model are scaled, the absolute value of a weight can be interpreted as an indicator of importance. The R^2 of the model can also be used: it measures how much of the variance of the process is explained by the modeled values. Another example is the input variable importance function built into random forest. For an artificial neural network, the absolute values of the weights attached to the inputs can be used as well. The list does not end here.
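At this stage it is important to understand that this distinction really reflects a difference between IPR fitness functions, that is, between measures of how relevant the discovered subset of input features is to the problem being solved. We will come back to the question of choosing a fitness function later.
Now that we have our bearings in the main groups of IPR methods, I suggest turning to the methods used to enumerate subsets of input variables. Back to the wiki page.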
The search approaches include:
- Exhaustive search
- Best first
- Simulated annealing
- Genetic algorithm
- Greedy forward selection
- Greedy backward elimination
- Particle swarm optimization
- Targeted projection pursuit
- Scatter search
- Variable neighborhood search
I deliberately left some of the algorithm names untranslated, because I am not familiar with their Russian renderings.
Finding a subset of predictors is a discrete problem, because the result is a vector of integers that marks the input indices, of the following form:
input: 1 2 3 4 5 6 ... 1000
selected: 0 0 1 1 1 0 ... 1
We will return to this property later and explain what it leads to in practice.
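A minimal sketch of how such a selection vector maps onto a data set. The data frame `dat` and its column names are hypothetical.

```r
# Hypothetical data frame with five candidate inputs
dat <- data.frame(in1 = 1:3, in2 = 4:6, in3 = 7:9, in4 = 10:12, in5 = 13:15)

# Selection vector: 1 keeps the input, 0 drops it
selected <- c(0, 0, 1, 1, 0)

subset_names <- names(dat)[selected == 1]
subset_dat   <- dat[, selected == 1, drop = FALSE]
print(subset_names)
```

Every search method discussed here is, in effect, a strategy for moving between such 0/1 vectors.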
The outcome of the whole experiment depends heavily on how the search over subsets of input features is organized. To get an intuitive feel for the main difference between these approaches, I suggest splitting them into two groups: greedy and non-greedy.
Greedy search algorithms.
They are used often, because they are fast and give good results in many tasks. The greediness of the algorithm lies in the fact that once a candidate has been selected for (or excluded from) the final subset, it stays there for good (greedy inclusion) or is gone for good (greedy exclusion). So if candidate A is selected at an early iteration, every later subset will contain A together with the other candidates that, jointly with A, improve the subset's importance metric for the output variable. The opposite case is exclusion: if candidate A is removed at an early iteration because its removal hurt the importance metric the least, the researcher never learns the value of the metric for subsets that contain A together with candidates removed later.
Drawing a parallel with the search for a maximum (minimum) in a multidimensional space, a greedy algorithm gets stuck in a local minimum if there is one, or quickly finds the optimal solution if there is a single (global) minimum.
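On the other hand, enumerating all the greedy options is relatively fast and makes it possible to account for some of the interactions between inputs.
Examples of greedy algorithms include forward selection (stepwise forward) and backward elimination (stepwise backward). The list is not limited to these.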
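Greedy forward selection with a wrapper fitness function can be sketched in a few lines of base R. This is an illustrative sketch on synthetic data, not the code used in the experiments below; the helper names `r2_validate` and `forward_select` and the columns `a`, `b`, `noise1`, `noise2` are invented for the example.

```r
# Validation R^2 of a linear model trained on `train` using inputs `vars`
r2_validate <- function(vars, train, valid, target) {
  fit  <- lm(reformulate(vars, response = target), data = train)
  pred <- predict(fit, newdata = valid)
  1 - sum((valid[[target]] - pred)^2) /
      sum((valid[[target]] - mean(valid[[target]]))^2)
}

# Greedy forward selection: once an input is added it is never removed
forward_select <- function(candidates, train, valid, target) {
  chosen <- character(0)
  best   <- -Inf
  repeat {
    remaining <- setdiff(candidates, chosen)
    if (length(remaining) == 0) break
    scores <- sapply(remaining, function(v)
      r2_validate(c(chosen, v), train, valid, target))
    if (max(scores) <= best) break   # no candidate improves the score
    best   <- max(scores)
    chosen <- c(chosen, remaining[which.max(scores)])
  }
  list(subset = chosen, score = best)
}

# Synthetic example: z depends on a and b only
set.seed(42)
n <- 1000
d <- data.frame(a = rnorm(n), b = rnorm(n), noise1 = rnorm(n), noise2 = rnorm(n))
d$z <- 3 * d$a - 2 * d$b + rnorm(n, sd = 0.5)
train <- d[1:700, ]
valid <- d[701:1000, ]

res <- forward_select(c("a", "b", "noise1", "noise2"), train, valid, "z")
print(res)
```

Note that a noise column may still slip into the subset if it happens to raise the validation score slightly; the stopping rule only asks for an improvement, not for a significant one.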
Non-greedy search algorithms.
The operating principle of non-greedy algorithms is the ability to discard fully or partially formed subsets of features, to combine subsets with each other, and to make random changes to a subset, all in order to avoid local minima.
By analogy with searching for the maximum (minimum) of a fitness function in a multidimensional space, non-greedy algorithms examine many more neighboring points and can even make large jumps into random regions.
The behavior of both types of algorithm can be shown graphically.
First, greedy inclusion:

Non-greedy (stochastic) search:

In both cases we need to pick the best combination of two inputs, whose indices are laid out along the axes of the plot. The greedy algorithm starts by choosing the single best input, going through the indices horizontally. It then adds a second input to the chosen one so that their joint relevance is maximal. It turns out, however, that it fully traverses only 1/37 of all possible combinations. If another dimension is added, the share of visited cells becomes even smaller: about 1/37 ^ 2.
At the same time, a perfectly practical situation is possible in which the best combination is not the one the greedy algorithm finds. This can happen when neither of the two inputs shows the best relevance on its own (and so neither is chosen at the first step), even though the pair taken together would be the most relevant.
A non-greedy algorithm takes much longer to search:
O(2^n)
and it checks more of the possible input combinations. But because it sweeps the search across the whole problem at once, it has a chance of finding an even better subset of inputs.
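There are also search algorithms that fall outside the established greedy/non-greedy dichotomy. One example is going through the inputs one at a time and evaluating the individual importance of each for the output variable. Incidentally, this is how the first wave of a greedy inclusion algorithm begins. But what is good about this kind of enumeration, apart from being very fast? Each input variable exists "in a vacuum": the influence of the other inputs on the relationship between the chosen input and the output is not taken into account. When the algorithm finishes, the resulting list of inputs with their individual importance tells us only about each predictor taken alone. Combining some number of the most important predictors from this list can cause several problems:
- redundancy (when predictors are correlated with each other);
- insufficient information, because interactions between predictors were ignored at the selection stage;
- a blurry boundary above which predictors should be kept.
As you can see, the problem is far from trivial.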
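The main question in the IPR problem can be formulated as finding the optimal combination of a subset search method and a fitness function. Let's look at this statement more closely. The IPR problem can be described by two hypotheses:
a) the error surface is either simple or complex;
b) the data contains either simple or complex dependencies.
Depending on the answers to these questions, a specific combination of a search method and a method for determining the relevance of the selected features has to be chosen.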
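Error surface.
An example of a simple surface:
Here, by choosing a combination of two inputs and evaluating its relevance to the output, we move down a smooth surface along the gradient and almost certainly reach the optimum.
An example of a complex surface:
In this case, solving the same problem, we run into many local minima from which a greedy algorithm cannot escape. An algorithm with stochastic search, on the other hand, has a better chance of finding a more accurate solution.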
I mentioned earlier that finding a subset of predictors is a discrete problem. If the dependence of the output on the inputs involves interactions, then moving from one point of the space to the next can produce sharp jumps in the value of the fitness function. The error surface in our case is often neither smooth nor differentiable.

This is an example of computing the relevance function for subsets of two inputs with respect to the output variable. The surface is clearly not smooth: it has peaks and also a plateau of almost equal, slightly uneven values. A nightmare for greedy gradient descent.
Dependencies.
As the dimensionality of the problem grows, so does the theoretical possibility that the dependence of the output variable has a very complex structure and involves many inputs. The dependence can, moreover, be either linear or nonlinear. If it involves interactions between predictors and has a nonlinear form, it can only be found by taking both of these points into account, for example by training a random forest or a neural network. If the dependence is simple and linear and involves only a small share of the predictors, then finding it (and hence doing IPR) can be reduced to including one or several inputs in a linear regression model and evaluating the model's quality.
An example of a simple dependency:

In this case, the dependence of the value on the output axis on input1 and input2 is described by a plane in space:
output = input1 * 10 + input2 * 10
A model of such a dependence is very simple and can be approximated by linear regression.
An example of a complex dependency:

This nonlinear dependence can no longer be detected by building a linear model. It has the form:
output = input1 ^ 2 + input2 ^ 2
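The contrast between the two dependencies can be reproduced numerically. The sketch below evaluates both formulas on a regular grid that is symmetric around zero (the grid and the helper `r_squared` are assumptions of this example) and fits a plain linear model to each.

```r
# Regular grid of two inputs, symmetric around zero
grid <- expand.grid(input1 = seq(-1, 1, by = 0.1),
                    input2 = seq(-1, 1, by = 0.1))
grid$simple  <- grid$input1 * 10 + grid$input2 * 10
grid$complex <- grid$input1^2 + grid$input2^2

# Share of variance explained by a fitted model on its own training data
r_squared <- function(fit, y) {
  1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
}

r2_simple  <- r_squared(lm(simple ~ input1 + input2, data = grid), grid$simple)
r2_complex <- r_squared(lm(complex ~ input1 + input2, data = grid), grid$complex)
# The same linear machinery works once the squared terms are supplied explicitly
r2_squares <- r_squared(lm(complex ~ I(input1^2) + I(input2^2), data = grid),
                        grid$complex)
print(round(c(simple = r2_simple, complex = r2_complex, squares = r2_squares), 4))
```

On this symmetric grid the linear model explains the plane completely and the paraboloid not at all, although the output is fully determined by the inputs in both cases. A linear fitness function would therefore rate both inputs of the second dependency as irrelevant.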
The dimensionality of the problem must also be taken into account.
When the number of input variables is large, searching for the optimal subset with stochastic (non-greedy) methods can become very expensive, because the total number of possible subsets is given by
m = 2 ^ n,
where n is the number of all input features.
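To give a feel for the numbers, the sketch below compares the size of the full search space with an upper bound on the number of subsets a greedy forward search has to score. The chosen values of n are illustrative.

```r
# Number of candidate subsets for n inputs (every input is either in or out)
n_inputs   <- c(2, 12, 37, 100)
exhaustive <- 2^n_inputs

# Upper bound on the subsets scored by greedy forward selection:
# n candidates in the first round, n - 1 in the second, and so on
greedy_forward <- n_inputs * (n_inputs + 1) / 2

print(data.frame(n = n_inputs, exhaustive = exhaustive,
                 greedy_forward = greedy_forward))
```

For 12 inputs, as in the bunny data set used below, this is 4096 subsets against at most 78.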
Searching for a minimum over such a variety of subsets can therefore take a very long time. Greedy search, on the other hand, gives a first approximation in reasonable time, even if it is only a local minimum and the researcher knows it.
Without objective knowledge about the phenomenon under study, it is impossible to say in advance how complex the dependence between the inputs and the output will be, or how many inputs will be optimal for finding an approximate or exact solution to the problem of choosing the best subset of inputs. It is also hard to predict whether the IPR error surface will be smooth and simple or complex and rugged.
We are also always limited in resources and have to make the best decision we can. As a small aid in developing an approach to IPR, the following table can be used:

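So there is always an opportunity to consider several combinations of a search method over input subsets and a relevance fitness function. The most expensive, but probably also the most effective, combination is non-greedy search with a wrapper fitness function. It avoids local minima and, because a trained model (and its accuracy on validation) is obtained at every iteration, it gives the most accurate measure of the relevance of the selected inputs.
The cheapest approach, though not always the most effective, is greedy search combined with a filter function, which can be a statistical test, a correlation coefficient, or mutual information.
In addition, embedded methods make it possible, right after the model has been trained, to drop the inputs that are unnecessary from the algorithm's point of view without a serious loss in modeling accuracy.
It is good practice to solve the problem in several different ways and choose the best one.
Generally speaking, selecting informative features means choosing the best combination of a search method in a multidimensional space and a fitness function that measures the relevance of the selected subset to the output variable.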
2) Experiments on selecting informative features on synthetic data.
The experimental data set (Stanford Bunny):

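I love rabbits.
We will examine how the height of a point (the Z axis) depends on its latitude and longitude. In addition, I added 10 noise variables to the set; their distribution roughly corresponds to a mixture of the two informative inputs (X and Y), but they are unrelated to Z.
Let's look at the density histograms of X, Y, Z, and one of the noise variables.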




The distributions have arbitrary parameters. In addition, all noise variables are distributed so that they have a small peak in a certain range of values.
Next, the data set is randomly split into two parts: training and validation.
Data preparation.
Code:
library(onion)
data(bunny)
Experiment 1: greedy search over input subsets with a linear function evaluating importance (the fitness function is the wrapper variant: the coefficient of determination of the trained model on the validation sample).
Result:
> subset
<b>[1] "x" "y" "input_noise_2" "input_noise_5" "input_noise_6" "input_noise_8" "input_noise_9"</b>
It turned out that noise variables got in.
Let's look at the trained model:
> summary(lm_m)
Call:
lm(formula = z ~ ., data = dat, model = T)
Residuals:
      Min        1Q    Median        3Q       Max
-0.060613 -0.022650 -0.000173  0.024939  0.048544
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.0232453  0.0005581  41.651  < 2e-16 ***
<b>x             -0.0257686  0.0052998  -4.862 1.17e-06 ***
y             -0.1572786  0.0052585 -29.910  < 2e-16 ***</b>
input_noise_2 -0.0017249  0.0027680  -0.623    0.533
input_noise_5 -0.0027391  0.0027848  -0.984    0.325
input_noise_6  0.0032417  0.0027907   1.162    0.245
input_noise_8  0.0044998  0.0027723   1.623    0.105
input_noise_9  0.0006839  0.0027808   0.246    0.806
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02742 on 17965 degrees of freedom
Multiple R-squared: 0.04937, Adjusted R-squared: 0.049
F-statistic: 133.3 on 7 and 17965 DF, p-value: < 2.2e-16
In fact, we can see that only the original inputs and the intercept of the equation reach statistical significance.
Next, we perform greedy exclusion of variables.
Code:
subset <- backward.search(attributes = names(sampleA)[1:(ncol(sampleA) - 1)], eval.fun = linear_fit)
Result:
> subset
<b>[1] "x" "y" "input_noise_2" "input_noise_5" "input_noise_6" "input_noise_8" "input_noise_9"</b>
The model again included noise.
Let's look at the trained model:
> summary(lm_m)
Call:
lm(formula = z ~ ., data = dat, model = T)
Residuals:
      Min        1Q    Median        3Q       Max
-0.060613 -0.022650 -0.000173  0.024939  0.048544
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.0232453  0.0005581  41.651  < 2e-16 ***
<b>x             -0.0257686  0.0052998  -4.862 1.17e-06 ***
y             -0.1572786  0.0052585 -29.910  < 2e-16 ***</b>
input_noise_2 -0.0017249  0.0027680  -0.623    0.533
input_noise_5 -0.0027391  0.0027848  -0.984    0.325
input_noise_6  0.0032417  0.0027907   1.162    0.245
input_noise_8  0.0044998  0.0027723   1.623    0.105
input_noise_9  0.0006839  0.0027808   0.246    0.806
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02742 on 17965 degrees of freedom
Multiple R-squared: 0.04937, Adjusted R-squared: 0.049
F-statistic: 133.3 on 7 and 17965 DF, p-value: < 2.2e-16
Likewise, we can see that only the original inputs are significant in the model.
If we train the model on variables X and Y only, we get:
> print(subset)
<b>[1] "x" "y"</b>
> print(r_sq_validate)
<b>[1] 0.05185492</b>
> summary(lm_m)
Call:
lm(formula = z ~ ., data = dat, model = T)
Residuals:
      Min        1Q    Median        3Q       Max
-0.059884 -0.022653 -0.000209  0.024955  0.048238
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0233808  0.0005129  45.590  < 2e-16 ***
<b>x           -0.0257813  0.0052995  -4.865 1.15e-06 ***
y           -0.1573098  0.0052576 -29.920  < 2e-16 ***</b>
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02742 on 17970 degrees of freedom
Multiple R-squared: 0.04908, Adjusted R-squared: 0.04898
F-statistic: 463.8 on 2 and 17970 DF, p-value: < 2.2e-16
The thing is that on validation R^2 came out higher with the noise variables included.
A strange result! Probably, because of the structure of the data, the noise does not hurt the model.
But we have not yet tried taking the interaction between predictors into account.
Code:
lm_m <- lm(formula = z ~ x * y, data = dat, model = T)
This worked rather well:
> summary(lm_m)
Call:
lm(formula = z ~ x * y, data = dat, model = T)
Residuals:
      Min        1Q    Median        3Q       Max
-0.057761 -0.023067 -0.000119  0.024762  0.049747
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.0196766  0.0006545  30.062   <2e-16 ***
x           -0.1513484  0.0148113 -10.218   <2e-16 ***
y           -0.1084295  0.0075183 -14.422   <2e-16 ***
x:y          1.3771299  0.1517363   9.076   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02736 on 17969 degrees of freedom
Multiple R-squared: 0.05342, Adjusted R-squared: 0.05327
F-statistic: 338.1 on 3 and 17969 DF, p-value: < 2.2e-16
The interaction of X and Y is significant. What about R^2 on validation?
> lm_predict <- predict(lm_m,
+                       newdata = sampleB)
> 1 - sum((sampleB$z - lm_predict) ^ 2) / sum((sampleB$z - mean(sampleB$z)) ^ 2)
<b>[1] 0.05464066</b>
This is the highest value we have seen. Unfortunately, the fitness function did not provide for interaction terms, so we missed this combination of inputs.
Experiment 2: greedy search over input subsets with a linear function evaluating importance (the fitness function is the embedded variant: the F-statistic of the model trained on the training sample).
Code:
linear_fit_f <- function(subset){
  dat <- sampleA[, c(subset, "z")]
  lm_m <- lm(formula = z ~., data = dat, model = T)
  print(subset)
  print(summary(lm_m)$fstatistic[[1]])
  return(summary(lm_m)$fstatistic[[1]])
}
The result of sequential inclusion of variables is a single predictor, Y. The F-statistic was maximal with it, so this variable is highly significant. For some reason, though, variable X was left out.
And now sequential exclusion of variables.
The result is the same: a single variable, Y.
It turned out that maximizing the F-statistic of the multivariate model left all the noise overboard, and the model proved robust: its coefficient of determination on validation is almost the same as that of the best model from experiment 1:
> r_sq_validate
<b>[1] 0.05034534</b>
Experiment 3: evaluating the individual significance of the predictors one at a time with the Pearson correlation coefficient (this is the simplest variant: it ignores interactions, and the fitness function is also simple, since it measures only linear relationships).
Code:
correlation_arr <- data.frame()
for (i in 1:12){
  correlation_arr[i, 1] <- colnames(sampleA)[i]
  correlation_arr[i, 2] <- cor(sampleA[, i], sampleA[, 'z'])
}
Result:
> correlation_arr
               V1            V2
<b>1               x  0.0413782832
2               y -0.2187061876</b>
3   input_noise_1 -0.0097719425
4   input_noise_2 -0.0019297383
5   input_noise_3  0.0002143946
6   input_noise_4 -0.0142325764
7   input_noise_5 -0.0048206943
8   input_noise_6  0.0090877674
9   input_noise_7 -0.0152897433
10  input_noise_8  0.0143477495
11  input_noise_9  0.0027560459
12 input_noise_10 -0.0079526578
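The highest correlation is between Z and Y, and X comes second. The correlation of X is not pronounced, however, so the difference of each variable's correlation coefficient from zero needs to be tested for statistical significance.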
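A minimal sketch of such a test, applied to three of the coefficients from the table. The sample size `n = 17973` is an assumption inferred from the residual degrees of freedom in the model summaries; the statistic is the standard t-test for a Pearson correlation.

```r
# Correlations quoted in the table; n is inferred from the lm summaries
# (17970 residual degrees of freedom plus 3 estimated coefficients)
n <- 17973
r <- c(x = 0.0413782832, y = -0.2187061876, input_noise_3 = 0.0002143946)

# t statistic and two-sided p-value for H0: correlation = 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_val  <- 2 * pt(-abs(t_stat), df = n - 2)

print(data.frame(r = r, t = round(t_stat, 2), p = signif(p_val, 3)))
```

With a sample this large, even a correlation of about 0.04 is clearly distinguishable from zero, whereas the coefficient of `input_noise_3` is not. When the original data is at hand, `cor.test()` from base R performs the same test directly.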
On the other hand, in none of the three experiments did we account for the interaction of the predictors (X * Y) at all. This may explain why evaluating their significance one at a time, or including the predictors linearly in the equation, does not give a clear answer.
An experiment like this:
> cor(sampleA$x * sampleA$y, sampleA$z)
<b>[1] 0.1211382</b>
shows that the interaction of X and Y correlates with Z quite strongly.
Experiment 4: evaluating predictor importance with an algorithm built into the machine (a variant of greedy search with the embedded fitness function of input importance in GBM).
We train Gradient Boosted Trees (gbm) and look at the variable importance. A detailed article on using GBM:
Gradient boosting machines, a tutorial.
We pick the training parameters off the top of our heads and set a very low learning rate to avoid strong overfitting. Note that decision trees are greedy, and that improving the model by adding many trees is achieved by sampling observations and inputs.
Code:
library(gbm)
gbm_dat <- bunny_dat[, c("x", "y",
                         "input_noise_1", "input_noise_2", "input_noise_3",
                         "input_noise_4", "input_noise_5", "input_noise_6",
                         "input_noise_7", "input_noise_8", "input_noise_9",
                         "input_noise_10", "z")]
gbm_fit <- gbm(formula = z ~.,
               distribution = "gaussian",
               data = gbm_dat,
               n.trees = 500,
               interaction.depth = 12,
               n.minobsinnode = 100,
               shrinkage = 0.0001,
               bag.fraction = 0.9,
               train.fraction = 0.7,
               n.cores = 6)
gbm.perf(object = gbm_fit, plot.it = TRUE, oobag.curve = F, overlay = TRUE)
summary(gbm_fit)
Result:
> summary(gbm_fit)
                          var rel.inf
<b>y                           y 69.7919
x                           x 30.2081</b>
input_noise_1   input_noise_1  0.0000
input_noise_2   input_noise_2  0.0000
input_noise_3   input_noise_3  0.0000
input_noise_4   input_noise_4  0.0000
input_noise_5   input_noise_5  0.0000
input_noise_6   input_noise_6  0.0000
input_noise_7   input_noise_7  0.0000
input_noise_8   input_noise_8  0.0000
input_noise_9   input_noise_9  0.0000
input_noise_10 input_noise_10  0.0000
This approach handled the task completely: it singled out the non-noise inputs and gave every other input zero importance.
Note also that setting up this experiment is very quick and everything works almost out of the box. Planning it more carefully, including cross-validation to find the optimal training parameters, is more involved, but it has to be done when preparing a real model for production.
Experiment 5: evaluating predictor importance using stochastic search with a linear function evaluating importance (a non-greedy search in the input space; the fitness function is the wrapper variant: the coefficient of determination of the trained model on the validation sample).
This time the linear model being trained includes pairwise interactions between the predictors.
What happened:
<b>[1] "5.53%"</b>
> final_vector <- c((sao$par >= threshold), T)
> names(sampleA)[final_vector]
<b>[1] "x" "y" "input_noise_7" "input_noise_8" "input_noise_9" "z" </b>
> summary(lm_m)
Call:
lm(formula = z ~ .^2, data = sampleA[, final_vector], model = T)
Residuals:
      Min        1Q    Median        3Q       Max
-0.058691 -0.023202 -0.000276  0.024953  0.050618
Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                  0.0197777  0.0007776  25.434   <2e-16 ***
<b>x                           -0.1547889  0.0154268 -10.034   <2e-16 ***
y                           -0.1148349  0.0085787 -13.386   <2e-16 ***</b>
input_noise_7               -0.0102894  0.0071871  -1.432    0.152
input_noise_8               -0.0013928  0.0071508  -0.195    0.846
input_noise_9                0.0026736  0.0071910   0.372    0.710
<b>x:y                          1.3098676  0.1515268   8.644   <2e-16 ***</b>
x:input_noise_7              0.0352997  0.0709842   0.497    0.619
x:input_noise_8              0.0653103  0.0714883   0.914    0.361
x:input_noise_9              0.0459939  0.0716704   0.642    0.521
y:input_noise_7              0.0512392  0.0710949   0.721    0.471
y:input_noise_8              0.0563148  0.0707809   0.796    0.426
y:input_noise_9             -0.0085022  0.0710267  -0.120    0.905
input_noise_7:input_noise_8  0.0129156  0.0374855   0.345    0.730
input_noise_7:input_noise_9  0.0519535  0.0376869   1.379    0.168
input_noise_8:input_noise_9  0.0128397  0.0379640   0.338    0.735
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0274 on 17957 degrees of freedom
Multiple R-squared: 0.05356, Adjusted R-squared: 0.05277
F-statistic: 67.75 on 15 and 17957 DF, p-value: < 2.2e-16
We can see that we missed: noise got in.
As you can see, the best value of the coefficient of determination on validation was reached with noise variables included. The search algorithm, for its part, converged fully:

Let's try changing the fitness function while keeping the search method.
Experiment 6: evaluating predictor importance using stochastic search with a linear function evaluating importance (a non-greedy search in the input space; the fitness function is embedded: the p-values of the model coefficients).
We select the set of predictors that minimizes the mean p-value of the coefficients included in the model.
Result:
> percent(- sao$value)
<b>[1] "-4.7e-208%"</b>
> final_vector <- c((sao$par >= threshold), T)
> names(sampleA)[final_vector]
<b>[1] "y" "z"</b>
This time everything worked. Only original predictors were selected (here just Y; z is the output column), because their p-values are very small.
The algorithm converges well within 60 seconds:
Experiment 7: evaluating predictor importance using greedy search, with importance measured by the quality of the trained model (a greedy search in the input space; the fitness function is a wrapper: the coefficient of determination of boosted decision trees on the validation sample).
Result of greedy inclusion of predictors:
> subset
<b>[1] "x" "y"</b>
> r_sq_validate
<b>[1] 0.2363794</b>
Right on target!
Result of greedy exclusion of predictors:
> subset
<b> [1] "x" "y" "input_noise_1" "input_noise_2" "input_noise_3" "input_noise_4" "input_noise_5" "input_noise_6" "input_noise_7"
[10] "input_noise_9" "input_noise_10"</b>
> r_sq_validate
<b>[1] 0.2266737</b>
It got worse. Still, including the noise predictors in the model did not reduce the prediction quality on validation by much. There is an explanation for this: a random decision forest has built-in regularization and can ignore uninformative inputs during training.
This concludes the section on IPR experiments with standard methods. In the next section I justify and demonstrate the practical use of a method based on information metrics, which is statistically sound and does its job well.
3) Using information theory to build a fitness function for IPR.
The key question of this section is how to describe the notion of dependence and formulate it in information-theoretic terms.
We need to start with the notion of information entropy. Entropy (Shannon entropy) is a synonym for uncertainty. The more uncertain we are about the value a random variable will take, the more entropy (another synonym for information) a realization of that variable carries. In the coin flip example, of all possible coins the symmetric one has the highest entropy, because the uncertainty about the outcome of the next flip is greatest.
The Shannon entropy formula:
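For reference, the standard definition for a discrete random variable $X$ that takes values $x_1, \dots, x_n$ with probabilities $p(x_i)$ is the following (the notation is the conventional one and is an assumption of this note; with a base-2 logarithm the result is in bits):

```latex
H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)
```

For a symmetric coin, $p = (1/2, 1/2)$ and $H(X) = 1$ bit; for a coin that lands on one side 90% of the time, $H(X) \approx 0.47$ bits.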

What is dependence?
Suppose we flip a coin several times. Can we say that, after seeing the outcome of the previous flip, our uncertainty about the next one has decreased?
Suppose there is a coin that lands heads with probability 2/3 after tails, and tails with probability 2/3 after heads. The unconditional frequency of heads and tails still remains 50/50.
For such a coin, after heads the frequency of tails is no longer 1/2, and vice versa. So our uncertainty about the outcome of the next flip has decreased (we no longer expect 50/50).
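The coin with memory can be put into numbers. The sketch below computes the entropy of the next flip with and without knowledge of the previous one; `entropy_bits` is a helper defined for this example, not a function from the article's code.

```r
# Entropy in bits of a discrete distribution given as a probability vector
entropy_bits <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Coin with memory: the next flip differs from the previous one with probability 2/3
p_next_given_prev <- c(same = 1/3, different = 2/3)

h_unconditional  <- entropy_bits(c(0.5, 0.5))        # heads/tails are 50/50 overall
h_conditional    <- entropy_bits(p_next_given_prev)  # previous flip is known
information_gain <- h_unconditional - h_conditional

print(round(c(unconditional = h_unconditional,
              conditional = h_conditional,
              gain = information_gain), 4))
```

Knowing the previous flip removes a little under a tenth of a bit of uncertainty. This difference is exactly the mutual information between two consecutive flips.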
To understand the phenomenon of dependence, recall that probability theory defines independence as follows:
p(x, y) = p(x) * p(y)
That is, the probability that the events occur jointly equals the product of the probabilities of the events themselves.
If this holds, the events are independent in the mathematical sense. If
p(x, y) != p(x) * p(y)
then the events are not mathematically independent.
The same principle underlies the formula used in information theory to measure the relationship between two (or more) random variables.
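I stress that dependence here is understood in the probabilistic sense. Analyzing causality requires a more comprehensive study, which includes both an analysis of spurious correlations and redundancy (using mutual information) and the involvement of expert knowledge about the object under study.
Mutual information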

Mutual information can also be expressed through entropy:
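In standard notation (an assumption of this note, for discrete $X$ and $Y$ with joint distribution $p(x, y)$), the definition of mutual information and its expression through entropies read:

```latex
I(X;Y) = \sum_{x} \sum_{y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)}

I(X;Y) = H(X) + H(Y) - H(X, Y) = H(Y) - H(Y \mid X)
```

When $p(x, y) = p(x)\,p(y)$ for all pairs, every logarithm in the first expression is zero, which ties the measure back to the definition of independence.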

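Put simply, mutual information is the amount of entropy (uncertainty) that leaves the system when a predictor variable (or several) is available. For example, suppose the entropy of a random variable is 3 bits and the mutual information is 2 bits. This means that 2/3 of the uncertainty about the realization of the random variable is compensated by the presence of the predictor.
Mutual information has the following properties:
- it is symmetric
- it is defined for both discrete and continuous variables
- it vanishes when X and Y are independent
- when two variables fully determine each other, the normalized measure of mutual information equals 1.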
Mutual information satisfies some of the requirements for an ideal measure of dependence formulated in:
Granger, C., E. Maasoumi and J. Racine, "A dependence metric for possibly nonlinear processes", Journal of Time Series Analysis 25, 2004, 649-669.
There is one property of mutual information (MI) that is useful to know when applying this measure:
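- Mutual information (MI) can reach a maximum equal to the smaller of the entropies of the input and the output (the predictor and the dependent variable).
For example, if the input variable has an entropy of 10 bits and the output has 3 bits, the most information the input can convey about the output is 3 bits. That is the maximum MI possible in such a system.
The other case is an input with 3 bits of entropy and an output with 10 bits. The most information the input can convey about the output is again 3 bits, and that is the maximum possible MI.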
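Dividing the mutual information (MI) by the entropy of the dependent variable gives a value in the range [0, 1] which, like a correlation coefficient, shows how strongly the input variable determines the dependent variable, regardless of the scale of the values.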
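A minimal sketch of this normalized measure for discrete variables, using plug-in (frequency-based) estimates. The helper names and the toy vectors are invented for the example and are not taken from the article's code.

```r
# Plug-in entropy (bits) of a discrete vector
entropy_bits <- function(x) {
  p <- as.numeric(table(x)) / length(x)
  -sum(p * log2(p))
}

# Mutual information via I = H(X) + H(Y) - H(X, Y)
mutual_info_bits <- function(x, y) {
  entropy_bits(x) + entropy_bits(y) - entropy_bits(paste(x, y, sep = "|"))
}

# Share of the output's entropy accounted for by the input, in [0, 1]
normalized_mi <- function(input, output) {
  mutual_info_bits(input, output) / entropy_bits(output)
}

# Toy data: 8 equally likely input levels (3 bits); the output keeps only
# the parity of the input (1 bit); `unrelated` is independent of the output
input     <- rep(0:7, times = 10)
output    <- input %% 2
unrelated <- rep(c(0, 0, 0, 0, 1, 1, 1, 1), each = 10)

print(c(mi            = mutual_info_bits(input, output),
        norm_input    = normalized_mi(input, output),
        norm_unrelated = normalized_mi(unrelated, output)))
```

Here the 3-bit input fully determines the 1-bit output, so the mutual information is capped at 1 bit (the smaller entropy) and the normalized value is 1. On real, finite samples the plug-in estimate is biased upward, so values slightly above zero should not be read as evidence of dependence without a correction or a significance test.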
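There are many arguments in favor of using mutual information (MI), but it also comes with a number of trade-offs.
- The method can find dependencies of any form (linear and nonlinear) in a space of any dimension.
- The method screens out all input variables whose joint or individual information content is not statistically significant.
- A weakness of the method is that, once certain numerically computed corrections have been applied to the MI metric, weak dependencies that barely exceed the level of statistical noise are hidden from the researcher.
- With a small number of significant discrete input variables, the researcher can build a set of human-readable rules that make the discovered dependency interpretable.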
- ( random forest busting machines), « » , , ;
- , , -, ;
- , , , .
.
, , . , , , .
â (Multiinformation) (Total Correlation).
Source:
:

:
Watanabe S (1960). Information theoretical analysis of multivariate correlation, IBM Journal of Research and Development 4, 66-82
(
) , . , ( 1 n) ( 1 m), . , - .
, .
, , . :
- 1) . , â , , , â . , , 2. , , .
- 2) . , , . , , , â .
- 3) . , .
., , :
, . . , . ., , 1973 â 512 .
.. â « ».
, .
, , , , .
, n N.
. , .
100 .

1000 .

, , .
, , . , . .:
1.2., , . . , ( ). , .
, .. . , , . .
( ), .
. . , , , . , 1 000 51% , - . .
.- .
- ) .
- ) , , , ( ) « » .
- ) numeric , [0, 1] (0) (1) â SA.
- ) , , -, .
- ) - ; (, 0.9) .
- ) . ããå€ãã®ãããè¯ãã
- ) , .
«» . â , :
optim_var_num <- log(x = sample_size / 100, base = round(mean_levels, 0))
, ,
, , , n , n 100. , , , .
, , .
, , ( ), :
threshold <- 1 - optim_var_num / predictor_number
, . .
, : , .
17 973 , 12 , 5 . , 3,226.
, 0,731.
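The two formulas above can be checked against the figures quoted for the bunny data. In the sketch below the roles assigned to the three numbers (17 973 observations, 12 predictors, 5 levels per discretized variable) are an interpretation based on the variable names in the formulas.

```r
# Figures quoted in the text; their roles are inferred from the variable names
sample_size      <- 17973
predictor_number <- 12
mean_levels      <- 5

optim_var_num <- log(x = sample_size / 100, base = round(mean_levels, 0))
threshold     <- 1 - optim_var_num / predictor_number

print(round(c(optim_var_num = optim_var_num, threshold = threshold), 3))
```

By construction, `round(mean_levels, 0) ^ optim_var_num` equals `sample_size / 100`, so the first formula can be read as the number of variables whose level combinations still leave about 100 observations per combination.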
?

3 . , 5 ^ 3,226 178 , 100 .
.
â8 , () ( , - â ).0,9 ( ) 10 -, 1 1 ( ). 10 .
Result:
> percent(- sao$value)
<b>[1] "18.1%"</b>
> final_vector <- c((sao$par >= threshold), T)
> names(sampleA)[final_vector]
<b>[1] "x_discrete" "y_discrete" "input_noise_2_discrete" "z_discrete" </b>
> final_sample <- as.data.frame(sampleA[, final_vector])
>
> Sys.time() - start
Time difference of 10.00453 mins
ã»ãšãã©èµ·ãã£ãã . â .
20 , - 1 .
Result:
> percent(- sao$value)
<b>[1] "18.2%"</b>
> final_vector <- c((sao$par >= threshold), T)
> names(sampleA)[final_vector]
<b>[1] "x_discrete" "y_discrete" "input_noise_1_discrete" "z_discrete" </b>
> final_sample <- as.data.frame(sampleA[, final_vector])
>
> Sys.time() - start
Time difference of 20.00585 mins
â .
18% . .
( ):
Minimum-redundancy-maximum-relevance (mRMR). :
«Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,»
Hanchuan Peng, Fuhui Long, and Chris Ding
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 27, No. 8, pp.1226-1238, 2005, ( ) .
.() , () .
() , , , . , .
() , . , ( ).
, , , , , ( ) , . , mRMR , .
() , â limited sampling bias.
mRMR . , , , , .
.
, , GBM, p-values .
, . , , , . , XY-Noise1 . , ( ), Align Technology. , , , Gradient Boosted Trees (
. ).
, , , . , , , , .
, , . . , , .
. , , , , . , .
github:
Git:
- Greedy Function Approximation: A Gradient Boosting Machine by J. Friedman
- , . . , . ., , 1973 â 512 .
- .. â « ».
- Analytical estimates of limited sampling biases in different information measures by S. Panzeri et al.
- Estimation of Entropy and Mutual Information by L. Paninski
- Entropy-Based Independence Test by A. Dionísio et al.
- , , DJI, .
- Measuring dependence powerfully and equitably by Y. Reshef et al.
- «Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,» Hanchuan Peng, Fuhui Long, and Chris Ding IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 8, pp.1226-1238, 2005
- Watanabe S (1960). Information theoretical analysis of multivariate correlation, IBM Journal of Research and Development 4, 66-82
- Gradient boosting machines, a tutorial.
- Claude E. Shannon, Warren Weaver. The Mathematical Theory of Communication. Univ of Illinois Press, 1949