
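In this article I would like to explain how principal component analysis (PCA) works, from the point of view of the intuition behind its mathematical apparatus. As simply as possible, but in detail.
Mathematics is in general a very beautiful and elegant science, but its beauty is sometimes hidden behind many layers of abstraction. In the end, everything turns out to be much simpler than it looks at first glance; the most important thing is to understand it and picture it.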
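In data analysis, as in any other kind of analysis, it often helps to build a simplified model that describes the real situation as accurately as possible. Features frequently depend on one another quite strongly, so having all of them at once is redundant.
For example, fuel consumption here is measured in liters per 100 kilometers, while in the United States it is measured in miles per gallon. At first glance the two quantities are different, but in fact each is strictly determined by the other: a mile is about 1,600 m and a gallon is about 3.8 L. One feature strictly depends on the other.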
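As a small illustration of such a strict dependence, the sketch below converts between the two fuel-consumption units. The function names and the precise conversion factors are assumptions added for this example (the text itself rounds them to 1,600 m and 3.8 L).

```python
# Assumed conversion factors (more precise than the rounded figures in the text)
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784


def l_per_100km_to_mpg(l_per_100km):
    """Convert liters per 100 km to miles per US gallon."""
    km_per_liter = 100.0 / l_per_100km
    return km_per_liter / KM_PER_MILE * LITERS_PER_US_GALLON


def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon back to liters per 100 km."""
    km_per_liter = mpg * KM_PER_MILE / LITERS_PER_US_GALLON
    return 100.0 / km_per_liter


consumption = [5.0, 8.0, 12.0]                       # made-up values, L/100 km
mpg = [l_per_100km_to_mpg(c) for c in consumption]   # fully determined by `consumption`
```

Because one column can be computed exactly from the other, keeping both adds no information to a model.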
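Far more often, however, features depend on each other less strictly and (this is important!) less obviously. Engine displacement generally has a positive effect on acceleration to 100 km/h, but this is not always true. And once you account for factors that are invisible at first glance (better fuel quality, lighter materials and other modern advances), it may turn out that the car's model year also affects its acceleration, though not strongly.
Knowing the dependencies and their strength, we can express several features through a single one, merging them so to speak, and work with a simpler model. Of course, some loss of information will most likely be unavoidable, but PCA is exactly the method that helps keep it to a minimum.
More strictly speaking, the method approximates the n-dimensional cloud of observations by an ellipsoid (also n-dimensional) whose semi-axes become the future principal components. Projecting onto these axes (dimensionality reduction) preserves the largest possible amount of information.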
Step 1. Data preparation
To keep the example simple, I will not use a real training dataset with dozens of features and hundreds of observations; instead I will build the simplest possible toy example. Two features and 10 observations are enough to describe what happens in the depths of the algorithm and, most importantly, why.
Let's generate the sample:

    import numpy as np

    x = np.arange(1, 11)
    y = 2 * x + np.random.randn(10) * 2  # y depends on x linearly, plus Gaussian noise
    X = np.vstack((x, y))                # shape (2, 10): one row per feature
    print(X)

    OUT:
    [[  1.   2.   3.   4.   5.   6.   7.   8.   9.  10. ]
     [  2.73446908   4.35122722   7.21132988  11.24872601   9.58103444  12.09865079  13.78706794  13.85301221  15.29003911  18.0998018 ]]
This sample has two features that are strongly correlated with each other. With PCA we can easily find a combination feature and, at the cost of some information, express both of these features with a single new one. So let's dig in!
First, a little statistics. Recall that moments are used to describe a random variable. The ones we need are the expected value and the variance. You can say that the expected value is the "center of gravity" of the variable and the variance is its "size". Roughly speaking, the expected value sets the position of the random variable, and the variance determines its size (more precisely, its spread).
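To make "position" and "size" concrete, here is a minimal sketch on made-up numbers: shifting a sample moves its mean but leaves its variance untouched. The `ddof` argument of NumPy's `var` selects between dividing by n and by n - 1.

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up numbers

center = sample.mean()                 # estimate of the expected value ("center of gravity")
spread_biased = sample.var()           # variance dividing by n
spread_unbiased = sample.var(ddof=1)   # variance dividing by n - 1

shifted = sample + 100.0               # move the whole sample along the axis
shifted_center = shifted.mean()        # the position changes...
shifted_spread = shifted.var()         # ...but the spread does not
```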
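To minimize the loss of information, our vector must pass through the center of the sample, so projecting onto the vector does not affect the mean values. It is therefore harmless to center the sample, that is, to shift it linearly so that the mean of each feature equals 0. This greatly simplifies the calculations that follow (although it is worth noting that you can do without centering).
The inverse of the shift operator is the vector of the original means; we will need it to restore the sample in its original dimensions.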
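A minimal sketch of centering and of undoing it, on illustrative data (the numbers and the variable names `Xc` and `X_back` are assumptions for this example). Features are stored in rows, as elsewhere in the article.

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.1, 3.9, 6.2, 7.8]])   # illustrative data: one row per feature

m = X.mean(axis=1, keepdims=True)      # column vector of feature means, shape (2, 1)
Xc = X - m                             # centered sample: every feature now has mean 0
X_back = Xc + m                        # adding the mean vector undoes the shift
```

Keeping `m` with `keepdims=True` lets NumPy broadcast it across all observations in both directions.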

    Xcentered = (X[0] - x.mean(), X[1] - y.mean())  # subtract each feature's mean
    m = (x.mean(), y.mean())                         # keep the means to undo the shift later
    print(Xcentered)
    print("Mean vector: ", m)

    OUT:
    (array([-4.5, -3.5, -2.5, -1.5, -0.5,  0.5,  1.5,  2.5,  3.5,  4.5]), array([-8.44644233, -8.32845585, -4.93314426, -2.56723136,  1.01013247,  0.58413394,  1.86599939,  7.00558491,  4.21440647,  9.59501658]))
    Mean vector:  (5.5, 10.314393916)
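Variance, on the other hand, depends strongly on the order of magnitude of the values of the random variable, i.e. it is sensitive to scaling. So if the units of measurement of the features differ greatly in order of magnitude, it is strongly recommended to standardize them. In our case the orders of magnitude do not differ much, so for the sake of simplicity we skip this operation.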
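For cases where standardization is needed, one common way to do it is z-scoring: center each feature and divide by its standard deviation. The sketch below uses invented engine-volume and acceleration figures purely for illustration.

```python
import numpy as np

# Hypothetical features on very different scales (values invented for illustration):
# row 0: engine volume in cm^3, row 1: 0-100 km/h time in seconds
X = np.array([[1200.0, 1600.0, 2000.0, 3000.0],
              [14.0, 11.5, 9.0, 6.5]])

raw_var = X.var(axis=1, ddof=1)                  # dominated by the first feature

Xc = X - X.mean(axis=1, keepdims=True)           # center each feature
Z = Xc / X.std(axis=1, ddof=1, keepdims=True)    # scale each feature to unit variance
std_var = Z.var(axis=1, ddof=1)                  # both features now have variance 1
```

Without this step the first principal component would point almost entirely along the engine-volume axis simply because its numbers are larger.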
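Step 2. Covariance matrix
For a multidimensional random variable (a random vector), the position of the center is still given by the expected values of its projections onto the axes. But the variances along the axes are no longer enough to describe its shape. Look at these graphs: all three random variables have the same expected values and variances, and their projections onto the axes are on the whole the same!
To describe the shape of a random vector we need the covariance matrix.
This is the matrix whose (i, j) element is the covariance of the features (X_i, X_j). Recall the covariance formula:
$$\operatorname{Cov}(X_i, X_j) = E\big[(X_i - E(X_i))\,(X_j - E(X_j))\big] = E(X_i X_j) - E(X_i)\,E(X_j)$$
In our case it simplifies, because E(X_i) = E(X_j) = 0:
$$\operatorname{Cov}(X_i, X_j) = E(X_i X_j)$$
Note that when X_i = X_j:
$$\operatorname{Cov}(X_i, X_i) = \operatorname{Var}(X_i)$$
and this holds for any random variable.
Thus the diagonal of the matrix holds the variances of the features (since i = j), and the remaining cells hold the covariances of the corresponding pairs of features. And because covariance is symmetric, the matrix is symmetric too.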
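The structure described above can be checked by building the matrix by hand. This is a sketch on freshly generated data in the same spirit as the toy sample; the fixed seed is an assumption added so that the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)                  # fixed seed, an assumption for repeatability
x = np.arange(1, 11, dtype=float)
y = 2 * x + rng.normal(scale=2.0, size=10)
X = np.vstack((x, y))                           # one row per feature

Xc = X - X.mean(axis=1, keepdims=True)          # centered sample
n = Xc.shape[1]                                 # number of observations

# For centered data Cov(X_i, X_j) = E(X_i X_j); the unbiased estimate divides by n - 1
cov_manual = Xc @ Xc.T / (n - 1)
cov_numpy = np.cov(X)                           # NumPy centers the data itself
```

The diagonal of `cov_manual` holds the feature variances, and the two off-diagonal cells are equal.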
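Note: the covariance matrix is the generalization of variance to multidimensional random variables; like variance, it describes the shape (spread) of the random variable.
Indeed, the variance of a one-dimensional random variable is a 1x1 covariance matrix whose only element is given by Cov(X, X) = Var(X).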
So let's build the covariance matrix Σ for our sample. To do this we compute the variances of X_i and X_j and their covariance. We could use the formula above, but since we are armed with Python it would be a sin not to use the numpy.cov(X) function. It takes as input a list of all the features of the random variable and returns its covariance matrix, where X is an n-dimensional random vector (n is the number of rows). The function works well for computing the unbiased variance, the covariance of two variables, and the whole covariance matrix.
(Recall that in Python a matrix is represented as a column array of row arrays.)

    covmat = np.cov(Xcentered)  # rows are features, columns are observations
    print(covmat, "\n")
    print("Variance of X: ", covmat[0, 0])
    print("Variance of Y: ", covmat[1, 1])
    print("Covariance X and Y: ", covmat[0, 1])

    OUT:
    [[  9.16666667  17.93002811]
     [ 17.93002811  37.26438587]]

    Variance of X:  9.16666666667
    Variance of Y:  37.2643858743
    Covariance X and Y:  17.9300281124
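Step 3. Eigenvectors and eigenvalues (eigenpairs)
OK, we now have a matrix describing the shape of our random variable, from which we can get its size along x and y (i.e. X_1 and X_2) and its approximate shape in the plane. Next we need to find a vector (in our case just one) that maximizes the size (variance) of the projection of our sample onto it.
Note: the generalization of variance to higher dimensions is the covariance matrix, and the two concepts are equivalent. When projecting onto a vector, the variance of the projection is maximized; when projecting onto a space of higher dimension, its entire covariance matrix is maximized.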
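So, take a unit vector v onto which we will project our random vector X. The projection onto it is then v^T X, and the variance of the projection is Var(v^T X). In general, in vector form (for centered variables), the variance is expressed as:
$$\operatorname{Var}(X) = \Sigma = E(X X^T)$$
Accordingly, the variance of the projection is:
$$\operatorname{Var}(v^T X) = E\big((v^T X)(v^T X)^T\big) = E(v^T X X^T v) = v^T E(X X^T)\, v = v^T \Sigma v$$
The variance of the projection is therefore maximized when v^T Σ v is at its maximum. Here the Rayleigh quotient helps us. Without going too deep into the mathematics, I will just say that for a symmetric matrix M such as our covariance matrix, the Rayleigh quotient has a special case when x is an eigenvector of M:
$$R(M, x) = \frac{x^T M x}{x^T x} = \lambda \frac{x^T x}{x^T x} = \lambda$$
and
$$M x = \lambda x$$
The last formula should be familiar from the topic of decomposing a matrix into eigenvectors and eigenvalues: x is an eigenvector and λ is an eigenvalue. The number of eigenvectors and eigenvalues equals the size of the matrix (and the values may repeat).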
Incidentally, in English these are called eigenvalues and eigenvectors, which to my ear sounds much more beautiful (and more concise) than the terms in my own language.
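Thus the direction of maximum variance of the projection always coincides with the eigenvector that has the largest eigenvalue, and that eigenvalue equals the value of this variance.
The same holds for projections onto more dimensions: the variance (covariance matrix) of the projection onto an m-dimensional space is maximal in the direction of the m eigenvectors with the largest eigenvalues.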
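This claim can be checked numerically without any theory. The sketch below takes the covariance matrix printed in step 2, evaluates v^T Σ v for a dense sweep of unit vectors, and compares the best direction with the top eigenvector. The brute-force sweep is only an illustration, not how PCA is computed in practice.

```python
import numpy as np

# Covariance matrix printed in step 2
S = np.array([[9.16666667, 17.93002811],
              [17.93002811, 37.26438587]])

# Unit vectors at angles from 0 to pi (a direction and its opposite are equivalent)
angles = np.linspace(0.0, np.pi, 100001)
V = np.vstack((np.cos(angles), np.sin(angles)))   # each column is a unit vector v

proj_var = np.sum(V * (S @ V), axis=0)            # v^T S v for every column
best = V[:, np.argmax(proj_var)]                  # direction with the largest projected variance

eigvals, eigvecs = np.linalg.eigh(S)              # eigh returns eigenvalues in ascending order
top = eigvecs[:, -1]                              # eigenvector of the largest eigenvalue
```

The largest value in `proj_var` matches the largest eigenvalue, and `best` is parallel to `top` up to sign.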
Our sample has dimension 2, so it has 2 eigenvectors. Let's find them.
The numpy library implements the function numpy.linalg.eig(X), where X is a square matrix. It returns two arrays: an array of eigenvalues and an array of eigenvectors (as column vectors). The vectors are normalized, so their length is 1, which is exactly what we need. These two vectors define a new basis for the sample whose axes coincide with the semi-axes of the ellipse approximating our sample.
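One practical detail: `numpy.linalg.eig` does not promise any particular order of the eigenvalues, so it is safer to sort the eigenpairs explicitly before picking "the largest vector". A minimal sketch on the covariance matrix from step 2:

```python
import numpy as np

# Covariance matrix printed in step 2
S = np.array([[9.16666667, 17.93002811],
              [17.93002811, 37.26438587]])

vals, vecs = np.linalg.eig(S)           # eigenvalues and eigenvectors (as columns)

order = np.argsort(vals)[::-1]          # indices from the largest eigenvalue to the smallest
vals = vals[order]
vecs = vecs[:, order]                   # reorder the columns to match

lengths = np.linalg.norm(vecs, axis=0)  # each eigenvector has unit length
gram = vecs.T @ vecs                    # identity matrix: the new basis is orthonormal
```

For symmetric matrices such as a covariance matrix, `numpy.linalg.eigh` is an alternative that returns the eigenvalues in ascending order.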

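In this graph we approximated the sample with an ellipse whose radii are 2 sigma, which should contain the large majority of the observations (about 86% for a two-dimensional Gaussian; the familiar 95% figure applies to 2 sigma in one dimension), and that is roughly what we observe here. I inverted the larger vector (the eig(X) function pointed it the other way): what matters to us is the direction of the vector, not its orientation.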
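Step 4. Dimensionality reduction (projection)
The largest vector has a direction similar to the regression line, and by projecting our sample onto it we lose an amount of information comparable to the sum of the residual terms of the regression (except that the distance is now Euclidean rather than a delta along Y). In our case the dependence between the features is very strong, so the loss of information is minimal. The "price" of the projection, the variance along the smaller eigenvector, is very small, as the previous graph shows.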
Note: the diagonal elements of the covariance matrix show the variances along the original basis, while its eigenvalues show the variances along the new basis (along the principal components).
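It is often necessary to estimate the amount of lost (and preserved) information. Percentages are the most convenient way to express it. We take the variance along each axis and divide it by the total variance along all axes (i.e. the sum of all eigenvalues of the covariance matrix).
Thus our larger vector describes 45.994 / 46.431 * 100% = 99.06%, and the smaller one, correspondingly, about 0.94%. By discarding the smaller vector and projecting the data onto the larger one, we lose less than 1% of the information. An excellent result!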
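The percentages above can be reproduced directly from the covariance matrix printed in step 2. A short sketch (the variable names are chosen for this example):

```python
import numpy as np

# Covariance matrix printed in step 2
S = np.array([[9.16666667, 17.93002811],
              [17.93002811, 37.26438587]])

eigvals = np.linalg.eigvalsh(S)[::-1]   # eigenvalues, largest first
total = eigvals.sum()                   # equals the trace of S, i.e. the total variance
ratio = eigvals / total                 # share of the total variance along each component

for value, share in zip(eigvals, ratio):
    print(f"variance {value:.3f} -> {share * 100:.2f}% of the total")
```

The sum of the eigenvalues equals the sum of the diagonal of the covariance matrix, so the total variance is the same in the old and the new basis.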
Note: in practice, in most cases, if the total loss of information is no more than 10-20%, you can safely reduce the dimensionality.
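This rule of thumb can be turned into a small helper that picks the number of components for a given tolerated loss. The function name `components_needed` and the five-value spectrum are hypothetical, introduced only for this sketch.

```python
import numpy as np


def components_needed(eigvals, max_loss=0.1):
    """Smallest number of leading components whose discarded variance share is <= max_loss."""
    vals = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # largest eigenvalue first
    kept = np.cumsum(vals) / vals.sum()                      # cumulative share of variance kept
    k = int(np.searchsorted(kept, 1.0 - max_loss)) + 1       # first position reaching the target
    return min(k, len(vals))                                 # guard against rounding at the end


# Hypothetical eigenvalue spectrum of a 5-feature dataset (invented numbers)
spectrum = [5.0, 3.0, 1.5, 0.4, 0.1]
k_10 = components_needed(spectrum, max_loss=0.10)   # tolerate losing 10% of the variance

# Eigenvalues of the toy example from this article
k_toy = components_needed([45.994, 0.437], max_loss=0.10)
```

With the invented spectrum, three components keep 95% of the variance; for the toy sample a single component is enough.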
To perform the projection, as mentioned in step 3, we need to carry out the operation v^T X (the vector must have length 1). Or, if we have not a single vector but a hyperplane, then instead of the vector v^T we take the matrix of basis vectors V^T. The resulting vector (or matrix) is the array of projections of our observations.
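The toy sample only needs a single vector, so here is a sketch of the hyperplane case: a synthetic three-feature sample (generated here as an assumption, not taken from the article) projected onto the two leading eigenvectors with V^T X.

```python
import numpy as np

rng = np.random.default_rng(1)                   # fixed seed for repeatability
t = rng.normal(size=200)
s = rng.normal(size=200)
# Synthetic sample: three correlated features stored in rows
X = np.vstack((t, 2 * t + s, t - s + 0.1 * rng.normal(size=200)))

Xc = X - X.mean(axis=1, keepdims=True)           # center the sample
vals, vecs = np.linalg.eigh(np.cov(Xc))          # ascending eigenvalues, eigenvectors in columns

V = vecs[:, ::-1][:, :2]                         # the two eigenvectors with the largest eigenvalues
Y = V.T @ Xc                                     # projections: shape (2, 200)

cov_Y = np.cov(Y)                                # diagonal, holding the two largest eigenvalues
```

In the new coordinates the features are uncorrelated, and their variances are exactly the eigenvalues that were kept.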
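    _, vecs = np.linalg.eig(covmat)
    v = -vecs[:, 1]               # the larger eigenvector in this run, flipped to match the graph
    Xnew = np.dot(v, Xcentered)   # project every observation onto v
    print(Xnew)

    OUT:
    [ -9.56404107  -9.02021624  -5.52974822  -2.96481262   0.68933859   0.74406645   2.33433492   7.39307974   5.3212742   10.59672425]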
np.dot(X, Y) is the matrix product (this is how vectors and matrices are multiplied in Python). It is easy to see that the projected values match the picture in the previous graph.
Step 5. Data restoration
A projection is convenient to work with: you can build hypotheses and develop models on top of it. But the resulting principal components will not always have a clear meaning that an outsider can understand. Sometimes it is useful to decode, for example, detected outliers to see which observations stand behind them.
This is very simple. We have all the information we need: the coordinates of the basis vectors in the original basis (the vectors we projected onto) and the vector of means (to undo the centering). Take, for example, the largest value, 10.596..., and decode it. To do so, multiply it on the right by the transposed vector and add the vector of means, or in general form for the whole sample (where X is the projected sample): X^T v^T + m
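    n = 9                               # index of the observation to decode
    Xrestored = np.dot(Xnew[n], v) + m  # back to the original axes, then undo the centering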
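The difference from the original observation is small, but it is there: the lost information cannot be recovered. Still, if simplicity matters more than accuracy, the restored value approximates the original one very well.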
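The same restoration can be applied to the whole sample at once, which also shows exactly how much is lost. This sketch regenerates a toy sample with a fixed seed (an assumption, so the numbers differ from the outputs printed in the article) and uses `np.outer` to map every projected value back onto the line.

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed, an assumption for repeatability
x = np.arange(1, 11, dtype=float)
y = 2 * x + rng.normal(scale=2.0, size=10)
X = np.vstack((x, y))                          # one row per feature

m = X.mean(axis=1, keepdims=True)              # vector of means, shape (2, 1)
Xc = X - m
vals, vecs = np.linalg.eigh(np.cov(Xc))        # ascending eigenvalues
v = vecs[:, -1]                                # eigenvector of the largest eigenvalue

Xnew = v @ Xc                                  # projections, shape (10,)
Xrestored = np.outer(v, Xnew) + m              # back in the original 2-D space, shape (2, 10)

residual = X - Xrestored                       # what the projection threw away
lost = (residual ** 2).sum() / (X.shape[1] - 1)
```

The quantity `lost` equals the smaller eigenvalue: the variance along the discarded component is precisely the information that cannot be restored.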
Instead of a conclusion: checking the algorithm
So, we have taken the algorithm apart and shown how it works on a toy example. All that remains is to compare it with the PCA implemented in sklearn, since that is what we will actually use.
    from sklearn.decomposition import PCA

    pca = PCA(n_components=1)
    XPCAreduced = pca.fit_transform(np.transpose(X))  # sklearn expects one row per observation
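The n_components parameter specifies the number of dimensions onto which the projection is made, i.e. how many dimensions we want to reduce the dataset to. In other words, these are the n eigenvectors with the largest eigenvalues. Let's check the result of the dimensionality reduction: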
    print('Our reduced X: \n', Xnew)
    print('Sklearn reduced X: \n', XPCAreduced)

    OUT:
    Our reduced X:
    [ -9.56404106  -9.02021625  -5.52974822  -2.96481262   0.68933859   0.74406645   2.33433492   7.39307974   5.3212742   10.59672425]
    Sklearn reduced X:
    [[ -9.56404106]
     [ -9.02021625]
     [ -5.52974822]
     [ -2.96481262]
     [  0.68933859]
     [  0.74406645]
     [  2.33433492]
     [  7.39307974]
     [  5.3212742 ]
     [ 10.59672425]]
We returned the result as a matrix of column vectors of observations (the more canonical view from the standpoint of linear algebra), whereas PCA in sklearn returns a vertical array.
In principle this is not critical; just note that in linear algebra it is canonical to write matrices in terms of column vectors, whereas in data analysis (and other database-related fields) observations (transactions, records) are usually written as rows.
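The two layouts are easy to mix up, so here is a minimal sketch of moving between them on made-up data. The `rowvar` parameter of `numpy.cov` tells it whether variables are in rows (the default) or in columns.

```python
import numpy as np

# Linear-algebra layout used in this article: one row per feature, one column per observation
X = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [2.2, 3.9, 6.1, 8.0, 9.8]])   # made-up numbers

records = X.T                                # data-analysis layout: one row per observation

cov_features_in_rows = np.cov(X)                       # default: rows are variables
cov_records_in_rows = np.cov(records, rowvar=False)    # columns are variables

wrong = np.cov(records)   # pitfall: treats each of the 5 observations as a variable
```

Both correct calls give the same 2x2 matrix, while the last line silently produces a 5x5 one.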
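Let's check the other parameters of the model as well. The PCA object has a number of attributes that give access to the intermediate variables:
- Vector of means: mean_
- Projection vector (matrix): components_
- Variances of the projection axes: explained_variance_
- Share of information (share of the total variance): explained_variance_ratio_
Note: in the sklearn version used here, explained_variance_ reports the biased variance (dividing by n), whereas the cov() function computes unbiased variances (dividing by n - 1) when building the covariance matrix!
Let's compare the values we obtained with the values from the library.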
    l, _ = np.linalg.eig(covmat)  # eigenvalues of our covariance matrix
    print('Mean vector: ', pca.mean_, m)
    print('Projection: ', pca.components_, v)
    print('Explained variance: ', pca.explained_variance_, l[1])
    print('Explained variance ratio: ', pca.explained_variance_ratio_, l[1] / sum(l))

    OUT:
    Mean vector:  [  5.5         10.31439392] (5.5, 10.314393916)
    Projection:  [[ 0.43774316  0.89910006]] (0.43774316434772387, 0.89910006232167594)
    Explained variance:  [ 41.39455058] 45.9939450918
    Explained variance ratio:  [ 0.99058588] 0.990585881238
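The only difference is in the variances, but as already mentioned, we used the cov() function, which uses the unbiased variance, while the explained_variance_ attribute here returned the biased one. They differ only in that the first divides by (n - 1) when averaging and the second divides by n. It is easy to check that 45.99 ∙ (10 - 1) / 10 = 41.39. (Recent scikit-learn releases also divide by n - 1, in which case the two numbers coincide.)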
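The arithmetic above, and the general relation between the two variance estimates, can be verified in a few lines. The `ddof` argument of `numpy.var` switches between the two divisors.

```python
import numpy as np

n = 10
unbiased = 45.9939450918              # largest eigenvalue of np.cov's matrix (divides by n - 1)
biased = unbiased * (n - 1) / n       # the same variance computed with a divisor of n

# The same relation holds for any sample, for example the values 1..10
data = np.arange(1.0, 11.0)
var_n = data.var(ddof=0)              # divides by n
var_n_minus_1 = data.var(ddof=1)      # divides by n - 1
```

Here `biased` reproduces the 41.39455058 reported in the output above.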
All the other values match, which means our algorithms are equivalent. Finally, note that the attributes of the library algorithm are printed with fewer digits. This is most likely just the default display precision of NumPy arrays rather than a less accurate computation (or I have some glitch).

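Note: the library method automatically projects onto the axes that maximize variance. This is not always rational. For example, in this figure a careless reduction of dimensionality would make classification impossible, whereas projecting onto the smaller vector reduces the dimension successfully and preserves the separation between the classes.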
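The situation in this note can be reproduced with a deliberately constructed example. In the sketch below (the data are invented for illustration) two classes lie on parallel lines: almost all variance runs along the lines, but the classes differ only across them.

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 50)
class_a = np.vstack((t, np.full_like(t, -0.5)))   # class 0 lies on the line y = -0.5
class_b = np.vstack((t, np.full_like(t, 0.5)))    # class 1 lies on the line y = +0.5
X = np.hstack((class_a, class_b))                 # shape (2, 100), one row per feature
labels = np.array([0] * 50 + [1] * 50)

Xc = X - X.mean(axis=1, keepdims=True)
vals, vecs = np.linalg.eigh(np.cov(Xc))           # ascending eigenvalues
major, minor = vecs[:, -1], vecs[:, 0]            # largest and smallest variance directions

on_major = major @ Xc     # projection that keeps the most variance
on_minor = minor @ Xc     # projection onto the "unimportant" axis

a_major, b_major = on_major[labels == 0], on_major[labels == 1]
a_minor, b_minor = on_minor[labels == 0], on_minor[labels == 1]
```

After projecting onto `major` the two classes land on exactly the same values, while on `minor` they are cleanly separated, even though that axis carries under 1% of the variance.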
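So, we have looked at how the PCA algorithm works and at its implementation in sklearn. I hope this article was clear enough for those who are just getting started with data analysis, and at least a little informative for those who already know the algorithm well. An intuitive picture is extremely useful for understanding how a method works, and understanding is very important for tuning the chosen model correctly. Thank you for your attention!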
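P.S. Please do not scold the author for possible inaccuracies. The author is himself in the process of getting to know data analysis and wants to help others like him who are mastering this amazing field of knowledge! Constructive criticism and diverse experience are, however, welcome in every way!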