[@tsafin: Turing Award winner Michael Stonebraker needs no introduction. He and his students at Berkeley and MIT have created most of the relational and non-relational databases of the past few decades. Ingres and Postgres, C-Store and Vertica, H-Store and VoltDB: these are only some of the projects and companies that Michael and his students influenced directly, and there are many more forks and derivatives.
So whether the subject is NoSQL or Hadoop, when he criticizes something the industry should at least listen, and perhaps try to change.
His view of Hadoop, as expressed in articles from 2012 and 2014, is interesting, and so is following how a classic's point of view developed over such a short time.
The first article, "Possible Hadoop Trajectories", published in Communications of the ACM (http://cacm.acm.org/blogs/blog-cacm/149074-possible-hadoop-trajectories/fulltext), was co-written in May 2012 by Stonebraker and Jeremy Kepner, who at the time worked as a senior technical staff member at MIT and as a researcher in the MIT mathematics department and the MIT Computer Science and AI Lab. Written in collaboration, this article seems more balanced and less heated than the second one, which Stonebraker wrote two years later (and in my opinion the first article is also written in the better style). The context has changed considerably over the past few years, and it would be wrong for the Hadoop/HDFS ecosystem to leave that unnoticed.
On the whole, the criticism of Hadoop in the first article refers only to the implementation of the MapReduce API, and in the years since then the Hadoop industry has done a great deal to solve those problems. Even so, it has not come any closer to solving the problems of HPC applications described in the first article.]
Possible Hadoop Trajectories
Michael Stonebraker, Jeremy Kepner, May 2, 2012
Over the past several years Hadoop has become the parallel computing platform for Java. In doing so it has fully achieved the goal of bringing parallel computing into the practice of millions of programmers in the Java community. All earlier attempts to do this (Java Grande, JavaHPC) had little success, and Hadoop deserves congratulations on this achievement, which it owes mainly to the simplicity and accessibility of the environment it created.
Nevertheless, in scientific use at laboratories such as Lincoln Lab, where at least one of us works, many improvements are needed before Hadoop can be used seriously in production installations. For the most part, Hadoop is used in our environment for parallel computing (scientific analysis tools, information fusion) and for deploying data sets.
Let us look at these two use cases in more detail.
Scientific computing in Hadoop
Code that performs scientific computation often organizes its nodes into a two-dimensional (or three-dimensional, or N-dimensional) rectangular partitioning grid. Each node then executes something like the following:
Until end condition {
  Local computation on the local data partition
  Output of the [changed] state
  Send and receive data to and from the subset of other nodes that hold other data partitions
}
This template describes most computational fluid dynamics (CFD) algorithms, all atmospheric and ocean simulation models, linear algebra operations, sparse graph operations, image processing, and signal processing. When this class of problems is considered in Hadoop, the following issues arise:
Local computation almost always works with state that persists from one iteration to the next. Saving state between MapReduce steps requires writing to the file system, which is often very expensive. In addition, such algorithms often need direct communication between nodes, which the MapReduce infrastructure does not support. Such algorithms also often bind code to the same grid node across different iterations of the computation.
Again, MapReduce does not directly support such a model.
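As an illustration of the template above, here is a minimal single-process sketch in Python. It is not taken from the article: the `Partition` class, the three-point averaging step, and the convergence tolerance are all assumptions chosen for the example. Each `Partition` stands in for one grid node, its `values` are the state that must survive between iterations, and `exchange` plays the role of the direct node-to-node communication that the authors say MapReduce lacks.

```python
# Single-process simulation of the "local compute, then exchange" template.
# All names here are illustrative; this is not Hadoop or MPI code.

class Partition:
    def __init__(self, values):
        self.values = list(values)  # local state that persists across iterations
        self.left_halo = None       # edge value received from the left neighbour
        self.right_halo = None      # edge value received from the right neighbour

    def compute(self):
        """Local computation: replace each cell by the mean of itself and its neighbours."""
        padded = [self.left_halo] + self.values + [self.right_halo]
        self.values = [
            (padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)
        ]


def exchange(partitions):
    """Node-to-node step: every partition hands its edge cells to its direct neighbours."""
    for i, part in enumerate(partitions):
        # At the outer boundaries a partition simply reuses its own edge value.
        part.left_halo = partitions[i - 1].values[-1] if i > 0 else part.values[0]
        part.right_halo = (
            partitions[i + 1].values[0] if i < len(partitions) - 1 else part.values[-1]
        )


def run(partitions, tolerance=1e-6, max_iterations=10_000):
    """Until end condition: exchange, compute, then check how much the state changed."""
    for iteration in range(1, max_iterations + 1):
        before = [v for p in partitions for v in p.values]
        exchange(partitions)
        for part in partitions:
            part.compute()
        after = [v for p in partitions for v in p.values]
        if max(abs(a - b) for a, b in zip(after, before)) < tolerance:
            return iteration
    return max_iterations


grid = [Partition([0.0, 0.0, 0.0]), Partition([9.0, 9.0, 9.0])]
iterations = run(grid)
print(iterations, [round(v, 3) for p in grid for v in p.values])
```

In this sketch the state simply stays in memory between iterations. Expressing the same loop as a chain of MapReduce jobs would mean writing `values` to the file system after every step and reading it back, which is the cost the paragraph above describes.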
MapReduce is estimated to work well for only 5% of Lincoln Lab users. The remaining 95% have to force their algorithms into the harsh confines of the MapReduce model and pay for it with a slowdown of one to two orders of magnitude. Few scientists would agree to such a price.
Many people might argue that performance does not matter in the long run [@tsafin: WTF?]. That may be true at the low end, but at the data volumes seen both at Lincoln Lab and at the other laboratories we know, performance matters a great deal, and there are never enough computing resources. For example, our organization is investing $100 million in a next-generation supercomputer center located next to a hydroelectric dam in order to cut its carbon dioxide emissions by an order of magnitude. The performance loss that comes with a Hadoop implementation is an unacceptable additional cost.
Even at small volumes, using a system as inefficient as Hadoop is not a very environmentally friendly step, because in most cases it wastes energy. In short, where Hadoop is used in [scientific] computing environments, we have observed the following steps:
- Step 1. Try Hadoop in a pilot project.
- Step 2. Scale Hadoop up for production use.
- Step 3. Hit the wall because of the problems described above.
- Step 4. Change the form of the solution to work around the limitations.
At Lincoln Lab there are projects at each of these four stages.
For Hadoop to survive in our environment, its parallel computing model needs a major revision, ideally complemented by the recent Hadoop work on changing the task scheduler. We expect that once these problems are solved, today's Hadoop will be unrecognizable in the systems of the future.
In other shops, the mix of tasks that matters to users may be more compatible with the MapReduce infrastructure. Nevertheless, our experience suggests that we are the norm rather than the exception. Google's move from MapReduce to other models supports this suspicion. We therefore expect radical changes to the Hadoop infrastructure.
Data management in Hadoop
Forty years of DBMS research and practice in industry have confirmed the position of Ted Codd's 1970 paper: programmer and system efficiency is generally higher when high-level data manipulation operations are used in a high-level language, and [lower] when the work has to be done in a language that handles one record at a time. Hadoop is considerably higher level than record-at-a-time languages, but it is still easier to code a data request in Hive than directly in MapReduce. It therefore seems likely that all Hadoop data management tools will move to high-level languages such as SQL or SQL-like languages.
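The difference in level can be made concrete with a small Python sketch. It is an illustration only: it uses the standard-library `sqlite3` module as a stand-in for a SQL-like engine such as Hive, plain Python functions as a stand-in for hand-written map and reduce phases, and an invented five-row data set. None of it is Hadoop or Hive code.

```python
import sqlite3
from collections import defaultdict

# Hypothetical page-view log as (country, bytes_sent) pairs; the data is invented.
rows = [("US", 120), ("DE", 80), ("US", 200), ("FR", 50), ("DE", 20)]

# --- Record-at-a-time style: hand-written map, shuffle, and reduce phases ---
def map_phase(records):
    for country, nbytes in records:
        yield country, nbytes          # emit one (key, value) pair per record

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)      # group all values that share a key
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

mapreduce_totals = reduce_phase(shuffle(map_phase(rows)))

# --- Declarative style: the same request as a single SQL statement ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (country TEXT, bytes_sent INTEGER)")
conn.executemany("INSERT INTO views VALUES (?, ?)", rows)
sql_totals = dict(
    conn.execute("SELECT country, SUM(bytes_sent) FROM views GROUP BY country")
)
conn.close()

print(mapreduce_totals == sql_totals)
```

Both halves compute the same per-country totals. The declarative version states only what is wanted and leaves grouping and execution strategy to the engine, which is the property the paragraph above attributes to high-level languages.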
For example, according to David DeWitt's report [1-1], the Hadoop clusters at Facebook are programmed almost entirely in Hive, a high-level data access language that closely resembles SQL. Lincoln Lab has chosen a different high-level language (not Hive), one with a fairly high-level algebraic interface for accessing sparse data, but the trajectory is very similar [1-2, 1-3].
MapReduce therefore appears to be turning into an internal interface [encapsulated inside] a DBMS.
In other words, Hive users do not care much about what lies inside a HiveQL query; the MapReduce interface becomes invisible and sinks into the depths of the DBMS. After all, how many people care which protocol a parallel DBMS uses to communicate over the network between processors on different nodes?
Incidentally, one of the authors of this article has built five parallel DBMSs and is, of course, well versed in the communication protocols between a query coordinator and multiple executors on different nodes. He also knows that executor nodes need to interact with other nodes in order to pass intermediate data to one another. In such a scenario, building a high-performance system requires the following system characteristics:
- Executor nodes must be able to keep state so that data can be reused between iterations of a distributed query plan.
- Communication between nodes must be supported.
- It must be possible to bind query processing to a node's local data.
In general, a DBMS needs the same set of conditions as the scientific computing algorithms described earlier. As a result, we end up with MapReduce as the internal interface of a parallel DBMS, even though MapReduce is not a very suitable interface for a DBMS.
In 2009 one of us wrote a paper comparing parallel DBMS technology with Hadoop [1-4]. Roughly speaking, the DBMSs were one to two orders of magnitude faster than Hadoop. The advantage comes from using indexes on the data, from sending the request only to the nodes where the data resides (rather than the other way around), from the benefits of compression, and from optimized protocols between nodes. As far as we know, the situation in 2012 has not changed much since 2009: by unofficial estimates, Hadoop is still one to two orders of magnitude slower. For example, one large web project runs a 5-petabyte Hadoop cluster deployed on 2,700 nodes, while in another case a similar 5-petabyte installation is managed by a commercial DBMS on only 200 nodes.
Note that this is 13 times fewer nodes.
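The arithmetic behind that comparison can be checked with a few lines of Python. The node counts and the 5-petabyte figure come from the text; the assumption of decimal units (1 PB = 1,000 TB) and the idea of comparing data held per node are added here for illustration.

```python
PETABYTE_IN_TERABYTES = 1000  # assumption: decimal units

def terabytes_per_node(petabytes, nodes):
    """Average amount of data each node holds, in terabytes."""
    return petabytes * PETABYTE_IN_TERABYTES / nodes

data_pb = 5
hadoop_nodes = 2700   # Hadoop cluster size quoted in the text
dbms_nodes = 200      # commercial DBMS installation quoted in the text

node_ratio = hadoop_nodes / dbms_nodes
hadoop_density = terabytes_per_node(data_pb, hadoop_nodes)
dbms_density = terabytes_per_node(data_pb, dbms_nodes)

print(f"node ratio: {node_ratio:.1f}x")
print(f"Hadoop: {hadoop_density:.2f} TB per node")
print(f"DBMS:   {dbms_density:.2f} TB per node")
```

The exact ratio is 13.5, which the text rounds down to 13. Put differently, each DBMS node in this example holds about 25 TB, against a little under 2 TB per Hadoop node.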
In short, at present we observe the following trajectory when data is managed through Hadoop:
- Step 1. Try Hadoop in a pilot project.
- Step 2. Scale Hadoop up for production use.
- Step 3. Run into unacceptable performance overhead.
- Step 4. Change the form of the solution by using a parallel DBMS.
Most Hadoop installations are currently somewhere between steps 2 and 3, so the "hit the wall" stage is their next step. Either Hadoop grows into a real parallel DBMS within a limited time, or users will switch to other solutions: replacing parts of their Hadoop solution, using interfaces that expose external data to Hadoop, or finding some other way. Given how little progress has been made over the past three years, our money is on the second outcome [switching to other solutions].
Finally, consider the well-known "hype cycle" curve from Gartner Group [1-5] [@tsafin: in Russian this is rendered as the "technology maturity curve"]. The current state of the Hadoop ecosystem is being presented as "the best solution since the invention of bread and butter". We hope that over time it will shed the limitations mentioned here and come a little closer to what has been promised.
References
Hadoop at a Crossroads
Michael Stonebraker, August 5, 2014
[@tsafin: the second article was written two years later, in August 2014, and was published in the same journal, Communications of the ACM: http://cacm.acm.org/blogs/blog-cacm/177467-hadoop-at-a-crossroads/fulltext]
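A lot of water has flowed under the bridge since my earlier joint article with Jeremy Kepner, written in 2012 [2-1]. I think I need not only to note several important announcements, but also to point out some facts that have emerged and some opinions that have formed. I will then close the article with a prediction of where the Hadoop stack is heading.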
The first announcement worth mentioning is the release of a new DBMS, Cloudera Impala [2-2], which runs on top of HDFS. Put simply, Impala is built like every other shared-nothing parallel SQL DBMS [@tsafin: it is not yet clear what the established translation of the term SN is; I recommend staying with Sergey Kuznetsov's version, "architecture without shared resources"] serving the data warehouse market. What is especially notable is that the MapReduce layer has been removed, and removed deliberately. As many of us have pointed out for years, MapReduce is not the best internal interface for a SQL (or Hive) DBMS [2-3, 2-4]. Impala was built by developers who know this. In fact, Impala-like efforts are already under way at both HortonWorks and Facebook. This puts Hadoop vendors in a dilemma. Historically, "Hadoop" was the open-source implementation of MapReduce created at Yahoo. Impala, however, has thrown that layer out of the solution stack.
How can you remain a Hadoop vendor when Hadoop is no longer the core of the stack?
The answer is simple: redefine what "Hadoop" means, and that is what the Hadoop vendors eventually did. "Hadoop" has come to mean the entire stack. At the bottom is HDFS, and on top of it run Impala, MapReduce, or other systems. Higher-level solutions such as Mahout run on these systems in turn. The term "Hadoop" now refers to the whole resulting collection of solutions.
Another recent announcement, from Google, says that MapReduce is already "a thing of the past" and that the company has moved on to building its systems on top of Dremel, BigTable, and F1/Spanner, which suit its solutions better [2-5]. Google must be laughing out loud now. It invented MapReduce in 2004 to support the crawler for its search engine, but replaced MapReduce with a BigTable implementation several years ago. It needed an interactive storage system, and MapReduce worked only in batch mode. So we know that the main driving force behind MapReduce abandoned it some time ago. And now Google reports that it will not need MapReduce in the future.
It is indeed ironic that Hadoop chose to support this paradigm five years after Google abandoned it and moved on. It turns out that the rest of the world has followed Google into Hadoop with a considerable delay of about ten years.
Google abandoned it long ago. How long will it take the rest of the world to realize this?
Finally, it turns out that Hadoop solution providers are on a collision course with data warehouse providers. They are now implementing (or have already implemented) essentially the same architecture as the data warehouse providers. Once they have spent a few years hardening those implementations, they will deliver adequate performance. Note that most data warehouse providers already support HDFS today, and many offer implementations for semi-structured data. I am therefore convinced that the data warehouse market and the Hadoop market will soon merge. And may the best system win this face-to-face street battle!
Next, let us look at HDFS, which has become one of the main components of the Hadoop stack. HDFS is primarily a file system that can store bytes of data, which is par for the course on any computing platform. There are two views of where HDFS might go in the future.
Seen through the eyes of the file system world, users want a common distributed file system, and from this point of view HDFS is an ideal candidate.
Seen from the point of view of a parallel SQL/Hive DBMS, HDFS faces "a fate worse than death". A DBMS always and everywhere wants to send the request (a few kilobytes) to the data (many gigabytes), and not the other way around. Hiding the location of the data from the DBMS engine is therefore nearly fatal, and a DBMS always finds such a restriction very hard to work around. All parallel DBMSs, from the data warehouse vendors to the Hadoop vendors, will turn off location transparency, which turns HDFS into a mere collection of Linux file systems, one per node.
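A back-of-the-envelope model in Python shows why the direction of shipping matters. Every number below is a made-up assumption (the 4 KB request, the 1 MB partial result per node, the 10 GB of data per node, and the 200-node cluster), and the two cost functions are a deliberately crude sketch rather than a description of any real DBMS.

```python
KB, MB, GB = 10**3, 10**6, 10**9  # decimal units

def bytes_moved_query_shipping(nodes, query_bytes, result_bytes_per_node):
    """The coordinator sends the request to every node and gets a partial result back."""
    return nodes * (query_bytes + result_bytes_per_node)

def bytes_moved_data_shipping(nodes, data_bytes_per_node):
    """Every node's data is pulled across the network to wherever the request runs."""
    return nodes * data_bytes_per_node

nodes = 200                                   # hypothetical cluster size
query_shipping = bytes_moved_query_shipping(nodes, 4 * KB, 1 * MB)
data_shipping = bytes_moved_data_shipping(nodes, 10 * GB)

print(f"request sent to data: {query_shipping / MB:.1f} MB over the network")
print(f"data sent to request: {data_shipping / GB:.1f} GB over the network")
print(f"ratio: {data_shipping / query_shipping:.0f}x")
```

Under these assumptions, sending the request to the data moves roughly four orders of magnitude less traffic. A DBMS can only make that choice if it knows which node holds which data, which is why location transparency gets in its way.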
Similarly, no DBMS uses file system replication.
A detailed discussion of this subject can be found in [2-6]. In short, a DBMS prefers a replication system that it controls itself, for the sake of load balancing and of optimizing query and transaction processing.
If over time the DBMS vendors' point of view prevails in the market, HDFS will be left without users. DBMS vendors will stop using it. In their world every node already has a local file system, and a parallel DBMS supports a high-level query language along with many tools and extensions defined through user-defined functions. In this scenario Hadoop turns into a standard DBMS with a shared-nothing architecture, with numerous alternative DBMS vendors fighting for your money.
If, on the other hand, the file system point of view prevails, HDFS will survive, with a variety of tools running on top of the file system. The standard features of a DBMS environment, such as load balancing, auditing, resource governors, data independence, data integrity, high availability, concurrency control, and quality of service, will all reach file system users only later. In this scenario there is no standard high-level interface. In other words, the DBMS view of the world offers a wide range of additional useful services, and users are warned in advance to be very careful when they invoke low-level interfaces.
In either scenario the file system is the common part, and Hadoop vendors will sell tools based on the file system, tools based on a DBMS, or both. As a result, they join the host of software vendors that sell software and services. And may the best product win!
References