When HighLoad++ attendees came to Alexander Krasheninnikov's talk, they expected to hear about processing 1,600,000 events per second. That expectation was not met: while the talk was being prepared, the figure climbed to 1,800,000, so at HighLoad++ reality exceeded expectations.
Three years ago, Alexander described how Badoo built a scalable, near-real-time event processing system. Since then the system has evolved, volumes have grown, scaling and fault-tolerance problems had to be solved, and at some point a radical measure became necessary: a change of technology stack.

From this transcript you will learn how Badoo replaced the Spark + Hadoop bundle with ClickHouse, cut hardware threefold while increasing load sixfold, why and how to collect statistics in a project, and what to do with the data afterwards.
About the speaker: Alexander Krasheninnikov (alexkrash) is Head of Data Engineering at Badoo. He works on BI infrastructure and on scaling it to the workload, and manages the teams that build the data processing infrastructure. He loves everything distributed: Hadoop, Spark, ClickHouse. He is convinced that great distributed systems can be built from open source.
Collecting statistics
Without data we are blind and cannot manage the project. We need statistics to monitor the project's viability. As engineers, we should strive to improve our products, and if you want to improve something, measure it. That is my motto at work. Our goal, first and foremost, is business value. Statistics answer business questions. Technical metrics are technical metrics, but the business is interested in indicators too, and those have to be taken into account as well.
Statistics lifecycle
I define the statistics lifecycle as four phases (Define, Collect, Process, and Present), and we will discuss each one separately.

Define phase: formalization
We collect several kinds of metrics in the application. First, business metrics: if you run a photo service, for example, you want to know how many photos are uploaded per day, per hour, per second. Next come "semi-technical" metrics: responsiveness of the mobile app or site, API performance, how quickly users interact with the site, app installs, UX. The third important kind is user behaviour tracking, which is what systems such as Google Analytics and Yandex.Metrica do. We have our own tracking system, and we invest heavily in it.
Many people work with statistics: developers and business analysts. Everyone has to speak the same language, so we need to agree on terms.
We could agree verbally, but it is much better to do it formally, with a clear structure for every event. Once the structure of business events is formalized, when a developer reports the number of registrations, the analyst knows the data includes not just the total but also the country, gender, and other parameters. All of this information is formalized and available to everyone in the company. Each event has a typed structure and a formal description. We store these descriptions in Protocol Buffers format, for example.
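Description of the Registration event:
// Allowed values for the user's gender.
enum Gender {
    FEMALE = 1;
    MALE = 2;
}

// One Registration event: who registered, when, and from which country.
message Registration {
    required int32 user_id = 1;
    required Gender user_gender = 2;
    required int32 time = 3;
    required int32 country_id = 4;
}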
The Registration event carries the user, the gender field, the time of the event, and the user's country of registration. Analysts have access to this information, and from then on the business understands what we collect.
Why is a formal description needed?
A formal description gives developers, analysts, and the product department a common reference. The information also carries through into the description of the application's business logic. For example, we have an internal system for describing business processes, and it includes a screen for a new feature.
The product requirements document contains a section stating that when the user interacts with the application in a given way, we must send an event with exactly these parameters. Later we can verify how the feature performs and that we measure it correctly. A formal description also tells us right away how to store the data in a database, whether NoSQL, SQL, or something else.
We have a data schema, and that is great.
Some analytics systems offered as a service keep only 10 to 15 events in their storage. We have more than 1,000, and the number keeps growing. Living without a single registry is impossible.
Define phase summary
We decided that statistics matter and described a subject area for them. That is a good start.
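Collect phase: data collection
We decided to build the system so that when a business event occurs (a registration, a sent message, and so on), we save that information and, at the same moment, send a separate statistics event.
In code, the statistics event is sent together with the business event.
The data flows through a separate processing pipeline, so it is handled completely independently of the data stores the application runs on. The description in EDL: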
enum Gender {
    FEMALE = 1;
    MALE = 2;
}

message Registration {
    required int32 user_id = 1;
    required Gender user_gender = 2;
    required int32 time = 3;
    required int32 country_id = 4;
}
That is the description of the Registration event. An API is generated from it automatically; developers can call it from their code and send statistics in four lines.
EDL-based API:
\EDL\Event\Registration::create()
    ->setUserId(100500)
    ->setGender(Gender::MALE)
    ->setTime(time())
    ->send();
Event delivery
This is an external system. We do it this way because we have services that provide an API for working with photo data, and others like them. They all keep their data in fashionable new databases such as Aerospike and CockroachDB.
When you need to build a report, you should not have to go around begging: "Guys, how many of this do you have, and how many of that?" All the data is sent in a separate flow. The processing pipeline is an external system: we detach the data from the application context and the business logic repositories and send it to a separate pipeline.
The Collect phase assumes the application servers are available. In our case that is PHP.

Transport
This is the subsystem that moves what we produced in the application context into the other pipeline. Choose the transport solely from your requirements, depending on the situation in your project.
A transport has several characteristics. The first is the delivery guarantee: at-least-once or exactly-once. Choose it for your statistics according to how important the data is. For a billing system, for example, it is unacceptable for the statistics to show more transactions than actually happened. This is money; that cannot be allowed.
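To make the trade-off concrete, here is a minimal Python sketch of what at-least-once delivery implies for a consumer that must not over-count. The `Deduplicator` class and the event IDs are hypothetical and exist only for illustration; they are not part of any transport discussed in the talk.

```python
class Deduplicator:
    """Hypothetical helper: remembers event IDs it has already accepted."""

    def __init__(self):
        self._seen = set()

    def accept(self, event_id):
        # Return True only the first time a given ID arrives.
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


def count_transactions(delivered_ids):
    """Count unique transactions in a stream that may contain redeliveries."""
    dedup = Deduplicator()
    return sum(1 for event_id in delivered_ids if dedup.accept(event_id))


# An at-least-once transport may redeliver "tx-2" after a retry.
delivered = ["tx-1", "tx-2", "tx-2", "tx-3"]
naive_count = len(delivered)                 # counts the duplicate
exact_count = count_transactions(delivered)  # ignores the duplicate
```

Without de-duplication the statistics would report one transaction more than really happened, which is exactly the situation a billing system cannot accept.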
The second parameter is bindings for programming languages. You have to interact with the transport somehow, so you choose it to suit the language the project is written in.
The third parameter is scalability. Since we are talking about millions of events per second, it pays to keep future scalability in mind.
There are many transport options: RDBMS applications, Flume, Kafka, or LSD. We use LSD; that is our own special way.
Live Streaming Daemon
LSD has nothing to do with prohibited substances. It is a lively, very fast streaming daemon that does not provide an agent for writing to it. We can tune it, and it integrates with other systems such as HDFS and Kafka, so we can relay the data it receives. With LSD, an INSERT involves no network call, and you can control the network topology.
Most importantly, it is open source from Badoo, so there is no reason not to trust it.
If it were a perfect daemon, every conference would be discussing LSD instead of Kafka, but every LSD has its fly in the ointment. We have limitations that we can live with: LSD has no replication support, and its delivery guarantee is at-least-once. It is also not the best transport for monetary transactions, but in general you should talk to money only through "acidic" databases, that is, ones that support ACID.
Collect phase summary
From the previous steps we obtained a formal description of the data, generated from it a convenient API for developers to send events, and worked out how to move that data from the application context into a separate pipeline. Not bad so far, and we move on to the next phase.
Process phase: data processing
We have collected data about registrations, uploaded photos, votes. What do we do with it all? We want two things from this data: charts with a long history, and raw data. Everyone understands charts: you do not need to be a developer to see from a curve that the company's revenue is growing. We use raw data for online reports and ad hoc queries. In more complex cases, analysts want to run analytical queries against it. We need both capabilities.
Charts
Charts come in many forms.

Or, for example, a chart with history that shows data for 10 years.

Charts can even look like this.

This is the result of an A/B test, and it looks surprisingly like the Chrysler Building in New York.
There are two ways to draw a chart: a query over raw data, or a time series. Both approaches have drawbacks and advantages, which we will not go into here.
We use a hybrid approach: we keep a short tail of raw data for operational reporting, and time series for long-term storage. The latter is computed from the former.
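As a rough illustration of computing a time series from raw data, the Python sketch below rolls raw events up into fixed-size time buckets. The five-minute bucket, the event names, and the `to_time_series` function are assumptions made for the example, not details from the talk.

```python
from collections import Counter


def to_time_series(raw_events, bucket_seconds=300):
    """Roll raw (timestamp, event_name) pairs up into per-bucket counts."""
    series = Counter()
    for timestamp, name in raw_events:
        # Align the timestamp to the start of its bucket.
        bucket_start = timestamp - timestamp % bucket_seconds
        series[(name, bucket_start)] += 1
    return dict(series)


# Hypothetical raw tail: (unix timestamp, event name).
raw = [
    (1000, "registration"),
    (1100, "registration"),
    (1300, "registration"),
    (1250, "photo_upload"),
]
series = to_time_series(raw)
```

Once the buckets are stored, the raw tail can be kept short, because long-range charts only need the compact per-bucket values.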
How we grew to 1.8 million events per second
It is a long story: millions of RPS do not appear in a day. Badoo is a company with ten years of history, and the data processing system grew along with the company.

At first we had nothing. We started collecting data and got 5,000 events per second. One MySQL host and nothing else! Any relational DBMS copes with this task comfortably, and you get transactions: insert data, run queries, everything works. We lived like that for a while.
At some point functional sharding happened: registration data went here, photos went there. That carried us to 200,000 events per second, and we started combining approaches: storing aggregated data instead of raw data, though still inside a relational database. We stored counters, but in most relational databases you cannot then run a DISTINCT query over such data, because the algebraic model of counters does not allow DISTINCT to be computed.
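The limitation is easy to reproduce. In the Python sketch below (user names are made up for the example), pre-aggregated daily counters can be added together, but the sum is not the number of distinct users across both days.

```python
# Hypothetical users active on two consecutive days.
day1_users = {"alice", "bob"}
day2_users = {"bob", "carol"}

# Pre-aggregated counters keep only the numbers, not who was counted.
day1_counter = len(day1_users)
day2_counter = len(day2_users)

# Adding counters counts "bob" twice.
summed = day1_counter + day2_counter

# The true distinct count needs the underlying identities.
true_distinct = len(day1_users | day2_users)
```

Here `summed` over-counts because a plain counter has forgotten which users it saw.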
Badoo's motto is "Unstoppable force". We had no intention of stopping, and kept growing.
When we crossed the 200,000 events per second threshold, we decided to create the formal description discussed above. Before that there was chaos; now we have a structured registry of events. We began scaling the system by connecting Hadoop, and all the data landed in Hive tables.
Hadoop is a huge software suite and a file system. For distributed computing, Hadoop says: "Put your data here, and I will let you run analytical queries on it."
We set up a regular recalculation of all charts, and it worked well. But charts are valuable when they update promptly; watching a chart refresh once a day is not much fun. If we roll out something that causes a fatal error in production, we want to see the chart drop immediately, not a day later. So after a while the whole system started to degrade. At that stage, though, we realised we could stick with the chosen technology stack.
Java was new to us, we liked it, and we understood that things could be done differently.
Between 400,000 and 800,000 events per second, we replaced Hadoop in its purest form, along with Hive as the executor of analytical queries, with Spark Streaming, and wrote a generic map/reduce with incremental calculation of metrics. Three years ago I talked about how we did it. Back then it seemed that Spark would live forever, but life decided otherwise: we ran into Hadoop's limits. Under other conditions we might have kept living with Hadoop.
Besides computing charts on Hadoop, there was another problem: incredible "four-storey" SQL queries run by analysts, and the charts did not update quickly. Operational data processing takes quite delicate work to make it real-time, fast, and cool.
Badoo is served by two data centres on opposite sides of the Atlantic, in Europe and North America. To build unified reports, data has to be sent from America to Europe. All statistics are kept in the European data centre because it has more computing power. The round trip between the data centres is about 200 ms, and the network is rather delicate: a request to the other DC is not the same as a request to the next rack.
When we started formalizing events and got developers and product managers involved, everybody liked it, and the number of events grew explosively. It was time to buy hardware for the cluster, but we really did not want to.
When we passed the 800,000 events per second peak, we learned that Yandex had released ClickHouse as open source, and decided to try it. We collected a cartload of bumps along the way, and when everything finally worked, we held a small banquet for the first million events. The talk could probably end right here: take ClickHouse and live with it.
But that would not be interesting, so we continue with data processing.
ClickHouse
ClickHouse has been the hype of the last two years and needs no introduction: at HighLoad++ 2018 alone, I remember about five talks on it, plus workshops and meetups.
The tool is designed for exactly the tasks we set ourselves. It has real-time updates and the features we had once got from Hadoop: replication and sharding. There was no reason not to try ClickHouse, because we understood that our Hadoop implementation had already hit bottom. The tool is cool, and the documentation is excellent (I have contributed to it myself); I really like everything about it. But we had a number of questions to answer.
How do we move the whole event flow to ClickHouse? How do we combine data from two data centres? Going to the admins and saying "Guys, let's install ClickHouse" does not make the network twice as thick or the latency half as long. The network is still as thin and small as a first pay cheque.
How do we store results? On Hadoop we knew how to draw charts, but how do we draw them on magical ClickHouse? The magic wand is not included. How do we deliver results to the time series storage?
As my lecturer at the institute used to say, consider three data schemas: strategic, logical, and physical.
Strategic storage schema
We have two data centres. We learned that ClickHouse knows nothing about DCs, and simply brought up a cluster in each DC. Data no longer crosses the transatlantic cable: everything that originates in a DC is stored locally in that DC's cluster. When we need a query over the combined data, for example to find out how many registrations there were in both DCs, ClickHouse makes that possible. Low latency and availability for queries: a masterpiece!

Physical storage schema
Again, questions: how does our data fit ClickHouse's relational model, and what must we do to keep replication and sharding? The ClickHouse documentation describes all of this in detail, and anyone with more than one server will come across that article. So we will not go deeper into what the manual already covers: replication, sharding, and queries over all the data across shards.
Logical storage schema
The logical schema is the most interesting one. We process heterogeneous events in a single pipeline. That means a stream of heterogeneous events: registrations, votes, photo uploads, technical metrics, user behaviour tracking. All of these events have completely different attributes. For example, if I viewed a screen on a mobile phone, we need the screen ID; if I voted for someone, we need to know whether the vote was for or against. The events have different attributes and drive different charts, yet all of them must be processed in one pipeline. How do we fit that into the ClickHouse model?
Approach No. 1: a table per event. We extrapolated this first approach from our experience with MySQL and created a table in ClickHouse for each event. It sounds logical enough, but we ran into a number of difficulties.
Nothing prevents an event from changing its structure when today's build is released. Any developer can make that patch, and the schema is mutable in every direction. The only required fields are the event timestamp and the event type. Everything else changes on the fly, so the tables have to be altered accordingly. ClickHouse can run ALTER on a cluster, but it is a delicate procedure that is hard to automate so that it runs smoothly. That is a minus.
We have more than a thousand different events, which means a high INSERT rate per machine: we would be writing constantly into a thousand tables. For ClickHouse that is an anti-pattern. If Pepsi's slogan is "Live in big sips", ClickHouse's is "Live in big batches". If you do not, replication chokes and ClickHouse refuses new inserts. An unpleasant scheme.
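The "big batches" advice can be sketched as a small client-side buffer. This Python example is only an illustration under assumptions: `BatchingWriter`, its `flush` callback, and the tiny `max_rows` value are hypothetical, and a real writer would hand each batch to a database client instead of a list.

```python
class BatchingWriter:
    """Hypothetical buffer that turns many small writes into few large inserts."""

    def __init__(self, flush, max_rows=10000):
        self._flush = flush        # callable that receives one batch (a list of rows)
        self._max_rows = max_rows  # batch size threshold
        self._buffer = []

    def write(self, row):
        self._buffer.append(row)
        if len(self._buffer) >= self._max_rows:
            self.flush()

    def flush(self):
        # Send whatever is buffered as a single insert, then start a new buffer.
        if self._buffer:
            self._flush(self._buffer)
            self._buffer = []


# Record each "insert" in a list so the batching is visible.
inserts = []
writer = BatchingWriter(flush=inserts.append, max_rows=3)
for i in range(7):
    writer.write({"event_id": i})
writer.flush()  # push the final partial batch
batch_sizes = [len(batch) for batch in inserts]
```

Seven rows result in three inserts instead of seven; with a realistic threshold the reduction is far larger. A production version would typically also flush on a timer so that rows do not wait indefinitely.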
Approach No. 2: a wide table. Like the Siberian men in the joke who fed a rail to the chainsaw, we tried a different data model. We make a table with a thousand columns, where every event has columns reserved for its data. The result is a huge sparse table. Fortunately this never left the development environment, because the very first inserts made it clear the scheme was hopeless.
Still, we wanted to use such a cool software product; a little more polishing and it would be what we needed.
Approach No. 3: a generic table. ClickHouse supports non-scalar data types, so we have one huge table that stores the data in arrays. That is, we add a column that holds the attribute names and a separate array column that holds the attribute values.

ClickHouse handles this task very well. If we only needed to insert data, we could probably squeeze ten times more out of the current installation.
The fly in the ointment is that storing arrays of strings is also an anti-pattern for ClickHouse. Arrays of strings take more disk space because they compress worse than simple columns, and they are harder to process. For our task, though, we turn a blind eye to that, because the advantages outweigh it.
How do we SELECT from such a table? Say the task is to count registrations grouped by gender. First we find the position of the gender attribute in one array, then use that index to fetch the value from the other array.
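The two-step lookup can be mimicked in plain Python. The sketch below is an illustration only: the row layout and the `names` / `values` keys are assumptions standing in for the two array columns, not the actual Badoo table definition. In ClickHouse itself the position would typically be found with the `indexOf` array function.

```python
from collections import Counter

# Hypothetical rows of the generic table: parallel arrays of names and values.
rows = [
    {"event": "registration", "names": ["user_id", "gender", "country_id"], "values": ["1", "MALE", "47"]},
    {"event": "registration", "names": ["gender", "user_id"], "values": ["FEMALE", "2"]},
    {"event": "vote", "names": ["user_id", "vote"], "values": ["1", "yes"]},
    {"event": "registration", "names": ["user_id", "gender"], "values": ["3", "MALE"]},
]


def attribute(row, name):
    """Find the attribute's position in `names`, then read `values` at that position."""
    try:
        index = row["names"].index(name)
    except ValueError:
        return None  # this event does not carry the attribute
    return row["values"][index]


def registrations_by_gender(rows):
    """Count registration events grouped by the value of the gender attribute."""
    counts = Counter()
    for row in rows:
        if row["event"] == "registration":
            counts[attribute(row, "gender")] += 1
    return dict(counts)


by_gender = registrations_by_gender(rows)
```

Note that the attribute order may differ from row to row, which is why the position has to be looked up per row rather than assumed.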

How to draw charts from this data
Because every event is described and has a strict structure, we generate a four-storey SQL query for each event type, run it, and save the result in another table.
The problem is that drawing two adjacent points on a chart means scanning the whole table. Example: we look at registrations per day. The event spans from the topmost row to the second-to-last one. We scan once, fine. Five minutes later we want a new point on the chart, so we again scan a range that overlaps the previous scan, and so on for every event. It sounds logical, but it does not look healthy.
In addition, once we have picked up some rows, we also need aggregated results. For example, given the fact that some person registered in Scandinavia and was male, we need summary statistics: how many registrations, how many men, how many of them Norwegian. Analytical databases call this ROLLUP, CUBE, and GROUPING SETS: turning one row into several.
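"Turning one row into several" can be shown with a CUBE-style expansion in Python. This is a sketch under assumptions: the `cube` function, the `"*"` marker for "all values", and the sample fact are invented for illustration and do not reproduce any database's implementation.

```python
from itertools import combinations

ALL = "*"  # marker meaning "aggregated over every value of this dimension"


def cube(row, dimensions):
    """Expand one fact into a key for every combination of kept and collapsed dimensions."""
    expanded = []
    for size in range(len(dimensions) + 1):
        for kept in combinations(dimensions, size):
            key = tuple(row[d] if d in kept else ALL for d in dimensions)
            expanded.append(key)
    return expanded


# One registration fact: a man from Norway.
fact = {"gender": "MALE", "country": "NO"}
keys = cube(fact, ["gender", "country"])
```

Incrementing a counter for each of the four keys updates the grand total, the per-gender total, the per-country total, and the gender-by-country cell from a single input row.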
The cure
Fortunately, ClickHouse has a tool for this problem: the serialized state of aggregate functions. You can scan a chunk of data once and save the result. This is a killer feature. Three years ago we did exactly this on Spark and Hadoop, and it is great that the best minds at Yandex implemented an analogue in ClickHouse.
Slow query
Suppose we have a slow query: count the unique users for today and yesterday.
SELECT uniq(user_id)
FROM table
WHERE dt IN (today(), yesterday())
Physically, we can run a SELECT for yesterday's state, get its binary representation, and save it somewhere.
SELECT uniq(user_id), 'xxx' AS ts, uniqState(user_id) AS state
FROM table
WHERE dt = yesterday()
For today, we only change the condition so that it covers today: 'yyy' AS ts and WHERE dt = today(). We label the two timestamps 'xxx' and 'yyy'. The two saved states can then be merged into the final answer:
SELECT uniqMerge(state)
FROM aggregate_table
WHERE ts IN ('xxx', 'yyy')
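The idea behind saving and merging states can be imitated in Python with sets. This is a deliberately simplified sketch: `uniq_state` and `uniq_merge` are hypothetical stand-ins, and a frozen set is an assumption made for clarity. Real ClickHouse states are compact binary structures, and `uniq` is an approximate function, so they do not store every ID.

```python
def uniq_state(user_ids):
    """Build a mergeable intermediate state for a distinct count (here: a frozen set)."""
    return frozenset(user_ids)


def uniq_merge(states):
    """Combine saved states and finalize them into a distinct count."""
    merged = set()
    for state in states:
        merged |= state
    return len(merged)


# Each period is scanned once and its state is saved under a label.
saved = {
    "xxx": uniq_state(["alice", "bob"]),  # yesterday
    "yyy": uniq_state(["bob", "carol"]),  # today
}

# Later queries merge saved states instead of rescanning the raw rows.
two_day_uniques = uniq_merge([saved["xxx"], saved["yyy"]])
```

Because the states keep enough information to be combined, the user present on both days is counted once, and yesterday's data never has to be scanned again.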
:
, - .
. , , , , ClickHouse, : «, ! , !»
, , .
, . . â SQL-, . , , .

, - time series. : , , , time series.
time series : , , timestamp . , , . . , , , â , . , , ClickHouse -, , .
, , ClickHouse:
â « », â .
time series 2 , 20 20-80 . . ClickHouse
GraphiteMergeTree , time series, .
8 ClickHouse , 6 - , 2 : 2 â , .
1.8 . ,
500 . , 1,8 , 500 ! .
Hadoop
2 . .
3 , CPU â
4 . , .
Process
, , , . , , ClickHouse 3 000 . , , , overkill.
, , . ClickHouse,
. , , , . , 8 3â4 . â .
Present â
, ? time series,
time series , , , .
Drop Detect â SQL : SQL- , , .
Anomaly Detection â . , , 2% , â 40, , , , .
â , , - , Anomaly Detection.
Anomaly Detection
, time series . : , , . time series
. , , . ,
drop detection â , .
UI.

. - , â . -, .
Present
, ,
.
, : 1000 â alarm, 0 â alarm. .
Anomaly Detection , . Anomaly Detection
Exasol , ClickHouse. Anomaly Detection 2 , .
, , 4 .
,
, , . ,
, . ,
.
HighLoad++ , HighLoad++ - . , , :)
, PHP Russia , , . , , , 1,8 /, , 1 .