Whenever a user performs an action in the Spotify client, such as listening to a song or searching for an artist, a small piece of information (an event) is sent to our servers. Event delivery, the process of transporting that information safely and reliably from clients all over the world to our central processing system, is an interesting problem. In this series of articles we look at some of the solutions we have implemented in this area. More precisely, we look at the architecture of our new event delivery system and explain why we decided to deploy it on Google Cloud.
In this first article we explain how our current event delivery system works and share some of the lessons we learned from operating it. In the next one we cover the design of the new system and why we chose Cloud Pub/Sub as the transport mechanism for all events. In the third and final article we explain how we process all events with Dataflow and how quickly that happens.

Events delivered through our delivery system have many uses. Most of our product design decisions are based on the results of A/B tests, which in turn must rely on large-scale, accurate data. The Discover Weekly playlist, released in 2015, quickly became one of Spotify's most used features. It is built on music playback data. Year in Music, Spotify Party, and many other Spotify features are also built on data. In addition, Spotify data is one of the sources used to compile the Billboard charts.
The event delivery system is one of the foundational parts of Spotify's data infrastructure. Its key requirement is to deliver all data with predictable latency and availability and to expose it to our developers through a well-defined interface. Usage data can be described as a set of structured events generated at some point in time in response to predefined actions.
Most of the events used at Spotify are generated directly by Spotify clients in response to specific user actions. Whenever an event occurs in a Spotify client, information about it is sent to one of the Spotify gateways, which writes the event to the system log. There it is assigned the timestamp that the event delivery system uses. Because we have no control over events before they reach our servers, we decided to use the syslog timestamp rather than a client-side timestamp, so that we can guarantee specific latency and completeness for delivery.
At Spotify, all data must be delivered to a central Hadoop cluster. The Spotify servers that collect the data are located in several data centers on two continents. Bandwidth between data centers is a scarce resource, so data transfer has to be handled with great care.
The data interface is defined by the location of the data in Hadoop and the format in which it is stored. All data delivered by our service is written to HDFS in Avro format. Delivered data is split into 60-minute (hourly) partitions. This is a relic of the past, when the first event delivery system was based on the scp command and copied hourly syslog files from all servers to Hadoop. Since all data processing jobs at Spotify currently operate on hourly data, this interface will stay with us for the foreseeable future.
Most data jobs at Spotify read data from an hourly bucket only once. The output of some jobs serves as the input of others, forming long chains of transformations. Once a job has processed an hour of data, it no longer checks whether the source data for that hour has changed. If the data does change, the only way to propagate the change downstream is to manually rerun all dependent jobs (and their dependent jobs) for that specific hour. This is an expensive and time-consuming process, which is why we require that the event delivery service never backfill data into an hourly bucket once it has exposed that bucket. This problem, known as the data completeness problem, is at odds with the requirement to minimize data processing latency.
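As a rough illustration of the hourly interface described above, the sketch below maps a syslog timestamp to the hourly bucket an event would land in. The path layout, the `hourly_bucket` name, and the `song_played` event type are hypothetical; the text only says that delivered data is partitioned by hour in HDFS.

```python
from datetime import datetime, timezone


def hourly_bucket(event_type: str, syslog_ts: datetime) -> str:
    """Return a hypothetical HDFS directory for the hour that contains syslog_ts."""
    if syslog_ts.tzinfo is None:
        raise ValueError("syslog_ts must be timezone-aware")
    # Normalise to UTC so that every host agrees on which hour an event belongs to.
    ts = syslog_ts.astimezone(timezone.utc)
    # Assumed layout: /events/<type>/<date>/<hour>. The real layout is not given in the text.
    return f"/events/{event_type}/{ts:%Y-%m-%d}/{ts:%H}"


# An event stamped one second before the hour ends still belongs to hour 13.
late_event = datetime(2016, 2, 25, 13, 59, 59, tzinfo=timezone.utc)
print(hourly_bucket("song_played", late_event))
```

Because the bucket is derived from the server-side syslog timestamp, a slow client cannot push an event into an hour that has already been exposed to downstream jobs.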
An interesting perspective on the data completeness problem is presented in Google's Dataflow paper.
The original event delivery system
System architecture
Our original event delivery system was built on top of Kafka 0.7.

In it, the event delivery system is built around the abstraction of hourly files. It is designed to stream log files containing events from the service machines to HDFS. Once all log files for a given hour have been transferred to HDFS, they are converted from tab-separated text to Avro format.
When the system was first built, one feature missing from Kafka 0.7 was the ability of the Kafka Broker cluster to act as reliable persistent storage. This influenced a major design decision: no persistent state is kept between the producer of the data, the Kafka Syslog Producer, and Hadoop. An event is considered safely stored only once it has been written to a file on HDFS.
The problem with events being reliably stored only inside Hadoop is that the Hadoop cluster becomes a single point of failure for the event delivery system. If Hadoop goes down, the entire delivery system stalls. To cope with this, we need to make sure that every service from which we collect events has enough disk space. When Hadoop comes back, we need to "catch up" by transferring all the data as quickly as possible. Recovery time is limited mainly by the bandwidth available between data centers.
The Producer is a daemon that runs on every host from which events are sent to Hadoop. It tails log files and sends batches of log lines to the Kafka Syslog Consumer. The Producer knows nothing about event types or event properties. From its point of view, an event is just a line in a file, and all lines are forwarded to the same channel. This means that events of every type contained in one log file are sent through one channel. In this system, a Kafka topic serves as the channel for transmitting events. After the Producer sends log lines to the Consumer, it has to wait for an acknowledgement (ACK) that the Consumer has successfully stored them in HDFS. Only after receiving the ACK for the lines it sent does the Producer consider them safely stored and move on to the next ones.
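The send-and-wait-for-ACK behaviour described above can be sketched roughly as follows. This is not the actual Producer code: `ship_log_lines`, the `send_batch` callback (which stands in for the round trip through Kafka to the Consumer and back), and the retry limit are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def ship_log_lines(
    lines: Sequence[str],
    send_batch: Callable[[List[str]], bool],
    batch_size: int = 2,
    max_attempts: int = 3,
) -> int:
    """Send lines in order and return the offset of the first unacknowledged line."""
    offset = 0
    while offset < len(lines):
        batch = list(lines[offset:offset + batch_size])
        for _ in range(max_attempts):
            if send_batch(batch):  # True stands in for an ACK from the Consumer
                break
        else:
            # No ACK after all attempts: keep the offset so the batch is resent later.
            return offset
        # ACK received: these lines are considered safely stored in HDFS.
        offset += len(batch)
    return offset
```

The important property is that the offset only advances after an ACK, so a crash or a lost ACK leads to lines being resent rather than dropped.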
To get from the Producer to the Consumer, events have to pass through Kafka Brokers and then through Kafka Groupers. Kafka Brokers are a standard Kafka component, whereas Kafka Groupers are a component we wrote ourselves. Groupers consume all event streams from their local data center and republish them, compressed and efficiently batched, to a single topic that the Consumers then pull from.
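To give a feel for what "compressed and efficiently batched" could mean, here is a minimal sketch that groups log lines into batches and compresses each batch before it would be republished. The use of gzip, the newline-joined payload, and the function names are assumptions; the text does not say which codec or framing the Groupers use.

```python
import gzip
from typing import Iterable, Iterator, List


def batch_and_compress(lines: Iterable[str], batch_size: int) -> Iterator[bytes]:
    """Yield gzip-compressed payloads holding up to batch_size newline-joined lines."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for line in lines:
        batch.append(line)
        if len(batch) == batch_size:
            yield gzip.compress("\n".join(batch).encode("utf-8"))
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield gzip.compress("\n".join(batch).encode("utf-8"))


def decompress(payload: bytes) -> List[str]:
    """Inverse of one payload produced by batch_and_compress."""
    return gzip.decompress(payload).decode("utf-8").split("\n")
```

Batching before compression matters for traffic between data centers: many similar log lines compress far better together than one at a time.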
An extract, transform, load (ETL) job converts the data from a simple tab-separated format to Avro. The job is a regular Hadoop MapReduce job, implemented with the Crunch framework, and it operates on hourly buckets. Before it starts working on a particular hour, it must make sure that all files have been completely transferred.
All Producers continuously send checkpoints, which may contain end-of-file markers. A marker is sent only once, when the Producer concludes that the whole file has been safely stored in Hadoop. The Liveness Monitor continuously polls the service discovery systems in all data centers to determine which service machines were running during a particular hour. To check whether all files for that hour have been fully transferred, the ETL job compares the list of servers it expects data from with the end-of-file markers. If it finds a mismatch, meaning the data has not been completely transferred, it delays processing for that hour.
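The completeness check described above boils down to a set comparison. The sketch below assumes that both inputs are plain collections of host names for a single hour; the function names and host names are hypothetical.

```python
from typing import Iterable, Set


def missing_hosts(expected_hosts: Iterable[str], eof_hosts: Iterable[str]) -> Set[str]:
    """Hosts that were alive during the hour but have not sent an end-of-file marker."""
    return set(expected_hosts) - set(eof_hosts)


def hour_is_complete(expected_hosts: Iterable[str], eof_hosts: Iterable[str]) -> bool:
    """True when every expected host has delivered its end-of-file marker."""
    return not missing_hosts(expected_hosts, eof_hosts)


alive = ["gw1.lon", "gw2.lon", "gw1.sjc"]       # as reported by the Liveness Monitor
markers = ["gw1.lon", "gw1.sjc"]                # end-of-file markers seen so far
print(sorted(missing_hosts(alive, markers)))    # hosts the ETL job is still waiting for
```

Note that a host that died before sending its marker stays in the missing set indefinitely, which is one of the weaknesses of this scheme.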
To make the most of the available mappers and reducers, the ETL job, being a regular Hadoop MapReduce job, needs to know how to shard its input data. The number of mappers and reducers is calculated from the size of the input data. The optimal sharding is calculated from the event counts that the Consumers continuously report.
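One simple way to turn reported event counts into a shard count is shown below. The target of one million events per shard and the cap of 500 shards are made-up numbers for illustration, not values from our system.

```python
import math


def shard_count(event_count: int, events_per_shard: int = 1_000_000, max_shards: int = 500) -> int:
    """Pick a number of shards so that each holds at most events_per_shard events."""
    if events_per_shard < 1 or max_shards < 1:
        raise ValueError("events_per_shard and max_shards must be positive")
    if event_count <= 0:
        return 1  # still schedule one task so an empty hour produces an output
    return min(max_shards, math.ceil(event_count / events_per_shard))


print(shard_count(2_500_000))  # 3 shards for 2.5 million events at the assumed target
```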
Lessons learned
One of the main problems with this design is that a local Producer has to make sure data has been stored in HDFS at a central location before it can consider the data reliably delivered. This means that a Producer on a server on the US west coast needs to know when data has been written to disk in London. Most of the time this works fine, but when data transfer slows down it causes delivery delays that are hard to recover from.
Compare this with a design in which the handover point is in the local data center. Because the network between hosts within a data center is usually very reliable, this simplifies the design of the Producer.
Problems aside, we are quite happy to have built a system that can reliably deliver more than 700,000 events per second from around the world. Redesigning the system also gave us an opportunity to improve our software development process.
By sending all events together through a single channel, we lost the flexibility to manage event streams with different quality of service (QoS). It also limited real-time use cases, because any real-time consumer has to tap into the single channel that carries the entire stream and filter out only the events it needs.
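The cost of a single channel is easy to see in code: a real-time consumer that cares about one event type still has to read and discard everything else. The sketch below assumes, purely for illustration, that each log line is tab-separated with the event type in the first field.

```python
from typing import Iterable, Iterator


def filter_firehose(lines: Iterable[str], wanted_type: str) -> Iterator[str]:
    """Yield only the lines whose first tab-separated field equals wanted_type."""
    for line in lines:
        event_type, _, _ = line.partition("\t")
        if event_type == wanted_type:
            yield line


firehose = [
    "song_played\tuser1\ttrack9",
    "search\tuser2\tabba",
    "song_played\tuser3\ttrack4",
]
for line in filter_firehose(firehose, "search"):
    print(line)
```

With one topic per event type, this filtering would happen at subscription time instead of inside every consumer.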
Transferring unstructured data adds unnecessary latency because it requires an extra ETL transformation. The ETL job currently adds about 30 minutes of latency to event delivery. If the data were sent in Avro format, it would be available as soon as it was written to HDFS.
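Requiring the sender to track the end of each hour caused problems. For example, when a machine dies it cannot send its end-of-file marker. If an end-of-file marker is missing, processing for that hour waits forever unless someone interrupts it manually. As the number of machines grows, this problem becomes more and more pressing.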
次ã®ã¹ããã
The number of events delivered at Spotify is constantly growing. As the load increased, we started to run into more and more problems. Over time, the number of outages began to worry us. We realized that the system could not cope with the growing load.
次ã®èšäºã§ ãã·ã¹ãã ã®å€æŽã決å®ããæ¹æ³ã«ã€ããŠèª¬æããŸãã
The number of messages handled by the system at a given point in time.