Last summer we published an article about Intel Optane SSD drives and invited everyone to take part in free testing. The campaign attracted a lot of interest: users tried Optane for scientific computing, for in-memory databases, and in machine learning projects.
We had long intended to write a detailed review but never got around to it. Recently, though, a suitable opportunity came up: our colleagues at Intel gave us a new Optane with a capacity of 750 GB for testing. The results of our experiments are described below.
Intel Optane P4800X 750GB: general information and specifications
Intel Optane SSDs are manufactured on a 20 nm process. They come in two form factors: an add-in card (HHHL, CEM 3.0; see here for details) and U.2 15 mm.
Our drive is the add-in card version.
It shows up in the BIOS and is detected by the system without installing any drivers or additional software (the example is from Ubuntu 16.04):
$ lsblk
NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda              8:0    0 149.1G  0 disk
├─sda2           8:2    0     1K  0 part
├─sda5           8:5    0 148.1G  0 part
│ ├─vg0-swap_1 253:1    0   4.8G  0 lvm  [SWAP]
│ └─vg0-root   253:0    0 143.3G  0 lvm  /
└─sda1           8:1    0   976M  0 part /boot
nvme0n1        259:0    0 698.7G  0 disk
詳现æ
å ±ã¯ãnvme-cliãŠãŒãã£ãªãã£ã䜿çšããŠè¡šç€ºã§ããŸãïŒææ°ã®Linuxãã£ã¹ããªãã¥ãŒã·ã§ã³ã®ãªããžããªã«å«ãŸããŠããŸãããéåžžã«å€ãããŒãžã§ã³ã§ããããã
ãœãŒã¹ã³ãŒãããææ°ããŒãžã§ã³ãåéããããšããå§ãã
ãŸã ïŒã
Here are its main technical specifications (taken from the official Intel website):
Characteristic | Value |
Capacity | 750 GB |
Sequential read performance, MB/s | 2500 |
Sequential write performance, MB/s | 2200 |
Random read performance, IOPS | 550000 |
Random write performance, IOPS | 550000 |
Read latency | 10 µs |
Write latency | 10 µs |
Endurance, PBW* | 41.0 |
* PBW stands for petabytes written. It indicates the amount of data that can be written to the drive over its entire life cycle.
At first glance this all looks very impressive. However, we are used to not taking many of the figures in marketing materials on trust (and not without reason), so it is worth verifying them and running some additional experiments.
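As a rough sanity check on the endurance figure, the 41 PBW rating can be converted into full drive writes per day. The sketch below assumes a five-year service period, which is our assumption and is not stated in the table above; the function name `dwpd` is illustrative.

```shell
#!/bin/sh
# Convert a rated endurance (petabytes written) into drive writes per day.
# Arguments: endurance in PB, capacity in GB, service period in years.
dwpd() {
    awk -v pbw="$1" -v cap_gb="$2" -v years="$3" 'BEGIN {
        total_gb = pbw * 1000 * 1000      # 1 PB = 1,000,000 GB (decimal units)
        days = years * 365
        printf "%.1f\n", total_gb / (cap_gb * days)
    }'
}

# 41 PBW and 750 GB come from the table; 5 years is an assumed service period.
dwpd 41.0 750 5
```

Under that assumption the result is 30.0, that is, roughly thirty full overwrites of the drive every day for five years.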
We start with fairly simple synthetic tests and then run tests under conditions as close as possible to real-world use.
Test bench configuration
Our colleagues at Intel (many thanks to them) provided us with a server with the following specifications:
- motherboard: Intel R2208WFTZS;
- processor: Intel Xeon Gold 6154 (24.75M cache, 3.00 GHz);
- memory: 192 GB DDR4;
- Intel SSD DC S3510 (the OS was installed on this drive);
- Intel Optane™ SSD DC P4800X 750GB.
Ubuntu 16.04 with the 4.13 kernel was installed on the server.
泚æããŠãã ããïŒ NVMeãã©ã€ãã®è¯å¥œãªããã©ãŒãã³ã¹ãåŸãã«ã¯ãå°ãªããšã4.10ã®ã«ãŒãã«ããŒãžã§ã³ãå¿
èŠã§ãã 以åã®ããŒãžã§ã³ã®ã«ãŒãã«ã§ã¯ãçµæã¯ããã«æªããªããŸããNVMeãµããŒããé©åã«å®è£
ãããŠããŸããããã¹ãã§ã¯ã次ã®ãœãããŠã§ã¢ã䜿çšããŸããã
- fioãŠãŒãã£ãªãã£ããã£ã¹ã¯ããã©ââãŒãã³ã¹ã枬å®ããåéã®äºå®äžã®æšæºã§ãã
- iovisorãããžã§ã¯ãã®äžéšãšããŠBrendan Greggã«ãã£ãŠéçºããã蚺æããŒã«ã
- Facebookã§äœæããã rocksdbããŒã¿ãŠã§ã¢ããŠã¹ã®ããã©ãŒãã³ã¹ã枬å®ããããã«äœ¿çšãããdb_benchãŠãŒãã£ãªãã£ã
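The kernel requirement noted above can be checked before starting any benchmark. This is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option; the function name `version_at_least` is illustrative.

```shell
#!/bin/sh
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Strip the distribution suffix, e.g. "4.13.0-36-generic" -> "4.13.0".
kernel="$(uname -r | cut -d- -f1)"
if version_at_least "$kernel" "4.10"; then
    echo "kernel $kernel: OK for NVMe testing"
else
    echo "kernel $kernel: older than 4.10, expect worse NVMe results"
fi
```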
Synthetic tests
As mentioned above, we first look at the results of the synthetic tests.
We ran them with fio version 3.3.31, built from source.
Following our methodology, the tests used these load profiles:
- random write/read in 4 KB blocks, queue depth 1;
- random write/read in 4 KB blocks, queue depth 16;
- random write/read in 4 MB blocks, queue depth 32;
- random write/read in 4 KB blocks, queue depth 128.
Here is an example configuration file:
[readtest]
blocksize=4M
filename=/dev/nvme0n1
rw=randread
direct=1
buffered=0
ioengine=libaio
iodepth=32
runtime=1200

[writetest]
blocksize=4M
filename=/dev/nvme0n1
rw=randwrite
direct=1
buffered=0
ioengine=libaio
iodepth=32
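The example above covers one profile; the others differ only in block size and queue depth. A small generator such as the following could produce one job file per profile. It is a sketch, not the script used for the measurements: the function name `fio_job` and the output file names are hypothetical, and only options that already appear in the example job file are used. Only read jobs are generated here, since random writes to the raw device destroy any data on it.

```shell
#!/bin/sh
# Print a fio job section for one load profile.
# Arguments: job name, rw mode, block size, queue depth, target device.
fio_job() {
    cat <<EOF
[$1]
blocksize=$3
filename=$5
rw=$2
direct=1
buffered=0
ioengine=libaio
iodepth=$4
runtime=1200
EOF
}

# Block size and queue depth pairs from the profile list:
# 4k at depth 1, 4k at depth 16, 4M at depth 32, 4k at depth 128.
for profile in 4k:1 4k:16 4M:32 4k:128; do
    bs="${profile%%:*}"
    depth="${profile##*:}"
    name="randread-$bs-qd$depth"
    fio_job "$name" randread "$bs" "$depth" /dev/nvme0n1 > "$name.fio"
done
```

Each resulting file can then be passed to fio on its own, for example `fio randread-4k-qd128.fio`.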
Each test ran for 20 minutes. When it finished, we entered all the metrics of interest into a table (see below).
The parameter that interests us most is the number of I/O operations per second (IOPS). For the tests that read and write 4 MB blocks, the table shows bandwidth instead.
For clarity, we show results not only for Optane but also for other NVMe drives: the Intel P4510 and a drive from another manufacturer, Micron:
Drive model | Capacity | randread 4k iodepth=128 (IOPS) | randwrite 4k iodepth=128 (IOPS) | randread 4M iodepth=32 (bandwidth) | randwrite 4M iodepth=32 (bandwidth) | randread 4k iodepth=16 (IOPS) | randwrite 4k iodepth=16 (IOPS) | randread 4k iodepth=1 (IOPS) | randwrite 4k iodepth=1 (IOPS) |
Intel P4800X | 750 GB | 400k | 324k | 2663 | 2382 | 399k | 362k | 373k | 76.1k |
Intel P4510 | 1 TB | 335k | 179k | 2340 | 504 | 142k | 143k | 12.3k | 73.5k |
Micron MTFDHAX1T6MCE | 1.6 TB | 387k | 201k | 2933 | 754 | 80.6k | 146k | 8425 | 27.4k |
As you can see, in some tests Optane shows figures several times higher than the other drives in the same tests.
However, IOPS alone is clearly not enough to judge disk performance more or less objectively. On its own this parameter means nothing without another one: latency.
Latency is the time it takes to complete an I/O request sent by an application. It is measured with the same fio utility. When all the tests finish, fio prints the following output to the console (only a small fragment is shown):
Jobs: 1 (f=1): [w(1),_(11)][100.0%][r=0KiB/s,w=953MiB/s][r=0,w=244k IOPS][eta 00m:00s]
writers: (groupid=0, jobs=1): err= 0: pid=14699: Thu Dec 14 11:04:48 2017
  write: IOPS=46.8k, BW=183MiB/s (192MB/s)(699GiB/3916803msec)
    slat (nsec): min=1159, max=12044k, avg=2379.65, stdev=3040.91
    clat (usec): min=7, max=12122, avg=168.32, stdev=98.13
     lat (usec): min=11, max=12126, avg=170.75, stdev=97.11
    clat percentiles (usec):
     |  1.00th=[   29],  5.00th=[   30], 10.00th=[   40], 20.00th=[   47],
     | 30.00th=[  137], 40.00th=[  143], 50.00th=[  151], 60.00th=[  169],
     | 70.00th=[  253], 80.00th=[  281], 90.00th=[  302], 95.00th=[  326],
     | 99.00th=[  363], 99.50th=[  379], 99.90th=[  412], 99.95th=[  429],
     | 99.99th=[  457]
次ã®ã¹ããããã«æ³šæããŠãã ããã
slat (nsec): min=1159, max=12044k, avg=2379.65, stdev=3040.91
clat (usec): min=7, max=12122, avg=168.32, stdev=98.13
 lat (usec): min=11, max=12126, avg=170.75, stdev=97.11
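These are the latency values obtained during the test. The ones of most interest to us are slat and clat: slat is the time taken to submit a request (a parameter that reflects the performance of the Linux I/O subsystem rather than the disk), and clat is the so-called completion latency, i.e. the time the device takes to execute a request it has received (this is the parameter that matters to us). How to analyze these figures is described in detail in this article, published five years ago; it has not lost its relevance.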
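The figures in the fio fragment above can be cross-checked with two simple relations: bandwidth is IOPS multiplied by block size, and, by Little's law, the average number of requests in flight is IOPS multiplied by average latency. The sketch below assumes a 4096-byte block size for that run, which the fragment itself does not state; the function names are illustrative.

```shell
#!/bin/sh
# Bandwidth in MiB/s from IOPS and block size in bytes.
bandwidth_mib() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.0f\n", iops * bs / (1024 * 1024) }'
}

# Average number of requests in flight (Little's law) from IOPS and
# average total latency in microseconds.
in_flight() {
    awk -v iops="$1" -v lat_us="$2" 'BEGIN { printf "%.1f\n", iops * lat_us / 1000000 }'
}

# IOPS=46.8k and lat avg=170.75 us are taken from the fio output above;
# the 4096-byte block size is an assumption.
bandwidth_mib 46800 4096
in_flight 46800 170.75
```

The first line prints 183, which agrees with the BW=183MiB/s reported by fio, so the block-size assumption looks reasonable. The second prints 8.0, suggesting that about eight requests were in flight on average during that run.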
Fio is a widely accepted and well-regarded utility, but in practice there are situations where you need more precise latency information and have to find the likely causes when this indicator is too high. Tools for more precise diagnostics are being developed as part of the iovisor project (see the repository on GitHub). All of these tools are based on the eBPF mechanism (extended Berkeley Packet Filter). They trace the I/O operations performed in the system and measure their latency.
This is very useful when you have performance problems with a disk that serves a large number of read and write requests (for example, a disk holding the database of a heavily loaded web project).
We started with the simplest option: we ran the standard fio tests and measured the latency of each operation with biosnoop running in another terminal. While it runs, biosnoop writes the following table to standard output:
TIME(s)        COMM   PID    DISK     T  SECTOR      BYTES  LAT(ms)
300.271456000  fio    34161  nvme0n1  W  963474808   4096   0.01
300.271473000  fio    34161  nvme0n1  W  1861294368  4096   0.01
300.271491000  fio    34161  nvme0n1  W  715773904   4096   0.01
300.271508000  fio    34161  nvme0n1  W  1330778528  4096   0.01
300.271526000  fio    34161  nvme0n1  W  162922568   4096   0.01
300.271543000  fio    34161  nvme0n1  W  1291408728  4096   0.01
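The table consists of 8 columns:
- TIME: the time of the operation, in seconds;
- COMM: the name of the process that performed the operation;
- PID: the PID of the process that performed the operation;
- DISK: the device the operation was issued to;
- T: the type of operation (R for read, W for write);
- SECTOR: the sector the operation was addressed to;
- BYTES: the size of the block that was written or read;
- LAT(ms): the latency of the operation.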
We took many measurements on different drives and noticed the following: on Optane the latency stayed unchanged throughout the entire test (the tests lasted from 20 minutes to 4 hours) and matched the 10 µs value stated in the table above, whereas the other drives showed fluctuations.
Based on the synthetic test results, it is quite reasonable to assume that Optane will perform well under high load and, most importantly, with low latency. We therefore decided not to stop at pure synthetics and to run tests under a real (or at least as close to real as possible) load.
To do this we used the performance measurement tools that ship with RocksDB, an interesting and increasingly popular key-value store developed at Facebook. Below we describe the tests in detail and analyze the results.
Optane and RocksDB: performance tests
Why RocksDB
In recent years, demand for fault-tolerant storage of large volumes of data has grown sharply. Such storage is used in many areas: social networks, corporate information systems, instant messengers, cloud storage, and so on. Software for this kind of storage is, as a rule, built on so-called LSM trees; examples include Big Table, HBase, Cassandra, LevelDB, Riak, MongoDB and InfluxDB. Working with them involves serious load, including on the disk subsystem (see here for an example). Optane, with its endurance and reliability, could be a very suitable solution.
RocksDB (see the GitHub repository) is a key-value store developed at Facebook; it is a fork of the well-known LevelDB project. It is used for a wide range of tasks, from serving as a storage engine for MySQL to caching application data.
We chose it for our tests for the following reasons:
- RocksDB is positioned as a store designed specifically for fast drives, including NVMe;
- RocksDB is used successfully in heavily loaded Facebook projects;
- RocksDB ships with interesting test utilities that generate a very serious load (details below);
- finally, we simply wanted to see how Optane, with its reliability and stability, would cope with heavy load.
All the tests described below were run on two drives:
- Intel Optane SSD 750 GB;
- Micron MTFDHAX1T6MCE.
Preparing for the tests: compiling RocksDB and creating the database
We compiled RocksDB from the source code published on GitHub (here and below, the example commands are for Ubuntu 16.04):
$ sudo apt-get install libgflags-dev libsnappy-dev zlib1g-dev libbz2-dev liblz4-dev libzstd-dev gcc g++ clang make git
$ git clone https://github.com/facebook/rocksdb/
$ cd rocksdb
$ make all
After installation, you need to prepare the drive that the test data will be written to.
The official RocksDB documentation recommends the XFS file system, so we create one on our Optane:
$ sudo apt-get install xfsprogs
$ sudo mkfs.xfs -f /dev/nvme0n1
$ sudo mkdir /mnt/rocksdb
$ sudo mount -t xfs /dev/nvme0n1 /mnt/rocksdb
That completes the preparatory work, and we can move on to creating the database.
RocksDB is not a DBMS in the classical sense: to create a database you need to write a small program in C or C++. Examples of such programs (1 and 2) are available in the examples directory of the official RocksDB repository. The source code needs a small change to specify the correct path to the database. In our case it looks like this:
$ cd rocksdb/examples
$ vi simple_example.cc
In this file, find the line
std::string kDBPath = "/tmp/rocksdb_simple_example";
and replace the value with the path to our database:
std::string kDBPath = "/mnt/rocksdb/testdb1";
After that, compile and run the example:
$ make
$ ./simple_example
Running this command creates a database in the specified directory. In the tests we will write data to it (and read data from it). We test with the db_bench utility; the corresponding binary is in the root directory of RocksDB.
The test methodology is described in detail on the official project wiki page.
If you read the text at that link carefully, you will see that the point of the test is to write one billion keys to the database (and then read data back from it). The total volume of data is about 800 GB. Our Optane has a capacity of only 750 GB, so we cut the number of keys exactly in half: 500 million instead of one billion. That is enough to demonstrate what Optane can do.
In our case the volume of data written is about 350 GB.
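The reasoning behind halving the key count can be reproduced with simple arithmetic. The sketch below multiplies the number of pairs by the key and value sizes used in these tests (10 and 800 bytes, as given in the test description further on). The function name `raw_size_gb` is illustrative.

```shell
#!/bin/sh
# Raw size of a key-value data set in decimal gigabytes:
# number of pairs x (key bytes + value bytes), ignoring metadata and compression.
raw_size_gb() {
    awk -v n="$1" -v key="$2" -v val="$3" 'BEGIN { printf "%.0f\n", n * (key + val) / 1e9 }'
}

raw_size_gb 1000000000 10 800   # one billion pairs, as in the wiki methodology
raw_size_gb 500000000 10 800    # the halved key count used here
```

This prints 810 and 405. The first figure does not fit on a 750 GB drive, which is why the key count was reduced. These are raw payload sizes; the volume actually stored on disk depends on compression and storage overhead, so it will differ from them.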
All of this data is stored in SST format (short for Sorted String Table; see this article). The output is several thousand so-called SST files (see here for details).
Before starting the test, you need to raise the limit on the number of files that can be open simultaneously in the system, otherwise nothing will work: roughly 15 to 20 minutes after the test starts, the message "Too many open files" appears.
To make sure everything runs properly, first check the current limit with the ulimit command and its -n option:
$ ulimit -n
By default the limit is 1024 files. To avoid problems, we raise it straight to one million:
$ ulimit -n 1000000
泚ïŒåèµ·ååŸããã®å¶éã¯ä¿åããããããã©ã«ãå€ã«æ»ããŸãã
以äžã§æºåäœæ¥ã¯å®äºã§ãã ãã¹ããçŽæ¥èšè¿°ããçµæãåæããŸãã
ãã¹ãã®èª¬æ
ã¯ããã«
äžèšã®ãªã³ã¯ã§èª¬æããææ³ã«åºã¥ããŠã次ã®ãã¹ããå®æœããŸããã
- é£ç¶ããé åºã§ã®ãã«ã¯ããŒã®ããŒãã
- ã©ã³ãã ãªããŒã®äžæ¬èªã¿èŸŒã¿ã
- ã©ã³ãã èšé²;
- ã©ã³ãã èªã¿åãã
ãã¹ãŠã®ãã¹ãã¯db_benchãŠãŒãã£ãªãã£ã䜿çšããŠå®è¡ãããŸããããã®ãŠãŒãã£ãªãã£ã®ãœãŒã¹ã³ãŒãã¯
rocksdbãªããžããªã«ãããŸã ã
åããŒã®ãµã€ãºã¯10ãã€ãã§ããµã€ãºã¯800ãã€ãã§ãã
åãã¹ãã®çµæããã詳现ã«æ€èšããŠãã ããã
Test 1. Bulk load of keys in sequential order
To run this test we used the same parameters as in the instructions at the link above. We changed only the number of keys written (as already mentioned): 500,000,000 instead of 1,000,000,000.
At the start the database is empty; it is filled during the test. No data is read while the data is being loaded.
The db_bench command that runs the test looks like this:
bpl=10485760;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; \
mbc=20; mb=67108864;wbs=134217728; sync=0; r=50000000 t=1; vs=800; \
bs=4096; cs=1048576; of=500000; si=1000000; \
./db_bench \
  --benchmarks=fillseq --disable_seek_compaction=1 --mmap_read=0 \
  --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs \
  --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 \
  --open_files=$of --verify_checksum=1 --sync=$sync --disable_wal=1 \
  --compression_type=none --stats_interval=$si --compression_ratio=0.5 \
  --write_buffer_size=$wbs --target_file_size_base=$mb \
  --max_write_buffer_number=$wbn --max_background_compactions=$mbc \
  --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay \
  --level0_stop_writes_trigger=$stop --num_levels=$levels \
  --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz \
  --stats_per_interval=1 --max_bytes_for_level_base=$bpl \
  --use_existing_db=0 --db=/mnt/rocksdb/testdb
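The command contains many options that need some comment; the same options are used in the subsequent tests. At the beginning we set the values of the important parameters:
- bpl: the maximum number of bytes per level;
- mcz: the minimum level at which compression is applied;
- del: the period after which obsolete files are deleted;
- levels: the number of levels;
- ctrig: the number of files at which compaction is triggered;
- delay: the number of level-0 files at which writes are slowed down;
- stop: the number of level-0 files at which writes are stopped;
- wbn: the maximum number of write buffers;
- mbc: the maximum number of background compactions;
- mb: the target file size;
- wbs: the write buffer size;
- sync: enables or disables synchronization;
- r: the number of key-value pairs written to the database;
- t: the number of threads;
- vs: the value size;
- bs: the block size;
- cs: the cache size;
- of: the number of open files (not sufficient on its own; see the note on the open-file limit above);
- si: how often statistics are collected.
You can read about the remaining parameters by running the command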
./db_bench --help
A detailed description of all the options is also available here.
What results did the test show? The sequential load completed in 23 minutes, at a write speed of 536.78 MB/s.
For comparison: on the Micron NVMe drive the same procedure took a little over 30 minutes, at a write speed of 380.31 MB/s.
Test 2. Bulk load of keys in random order
To test random writes we used the following db_bench settings (here is the full listing of the command):
bpl=10485760;mcz=2;del=300000000;levels=2;ctrig=10000000; delay=10000000; stop=10000000; wbn=30; mbc=20; \
mb=1073741824;wbs=268435456; sync=0; r=50000000; t=1; vs=800; bs=65536; cs=1048576; of=500000; si=1000000; \
./db_bench \
  --benchmarks=fillrandom --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 \
  --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 \
  --cache_numshardbits=4 --open_files=$of --verify_checksum=1 \
  --sync=$sync --disable_wal=1 --compression_type=zlib --stats_interval=$si --compression_ratio=0.5 \
  --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn \
  --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels \
  --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz \
  --stats_per_interval=1 --max_bytes_for_level_base=$bpl --memtablerep=vector --use_existing_db=0 \
  --disable_auto_compactions=1 --allow_concurrent_memtable_write=false --db=/mnt/rocksdb/testb1
This test took 1 hour 6 minutes, at a write speed of 273.36 MB/s. On the Micron the test ran for 3 hours 30 minutes and the write speed fluctuated; the average was 49.7 MB/s.
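To put the two bulk-load results side by side, the write speeds reported above can be turned into simple ratios. This is only arithmetic on the published figures; the function name `speedup` is illustrative.

```shell
#!/bin/sh
# Ratio of two throughput figures, rounded to two decimals.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 536.78 380.31   # test 1, sequential load: Optane vs Micron
speedup 273.36 49.7     # test 2, random load: Optane vs Micron
```

The output is 1.41 and 5.50: Optane's advantage in write speed is moderate for the sequential load and much larger for the random one.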
Test 3. Random write
In this test we overwrote 500 million keys in the previously created database.
Here is the full listing of the db_bench command:
bpl=10485760;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; \
mbc=20; mb=67108864;wbs=134217728; sync=0; r=500000000; t=1; vs=800; \
bs=65536; cs=1048576; of=500000; si=1000000; \
./db_bench \
  --benchmarks=overwrite --disable_seek_compaction=1 --mmap_read=0 --statistics=1 \
  --histogram=1 --num=$r --threads=$t --value_size=$vs --block_size=$bs \
  --cache_size=$cs --bloom_bits=10 --cache_numshardbits=4 --open_files=$of \
  --verify_checksum=1 --sync=$sync --disable_wal=1 \
  --compression_type=zlib --stats_interval=$si --compression_ratio=0.5 \
  --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn \
  --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop \
  --num_levels=$levels --delete_obsolete_files_period_micros=$del \
  --min_level_to_compress=$mcz --stats_per_interval=1 \
  --max_bytes_for_level_base=$bpl --use_existing_db=1 --db=/mnt/rocksdb/testdb
This test gave a very good result: 2 hours 51 minutes at a speed of 49 MB/s (at times it dropped to 38 MB/s).
On the Micron the test took slightly longer: 3 hours 16 minutes. The speed was about the same, but the fluctuations were more pronounced.
Test 4. Random read
The point of this test is to read 500 million keys from the database in random order. Below is the full listing of the db_bench commands with all their options:
bpl=10485760;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; stop=12; wbn=3; \
mbc=20; mb=67108864;wbs=134217728; sync=0; r=500000000; t=1; vs=800; \
bs=4096; cs=1048576; of=500000; si=1000000; \
./db_bench \
  --benchmarks=fillseq --disable_seek_compaction=1 --mmap_read=0 \
  --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs \
  --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 \
  --open_files=$of --verify_checksum=1 --sync=$sync --disable_wal=1 \
  --compression_type=none --stats_interval=$si --compression_ratio=0.5 \
  --write_buffer_size=$wbs --target_file_size_base=$mb \
  --max_write_buffer_number=$wbn --max_background_compactions=$mbc \
  --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay \
  --level0_stop_writes_trigger=$stop --num_levels=$levels \
  --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz \
  --stats_per_interval=1 --max_bytes_for_level_base=$bpl \
  --use_existing_db=0

bpl=10485760;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4; delay=8; \
stop=12; wbn=3; mbc=20; mb=67108864;wbs=134217728; sync=0; r=500000000; \
t=32; vs=800; bs=4096; cs=1048576; of=500000; si=1000000; \
./db_bench \
  --benchmarks=readrandom --disable_seek_compaction=1 --mmap_read=0 \
  --statistics=1 --histogram=1 --num=$r --threads=$t --value_size=$vs \
  --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 \
  --open_files=$of --verify_checksum=1 --sync=$sync --disable_wal=1 \
  --compression_type=none --stats_interval=$si --compression_ratio=0.5 \
  --write_buffer_size=$wbs --target_file_size_base=$mb \
  --max_write_buffer_number=$wbn --max_background_compactions=$mbc \
  --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay \
  --level0_stop_writes_trigger=$stop --num_levels=$levels \
  --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz \
  --stats_per_interval=1 --max_bytes_for_level_base=$bpl \
  --use_existing_db=1
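[Most of this passage was lost in extraction. The surviving fragments refer to db_bench, to 32 (the thread count set by t=32 in the readrandom command above), and to figures of "5 … 2" for Optane and "6" for the Micron drive, whose units did not survive.]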
Conclusion
[The concluding text did not survive extraction. The surviving fragments mention the Intel Optane SSD 750 GB, Intel, and IMDT (Intel Memory Drive).]