The kernel is the root of all evil
These days nobody is surprised to see epoll() / kqueue() used in an event poller.
There are plenty of solutions to the C10K problem (libevent / libev / libuv), with varying performance and fairly high overhead. This article looks at using DPDK to handle 10 million connections (C10M) and at getting maximum performance out of network request processing in common application-level solutions. The defining features of this task are delegating responsibility for traffic processing from the OS kernel to user space, precise control over interrupt handling and DMA channels, the use of VFIO, and many other not very clear words. Java Netty, with the Disruptor pattern and an off-heap cache, was chosen as the target application environment.
In short, this is a very efficient way to process traffic, with performance close to that of existing hardware solutions. The overhead of the facilities provided by the OS kernel itself is too high, and for tasks like this it is the source of most problems. The difficulty lies in support on the part of the target network interface drivers and in the architecture of the application as a whole.
The article covers in detail installing, configuring, using, debugging, profiling, and deploying DPDK to build high-performance solutions.
There are also Netmap, OpenOnload, and pf_ring.
netmap
The main goal in developing netmap was an easy-to-use solution, so it provides the most common synchronous interface, select(), which greatly simplifies porting existing software.
In terms of flexibility and hardware abstraction, netmap clearly lacks functionality. Nevertheless, it is the most accessible and widespread solution (even under the godless Windows). Netmap now ships directly with FreeBSD, and support on the libpcap side has become quite good. It is a University of Pisa project developed by Luigi Rizzo and Alessio Faina. Naturally there is no commercial support to speak of, but it is built so that there is nothing to fall off.
pf_ring
pf_ring appeared as a way to "overclock" pcap; historically, at the time it was developed there was no stable, ready-to-use solution. It has few obvious advantages over netmap, although the proprietary ZC version supports IOMMU. The product itself has never stood out for performance or quality. It is no more than a means of collecting and analyzing pcap dumps and was never meant for processing traffic in user applications.
The main feature of pf_ring ZC is complete independence from the existing network interface drivers.
OpenOnload
OpenOnload is a highly specialized, high-performance, long-established network stack from SolarFlare, which manufactures branded 10/40GbE adapters for HP, IBM, Lenovo, and Stratus. Unfortunately, OpenOnload itself does not support every existing SolarFlare adapter.
The main feature of OpenOnload is a complete replacement for the BSD sockets API, including the epoll() mechanism. Yes, your nginx can now break the 38 Gbit barrier without any third-party modifications.
SolarFlare provides commercial support and has many respectable customers.
I do not know how OpenOnload fares with virtualization, but if you are sitting in containers behind an nginx balancer, it is the simplest and most accessible solution, with no unpleasant trouble. Buy it, use it, and pray that it never crashes, or you will not get away without debugging.
Others
There are also solutions from Napatech but, as far as I know, they simply have a library with their own API, without the kind of cleverness SolarFlare offers.
Naturally, I have not covered every existing solution; I could not possibly have come across all of them. But I doubt the rest differ much from what is described above.
DPDK
Historically, the most common adapters for 10/40GbE have been Intel adapters served by the e1000, igb, ixgbe, and i40e drivers, so they are frequent targets for high-performance traffic-processing tools. That is what happened with Netmap and pf_ring, whose developers are probably good friends.
It would have been strange if Intel had not started developing its own traffic-processing tool. That tool is DPDK.
DPDK is an open-source project from Intel on which whole companies (6WIND) have been built and for which hardware manufacturers (Mellanox and others) supply drivers. Naturally, commercial support for DPDK-based solutions is excellent and is offered by a fairly large number of vendors (6WIND, Aricent, ALTEN Calsoft Labs, Advantech, Brocade, Radisys, Tieto, Wind River, Lanner, Mobica).
DPDK has the broadest functionality and abstracts the existing hardware best.
It was not built to be convenient. It was built to be flexible enough to reach high, possibly maximum, performance.
List of supported drivers and cards
Intel, all the drivers that exist in the Linux kernel:
- e1000 (82540, 82545, 82546)
- e1000e (82571..82574, 82583, ICH8..ICH10, PCH..PCH2)
- igb (82575..82576, 82580, I210, I211, I350, I354, DH89xx)
- ixgbe (82598..82599, X540, X550)
- i40e (X710, XL710)
- fm10k
All of them have been ported as poll mode drivers that run in user space (usermode).
Anything else?
Indeed, yes, there is also support for:
- virtualization based on QEMU, Xen, VMware ESXi
- paravirtualized network interfaces based on buffer copying (these are evil)
- AF_PACKET sockets and PCAP dumps for testing
- network adapters with ring buffers
DPDK architecture
* This is how it works in my head, so reality may differ slightly.
DPDK itself consists of a set of libraries (the contents of the lib folder):
- librte_acl - access control lists (ACL) for VLANs
- librte_compat - export of binary interface (ABI) compatibility
- librte_ether - controls Ethernet adapters and works with Ethernet frames
- librte_ivshmem - buffer sharing through ivshmem
- librte_kvargs - parsing of key-value arguments
- librte_mbuf - message buffer management (message buffer, mbuf)
- librte_net - a piece of a BSD-ish IP stack with ARP / IPv4 / IPv6 / TCP / UDP / SCTP
- librte_power - power and frequency management (cpufreq)
- librte_sched - hierarchical QoS scheduler
- librte_vhost - virtual network adapters
- librte_cfgfile - parsing of configuration files
- librte_distributor - a means of distributing packets among existing tasks
- librte_hash - hash functions
- librte_jobstats - measurement of task execution time
- librte_lpm - longest prefix match functions, used for lookups in forwarding tables
- librte_mempool - manager of in-memory object pools
- librte_pipeline - pipeline of the packet framework
- librte_reorder - reordering of packets in a message buffer
- librte_table - implementation of lookup tables
- librte_cmdline - parsing of command-line arguments
- librte_eal - platform-dependent environment
- librte_ip_frag - IP packet fragmentation
- librte_kni - API for interacting with KNI
- librte_malloc - easy to guess
- librte_meter - QoS metering
- librte_port - port implementation for network packets
- librte_ring - lock-free ring FIFO queues
- librte_timer - timers and counters
UIO drivers for network interfaces on Linux (lib/librte_eal/linuxapp):
- igb_uio - Ethernet network adapters
- xen_dom0 - the name says it all
and for BSD.
There are also the poll mode drivers (PMD) mentioned earlier, which run in user space (userspace): e1000, e1000e, igb, ixgbe, i40e, fm10k, and others.
Kernel Network Interface (KNI) is a special driver that lets you interact with the kernel network API, issue ioctl calls against the ports of interfaces driven by DPDK, and manage them with the usual utilities (ethtool, ifconfig, tcpdump).
As you can see, compared with netmap and the other solutions, DPDK offers a great many advantages for working with the hardware and for implementing SDN.
Target system requirements and fine-tuning
The main recommendations from the official documentation are translated and supplemented here.
Configuring the XEN and VMware hypervisors for use with DPDK is not covered.
General
If you are building DPDK for the Intel Communications Chipset 89xx, separate instructions apply.
To build, you need coreutils, gcc, kernel headers, and glibc headers.
clang is supported, and it seems Intel's icc is supported as well.
Running the helper scripts requires Python 2.6 / 2.7.
The Linux kernel must be compiled with UIO support and with monitoring of process address spaces. These are the kernel options:
- CONFIG_UIO
- CONFIG_UIO_PDRV
- CONFIG_UIO_PDRV_GENIRQ
- CONFIG_UIO_PCI_GENERIC
- CONFIG_PROC_PAGE_MONITOR
Note that grsecurity considers PROC_PAGE_MONITOR too informative: it helps in exploiting kernel vulnerabilities and bypassing ASLR.
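A quick way to verify these options is to scan the kernel configuration text. The sketch below is only an illustration: it assumes you have already read the config into a string (for example from a file such as /boot/config-<version>, whose location varies by distribution), and the function name and sample excerpt are hypothetical.

```python
# Kernel options named in the text above.
REQUIRED_OPTIONS = [
    "CONFIG_UIO",
    "CONFIG_UIO_PDRV",
    "CONFIG_UIO_PDRV_GENIRQ",
    "CONFIG_UIO_PCI_GENERIC",
    "CONFIG_PROC_PAGE_MONITOR",
]


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' or 'm'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Hypothetical excerpt of a kernel config, for illustration only.
sample = """\
CONFIG_UIO=m
CONFIG_UIO_PDRV_GENIRQ=m
CONFIG_UIO_PCI_GENERIC=m
# CONFIG_PROC_PAGE_MONITOR is not set
"""

print(missing_options(sample))
```

An empty list means every option from the list is built in or available as a module.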
To arrange high-precision periodic interrupts you need the HPET timer.
You can check whether it is available with:
grep hpet /proc/timer_list
Enable it in the BIOS:
Advanced -> PCH-IO Configuration -> High Precision Timer
Then build the kernel with CONFIG_HPET and CONFIG_HPET_MMAP enabled.
By default, HPET support is disabled in DPDK itself, so you have to enable it by manually setting the CONFIG_RTE_LIBEAL_USE_HPET flag in the config/common_linuxapp file.
In some cases HPET is the better choice, in others TSC. A high-performance solution needs both, because they serve different purposes and compensate for each other's shortcomings. TSC is usually the default.
The HPET timer is initialized and checked for availability by calling rte_eal_hpet_init(int make_default) from <rte_cycles.h>. Oddly, the API documentation leaves this out.
Core isolation
To take load off the system scheduler, it is common practice to isolate logical processor cores for the needs of high-performance applications. This applies especially to dual-processor systems.
If your application runs on the even cores 2, 4, 6, 8, 10, you can add a kernel parameter in your favourite bootloader:
isolcpus=2,4,6,8,10
For the widespread grub, this is the GRUB_CMDLINE_LINUX_DEFAULT parameter in the /etc/default/grub config.
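The parameter is easy to generate from a list of cores instead of typing it by hand. This is a minimal sketch; both helper names are hypothetical, and the "quiet splash" string merely stands in for whatever your kernel command line already contains.

```python
def isolcpus_param(cores):
    """Format logical core IDs as an isolcpus= kernel parameter."""
    unique = sorted(set(cores))
    if any(core < 0 for core in unique):
        raise ValueError("core IDs must be non-negative")
    return "isolcpus=" + ",".join(str(core) for core in unique)


def append_to_cmdline(cmdline, param):
    """Append a parameter to a kernel command line string, only once."""
    parts = cmdline.split()
    if param not in parts:
        parts.append(param)
    return " ".join(parts)


param = isolcpus_param([2, 4, 6, 8, 10])
# "quiet splash" is a placeholder for an existing command line.
print(append_to_cmdline("quiet splash", param))
```

The resulting string is what would go into the bootloader's kernel command line setting.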
巚倧ããŒãž
ãããã¯ãŒã¯ãããã¡ã«ã¡ã¢ãªãå²ãåœãŠãã«
ã¯ã倧ããªããŒãžãå¿
èŠã§ãã ä»®æ³ã¡ã¢ãªã¢ãã¬ã¹ã
TLBã«å€æããããã«å¿
èŠãªåŒã³åºããå°ãªãããã倧ããªããŒãžã匷調衚瀺ãããšããã©ãŒãã³ã¹ã«ãã©ã¹ã®å¹æããããŸãã 確ãã«ãæçåãé¿ããããã«ãã«ãŒãã«ãããŒãããããã»ã¹ã§éç«ã£ãŠããã¯ãã§ãã
ãããè¡ãã«ã¯ãã«ãŒãã«ãã©ã¡ãŒã¿ãŒã远å ããŸãã
hugepages = 1024
ããã«ããã1024ããŒãžã®2MBãå²ãåœãŠãããŸãã
4ã®ã¬ãã€ãã®ããŒãžãéžæããã«ã¯ïŒ
default_hugepagesz = 1G hugepagesz = 1G hugepages = 4
ãã ããé©åãªãµããŒããå¿
èŠã§ã
ã/proc/cpuinfoã®ããã»ããµãââã©ã°
pdpe1gb ã
grep pdpe1gb /proc/cpuinfo | uniq
For 64-bit applications, using 1 GB pages is recommended.
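The arithmetic behind these parameters, and the pdpe1gb check, can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes an x86-64 system where the default huge page size is 2 MB; the cpuinfo text is passed in as a string rather than read from /proc.

```python
# Huge page sizes discussed above, in megabytes.
PAGE_SIZES_MB = {"2M": 2, "1G": 1024}


def hugepage_total_mb(count, size="2M"):
    """Total memory reserved by `count` huge pages of the given size."""
    return count * PAGE_SIZES_MB[size]


def hugepage_cmdline(count, size="2M"):
    """Kernel parameters that reserve the pages at boot time."""
    if size == "2M":
        # Assumes 2 MB is the default huge page size (x86-64).
        return "hugepages=%d" % count
    return "default_hugepagesz=%s hugepagesz=%s hugepages=%d" % (size, size, count)


def supports_1g_pages(cpuinfo_text):
    """True if the first 'flags' line of cpuinfo lists pdpe1gb."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return "pdpe1gb" in line.split(":", 1)[1].split()
    return False


print(hugepage_total_mb(1024), "MB from", hugepage_cmdline(1024))
print(hugepage_total_mb(4, "1G"), "MB from", hugepage_cmdline(4, "1G"))
```

So hugepages=1024 reserves 2048 MB, and the 1 GB variant above reserves 4096 MB.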
To see how the pages are distributed across the nodes of a NUMA system, you can use the following command:
cat /sys/devices/system/node/node*/meminfo | fgrep Huge
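If you want those numbers in a script rather than on screen, the per-node lines are easy to parse. The sketch below assumes the usual "Node N HugePages_X: value" layout of the per-node meminfo files; the function name and the sample values are made up for illustration.

```python
import re

# Matches lines such as "Node 0 HugePages_Total:   512".
_LINE = re.compile(r"^Node\s+(\d+)\s+(HugePages_\w+):\s+(\d+)")


def hugepages_per_node(meminfo_text):
    """Collect the HugePages_* counters for each NUMA node."""
    nodes = {}
    for line in meminfo_text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            node = int(match.group(1))
            nodes.setdefault(node, {})[match.group(2)] = int(match.group(3))
    return nodes


# Made-up sample in the assumed format.
sample = """\
Node 0 HugePages_Total:   512
Node 0 HugePages_Free:    500
Node 0 HugePages_Surp:      0
Node 1 HugePages_Total:   512
Node 1 HugePages_Free:    512
Node 1 HugePages_Surp:      0
"""

for node, counters in sorted(hugepages_per_node(sample).items()):
    print(node, counters["HugePages_Total"], counters["HugePages_Free"])
```

A node with far fewer free pages than the others is a hint that buffers are being allocated on one socket only.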
For managing the policies for allocating and freeing huge pages on NUMA systems, see the official documentation.
To support huge pages, the kernel must be built with the CONFIG_HUGETLBFS option.
Memory regions allocated to huge pages are managed by the Transparent Hugepage mechanism, which runs its optimizations in a separate kernel thread, khugepaged. To support it, build the kernel with CONFIG_TRANSPARENT_HUGEPAGE and one of the policies CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS or CONFIG_TRANSPARENT_HUGEPAGE_MADVISE.
This mechanism stays relevant even when huge pages are allocated at boot, because for various reasons it may still be impossible to allocate contiguous memory regions for 2 MB pages.
There is an epic write-up on NUMA and memory from Intel.
There is also a short article from Red Hat on using huge pages.
Once the pages are configured and allocated, they have to be mounted. To do that, add a suitable mount point to /etc/fstab:
nodev /mnt/huge hugetlbfs defaults 0 0
For 1 GB pages, the page size has to be given as an additional parameter:
nodev /mnt/huge hugetlbfs pagesize=1GB 0 0
In my personal experience, most of the problems in setting up and using DPDK come precisely from huge pages. Managing huge pages deserves special attention.
By the way, on Power8 the huge page sizes are 16 MB and 16 GB, which strikes me as a little excessive.
Power management
DPDK already has tools for controlling processor frequency, so that the standard policies do not get in the way.
To use them, SpeedStep and C3 / C6 have to be enabled.
In the BIOS the path to these settings may look like this:
Advanced -> Processor Configuration -> Enhanced Intel SpeedStep Tech
Advanced -> Processor Configuration -> Processor C3
Advanced -> Processor Configuration -> Processor C6
The l3fwd-power application is an example of an L3 switch that uses the power management features.
Access rights
Clearly, running an application with root privileges is far from safe.
It is better to use ACLs to create permissions for a dedicated user group:
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /dev/hpet
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rwx /mnt/huge
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /dev/uio0
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rw- /sys/class/uio/uio0/device/config
setfacl -su::rwx,g::rwx,o:---,g:dpdk:rwx /sys/class/uio/uio0/device/resource*
This gives the dpdk user group full access to the resources in use and to the uio0 device.
Firmware
For 40GbE network adapters, processing small packets is a rather hard task, and Intel adds further optimizations from one firmware release to the next. Support for FLV3E-series firmware is implemented in DPDK 2.2-rc2, but so far the best version is 4.2.6. You can contact your vendor's support, ask Intel directly for the update, or update it yourself.
Extended tag, request size, and read descriptors of PCIe devices
The PCIe bus parameters extended_tag and max_read_request_size have a large effect on the processing speed of small packets (around 100 bytes) on 40GbE adapters. In some BIOS versions they can be set manually: for 100-byte packets, "1" and 128 bytes respectively.
The values can be set in config/common_linuxapp when building DPDK, using the following parameters:
- CONFIG_RTE_PCI_CONFIG
- CONFIG_RTE_PCI_EXTENDED_TAG
- CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE
Alternatively, use the setpci and lspci commands.
Note the difference between the MAX_REQUEST and MAX_PAYLOAD parameters of PCIe devices: only MAX_REQUEST is present in the config.
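To make the difference between the two fields concrete, here is a small sketch that decodes them from a raw value of the PCIe Device Control register, as it could be read with setpci. It relies on the standard PCIe layout (Max_Payload_Size in bits 7:5, Extended Tag Field Enable in bit 8, Max_Read_Request_Size in bits 14:12, each size encoded as 128 << code). The function names and the sample value 0x2810 are hypothetical.

```python
VALID_SIZES = [128, 256, 512, 1024, 2048, 4096]


def decode_device_control(devctl):
    """Decode the size fields and extended tag bit of Device Control."""
    return {
        # Bits 7:5, Max_Payload_Size (codes 6 and 7 are reserved).
        "max_payload": 128 << ((devctl >> 5) & 0x7),
        # Bit 8, Extended Tag Field Enable.
        "extended_tag": bool(devctl & 0x100),
        # Bits 14:12, Max_Read_Request_Size (codes 6 and 7 are reserved).
        "max_read_request": 128 << ((devctl >> 12) & 0x7),
    }


def with_max_read_request(devctl, size):
    """Return devctl with only the Max_Read_Request_Size field changed."""
    if size not in VALID_SIZES:
        raise ValueError("size must be one of %r" % VALID_SIZES)
    code = VALID_SIZES.index(size)
    return (devctl & 0xFFFF & ~(0x7 << 12)) | (code << 12)


# 0x2810 is an illustrative register value, not taken from real hardware.
print(decode_device_control(0x2810))
print(hex(with_max_read_request(0x2810, 128)))
```

The second helper leaves the payload field untouched, which mirrors the point above: the build config only deals with the read request size.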
For the i40e driver it makes sense to reduce the read descriptor size to 16 bytes. To do so, set the CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC parameter in config/common_linuxapp or config/common_bsdapp. You can also set the minimum interval between the handling of write interrupts, CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL, depending on your priorities: maximum throughput or per-packet latency.
The Mellanox mlx4 driver has similar parameters:
- CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N
- CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE
- CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE
- CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS
These probably affect performance in some way.
All the other network adapter parameters relate to debug modes, which allow very fine-grained profiling and debugging of the target application; more on that later.
IOMMU for Intel VT-d
The kernel has to be built with the options:
- CONFIG_IOMMU_SUPPORT
- CONFIG_IOMMU_API
- CONFIG_INTEL_IOMMU
The igb_uio driver requires the boot option:
iommu=pt
This gives correct translation of DMA addresses (DMA remapping). IOMMU support for the target network adapter in the hypervisor is switched off in this case.
The IOMMU itself is rather wasteful for high-performance network interfaces. DPDK implements a one-to-one mapping, so full IOMMU support is not needed, although that is one more security hole.
If the INTEL_IOMMU_DEFAULT_ON flag was not set when the kernel was built, use the boot parameter:
intel_iommu=on
This guarantees that the Intel IOMMU is initialized correctly.
Note that using UIO (uio_pci_generic, igb_uio) is optional on kernels that support VFIO (vfio-pci), which implements the functions for interacting with the target network interfaces.
igb_uio is needed when the target network adapter lacks support for some interrupts or for virtual functions; otherwise uio_pci_generic can be used safely.
The igb_uio driver requires the iommu=pt parameter, while the vfio-pci driver works correctly with both iommu=pt and iommu=on.
Because of the way IOMMU groups work, VFIO itself behaves rather oddly. Some devices require all of their ports to be bound to VFIO, others only some of them, and others none at all.
If a device sits behind a PCI-to-PCI bridge, the bridge driver ends up in the same IOMMU group as the target adapter. The bridge driver therefore has to be unloaded so that VFIO can pick up the devices behind the bridge.
You can check the existing devices and the drivers they use with the script:
./tools/dpdk_nic_bind.py --status
You can also bind a driver to specific network devices explicitly:
./tools/dpdk_nic_bind.py --bind=uio_pci_generic 04:00.1
./tools/dpdk_nic_bind.py --bind=uio_pci_generic eth1
Simple and convenient.
Installation
Get the sources and build them as described below.
DPDK itself ships with a set of sample applications that let you check that the system is configured correctly.
As mentioned above, DPDK is configured by setting parameters in the config/common_linuxapp and config/common_bsdapp files. Default values for platform-specific parameters are stored in the config/defconfig_* files.
First the configuration template is applied, and a build folder is created with all the internals and targets:
make config T=x86_64-native-linuxapp-gcc
次ã®ã¿ãŒã²ããç°å¢ã¯
DPDK 2.2ã§å©çšå¯èœã§ãïŒç§çšïŒ
arm-armv7a-linuxapp-gcc arm64-armv8a-linuxapp-gcc arm64-thunderx-linuxapp-gcc arm64-xgene1-linuxapp-gcc i686-native-linuxapp-gcc i686-native-linuxapp-icc ppc_64-power8-linuxapp-gcc tile-tilegx-linuxapp-gcc x86_64-ivshmem-linuxapp-gcc x86_64-ivshmem-linuxapp-icc x86_64-native-bsdapp-clang x86_64-native-bsdapp-gcc x86_64-native-linuxapp-clang x86_64-native-linuxapp-gcc x86_64-native-linuxapp-icc x86_x32-native-linuxapp-gcc
ivshmem is a QEMU mechanism that lets several guest virtual machines share a memory region without copying, through a common dedicated device. Communication between guest OSes normally requires copying into shared memory, but that is not the case with DPDK. ivshmem itself is quite simple.
The purpose of the remaining configuration templates should be obvious; otherwise, why would you be reading this?
Besides the configuration template, there are other optional parameters:
- EXTRA_CPPFLAGS
- EXTRA_CFLAGS
- EXTRA_LDFLAGS
- EXTRA_LDLIBS
- RTE_KERNELDIR
- CROSS
- V=1
- D=1
- O (default `build`)
- DESTDIR (default `/usr/local`)
次ã«ãå€ãè¯ã
make
makeã®ç®æšã®ãªã¹ã
ã¯ããäžè¬çãªãã®ã§ãã
all build clean install uninstall examples examples_clean
åäœããã«ã¯ã
UIOã¢ãžã¥ãŒã«ãããŒãããå¿
èŠããããŸã
sudo modprobe uio_pci_generic
or
sudo modprobe uio
sudo insmod kmod/igb_uio.ko
If VFIO is used:
sudo modprobe vfio-pci
If KNI is used:
insmod kmod/rte_kni.ko
Building and running the samples
DPDK uses two environment variables to build the examples:
- RTE_SDK - path to the folder where DPDK is installed
- RTE_TARGET - name of the configuration template used for the build
They are used in the corresponding Makefiles.
EAL already provides a number of command-line options for configuring an application:
- -c <mask> - hexadecimal mask of the logical cores the application will run on
- -n <number> - number of memory channels per processor
- -b <domain:bus:identifier.function>,... - blacklist of PCI devices
- --use-device <domain:bus:identifier.function>,... - whitelist of PCI devices; cannot be used together with the blacklist
- --socket-mem MB - amount of memory allocated from huge pages per processor socket
- -m MB - amount of memory allocated from huge pages, ignoring the physical location of the processor
- -r <number> - number of memory slots
- -v - version
- --huge-dir - folder where the huge pages are mounted
- --file-prefix - prefix for the files stored in the huge page file system
- --proc-type - type of process instance; used together with --file-prefix to run an application as several processes
- --xen-dom0 - run in Xen domain0 without huge page support
- --vmware-tsc-map - use the TSC counter provided by VMware instead of RDTSC
- --base-virtaddr - base virtual address
- --vfio-intr - type of interrupts used by VFIO
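The -c mask is simply a bit field in which bit N stands for logical core N. A small sketch of the conversion in both directions (the function names are hypothetical):

```python
def core_mask(cores):
    """Build the hexadecimal -c mask from a list of logical core IDs."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core IDs must be non-negative")
        mask |= 1 << core  # bit N corresponds to logical core N
    return format(mask, "x")


def cores_from_mask(mask_hex):
    """List the logical core IDs selected by a hexadecimal mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(core_mask([0, 1, 2, 3]))        # f
print(core_mask([2, 4, 6, 8, 10]))    # 554
print(cores_from_mask("f"))           # [0, 1, 2, 3]
```

So the isolated even cores 2, 4, 6, 8, 10 from the isolcpus example correspond to -c 554.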
To check the core numbering in the system, you can use the lstopo command from the hwloc package.
It is recommended to use all the memory allocated as huge pages, which is the default behaviour when the -m and --socket-mem options are not used. Asking for less contiguous memory than is available in huge pages can lead to an EAL initialization error and, in some cases, to undefined behaviour.
To allocate 1 GB of memory:
- on socket zero, specify --socket-mem=1024
- on socket one, --socket-mem=0,1024
- on sockets zero and two, --socket-mem=1024,0,1024
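The pattern in these three examples is positional: the N-th value in the list belongs to socket N, and sockets that get nothing still need a 0 placeholder up to the last socket used. A minimal sketch that builds the argument from a mapping (the function name is hypothetical):

```python
def socket_mem_arg(mem_by_socket):
    """Build a --socket-mem argument from {socket_id: megabytes}."""
    if not mem_by_socket:
        raise ValueError("at least one socket is required")
    last = max(mem_by_socket)
    # Sockets without an entry get an explicit 0 so positions line up.
    values = [str(mem_by_socket.get(sock, 0)) for sock in range(last + 1)]
    return "--socket-mem=" + ",".join(values)


print(socket_mem_arg({0: 1024}))            # --socket-mem=1024
print(socket_mem_arg({1: 1024}))            # --socket-mem=0,1024
print(socket_mem_arg({0: 1024, 2: 1024}))   # --socket-mem=1024,0,1024
```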
To build and run Hello World:
export RTE_SDK=~/src/dpdk
cd ${RTE_SDK}/examples/helloworld
make
./build/helloworld -cf -n 2
This runs the application on four cores, taking into account that two memory modules are installed.
And we get a Hello World from each of the cores.
The chicken, egg, and pterodactyl problem
I chose Java as the target platform because of the relatively high performance of the virtual machine and the possibility of introducing additional memory-management mechanisms. The question of how to divide responsibilities (where to allocate memory, where to manage threads, how to schedule tasks, and what is special about the DPDK mechanisms) is rather complex and has no single answer.
I had to dig quite deep into the sources of DPDK, Netty, and OpenJDK itself. As a result, specialized versions of the Netty components were developed with very deep DPDK integration.
To be continued.