In the pilot part I described the task in as much detail as I could. The story turned out long and dry, and it did not contain a single line of code. But without understanding the task, optimization is very hard. Some techniques can of course be applied with nothing but the code at hand, for example caching computed values or reducing branching. Others, it seems to me, cannot be applied without understanding the task. This is what separates a person from an optimizing compiler, and it is why manual optimization still plays a large role: the compiler has only the code, while the person understands the task. A compiler cannot decide that the value "4" is random enough, but a person can.

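Let me remind you that we are focusing on optimizing the convolution-based image resize operation in a real-life library, Pillow. I will describe the changes I made several years ago. This will not be a word-for-word retelling, though: the optimizations are presented in the order that suits the narrative. For these articles I created a separate branch in the repository starting from version 2.6.2, and the story continues from that point.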
Testing
If you want to experiment yourself and not just read, the pillow-perf test package will help.
Pillow consists of many modules and cannot be compiled incrementally, so the ccache utility is used to speed up rebuilds considerably. pillow-perf can test many operations, but the one we are interested in is scale. The -n 3 option sets the number of runs of the operation. The code is still slow, so you can pick a smaller number to avoid falling asleep. The initial performance is as follows:
Scale 2560×1600 RGB image
    to 320x200 bil      0.08927 s    45.88 Mpx/s
    to 320x200 bic      0.13073 s    31.33 Mpx/s
    to 320x200 lzs      0.16436 s    24.92 Mpx/s
    to 2048x1280 bil    0.40833 s    10.03 Mpx/s
    to 2048x1280 bic    0.45507 s     9.00 Mpx/s
    to 2048x1280 lzs    0.52855 s     7.75 Mpx/s
    to 5478x3424 bil    1.49024 s     2.75 Mpx/s
    to 5478x3424 bic    1.84503 s     2.22 Mpx/s
    to 5478x3424 lzs    2.04901 s     2.00 Mpx/s
Result for commit bf1df9a.
These results differ slightly from those in the official benchmark for version 2.6. There are several reasons:
- The official benchmark uses 64-bit Ubuntu 16.04 with GCC 5.3. I will use 32-bit Ubuntu 14.04 with GCC 4.8, the setup on which I first made all these optimizations. The reason will become clear by the end of the article.
- In the articles I start from a commit that fixes a bug which is unrelated to optimization but does affect performance.
Code structure
Most of the code we are interested in is in the file Antialias.c, in the function ImagingStretch. The code of this function can be divided into three parts.
As I said earlier, a convolution-based resize can be done in two passes: the first changes only the width of the image and the second only the height, or the other way round. A single call to ImagingStretch can do only one of them. In fact, it is called twice for every resize. The function runs a common prologue and then, depending on its parameters, performs one operation or the other. It is a rather unusual way of getting rid of repeated code (in this case, the prologue).
Inside, the two passes look roughly the same, adjusted for the direction of processing. For brevity I will show only one:
for (yy = 0; yy < imOut->ysize; yy++) {
The code branches on the pixel formats that Pillow supports: single-channel 8-bit (grayscale), multi-channel 8-bit (RGB, RGBA, LA, CMYK and some others), single-channel 32-bit, and float. We are interested in the loop body for multi-channel 8-bit images, because this is the most common image format.
Optimization 1: Using the cache efficiently
I said above that the two passes are similar, but there is one clear difference between them. Look at the vertical pass:
for (yy = 0; yy < imOut->ysize; yy++) {
And at the horizontal pass:
for (xx = 0; xx < imOut->xsize; xx++) {
In the vertical pass the inner loop iterates over the columns of the final image, while in the horizontal pass it iterates over the rows. The horizontal pass is a serious problem for the processor cache. Each step of the inner loop accesses one row lower, which means it requests a value that lies far away in memory from the value needed at the previous step. This is bad when the convolution is small. On modern processors the cache line that the processor requests from RAM is always 64 bytes. So if fewer than 16 pixels (at 4 bytes each) take part in the convolution, part of the data is dragged from RAM into the cache for nothing. Now imagine that the loops are swapped, and the next pixel to be convolved is not in the row below but is the next pixel in the same row. Then most of the required pixels would already be in the cache.
The second negative effect of organizing the code this way shows up with long convolution rows, that is, with strong downscaling. For neighbouring convolutions the source pixels overlap heavily, so it would be good if that data stayed in the cache. But as we move from top to bottom, the data for the old convolutions is gradually evicted from the cache by the data for the new ones. As a result, once the inner loop has finished and the next outer step begins, the upper rows are no longer in the cache: they have all been replaced by lower rows and have to be fetched from memory again. And by the time we get back to the lower rows, everything in the cache has been replaced by upper ones. It is a cycle in which the cache never holds the data we need.
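The effect of traversal order can be illustrated with a minimal sketch that is not taken from Pillow. The two functions below (hypothetical names, assuming a tightly packed single-channel image) compute the same sum, but the first walks down columns, as the horizontal pass did, while the second walks along rows, so consecutive reads fall into the same cache line.

```c
#include <stddef.h>
#include <stdint.h>

/* Walks down each column first: consecutive reads are `width` bytes apart,
   so each read tends to touch a different cache line. */
uint64_t sum_column_major(const uint8_t *img, size_t width, size_t height) {
    uint64_t total = 0;
    for (size_t x = 0; x < width; x++)
        for (size_t y = 0; y < height; y++)
            total += img[y * width + x];
    return total;
}

/* Walks along each row first: consecutive reads are adjacent in memory,
   so one fetched cache line serves many iterations. */
uint64_t sum_row_major(const uint8_t *img, size_t width, size_t height) {
    uint64_t total = 0;
    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            total += img[y * width + x];
    return total;
}
```

Both functions return the same value; only the memory access pattern differs, and any timing difference depends on image size and hardware.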
Why was the traversal written this way? In the pseudocode above, the second line in both cases is the computation of the convolution coefficients. For the vertical pass the coefficients depend only on the row of the final image (the yy value), and for the horizontal pass they depend on the current column (the xx value). So in the horizontal pass we cannot simply swap the two loops: the coefficients have to be computed inside the xx loop. And if we start computing coefficients inside the inner loop, all the performance is lost, especially when the coefficients come from the Lanczos filter, which uses trigonometric functions.
We cannot compute the coefficients at every step, yet they are the same for all the pixels in one column. So we can compute all the coefficients for all the columns in advance and simply read the precomputed values in the inner loop. Let's do that.
The code allocates memory for the coefficients:
k = malloc(kmax * sizeof(float));
Now we need an array of such arrays. This can be simplified, though: allocate one flat block of memory and emulate two-dimensional addressing.
kk = malloc(imOut->xsize * kmax * sizeof(float));
We also need somewhere to store xmin and xmax, which depend on xx as well. Let's create an array for them too, so that they are not recomputed.
xbounds = malloc(imOut->xsize * 2 * sizeof(float));
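The loop also uses the value ww, which is needed to normalize the result of the convolution: ww = 1 / ∑k[x]. It does not have to be stored at all if we normalize the coefficients themselves instead of the convolution result. That is, after computing the coefficients we go over them once more and divide each one by their sum, so that all the coefficients add up to 1: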
k = &kk[xx * kmax];
for (x = (int) xmin; x < (int) xmax; x++) {
    float w = filterp->filter((x - center + 0.5) * ss);
    k[x - (int) xmin] = w;
    ww = ww + w;
}
for (x = (int) xmin; x < (int) xmax; x++) {
    k[x - (int) xmin] /= ww;
}
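The precomputation described above can be put together as a self-contained sketch. It is not Pillow's code: the helper names (precompute_coeffs, triangle_filter, floor_int, ceil_int) are hypothetical, a triangle kernel stands in for filterp->filter, and the bounds are stored as ints for simplicity.

```c
#include <stdlib.h>

/* Triangle (bilinear) kernel with support 1.0, a stand-in for filterp->filter. */
static float triangle_filter(float x) {
    if (x < 0.0f) x = -x;
    return x < 1.0f ? 1.0f - x : 0.0f;
}

/* Integer floor and ceil without libm, valid for values in int range. */
static int floor_int(float v) { int i = (int) v; return v < (float) i ? i - 1 : i; }
static int ceil_int(float v)  { int i = (int) v; return v > (float) i ? i + 1 : i; }

/* Fills a flat table of out_size * kmax normalized coefficients and a table of
   [xmin, xmax) bounds, two ints per output column. Returns kmax, or 0 if an
   allocation fails. The caller frees *kk_out and *bounds_out. */
int precompute_coeffs(int in_size, int out_size, float **kk_out, int **bounds_out) {
    float scale = (float) in_size / out_size;
    float filterscale = scale < 1.0f ? 1.0f : scale;
    float support = 1.0f * filterscale;            /* kernel support is 1.0 */
    int kmax = ceil_int(support) * 2 + 1;          /* longest possible window */
    float *kk = malloc((size_t) out_size * kmax * sizeof(float));
    int *bounds = malloc((size_t) out_size * 2 * sizeof(int));
    if (!kk || !bounds) { free(kk); free(bounds); return 0; }

    for (int xx = 0; xx < out_size; xx++) {
        float center = (xx + 0.5f) * scale;
        int xmin = floor_int(center - support);
        int xmax = ceil_int(center + support);
        if (xmin < 0) xmin = 0;
        if (xmax > in_size) xmax = in_size;

        float *k = &kk[xx * kmax];                 /* emulated 2D addressing */
        float ww = 0.0f;
        for (int x = xmin; x < xmax; x++) {
            float w = triangle_filter((x - center + 0.5f) / filterscale);
            k[x - xmin] = w;
            ww += w;
        }
        for (int x = xmin; x < xmax; x++)
            k[x - xmin] /= ww;                     /* coefficients now sum to 1 */
        bounds[xx * 2 + 0] = xmin;
        bounds[xx * 2 + 1] = xmax;
    }
    *kk_out = kk;
    *bounds_out = bounds;
    return kmax;
}
```

With the tables filled once, the convolution loops only read k and the bounds, so the order of the yy and xx loops no longer matters for the coefficient computation.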
Now we can finally turn the pixel traversal by 90°.
Scale 2560×1600 RGB image
    to 320x200 bil      0.04759 s    86.08 Mpx/s    87.6 %
    to 320x200 bic      0.08970 s    45.66 Mpx/s    45.7 %
    to 320x200 lzs      0.11604 s    35.30 Mpx/s    41.6 %
    to 2048x1280 bil    0.24501 s    16.72 Mpx/s    66.7 %
    to 2048x1280 bic    0.30398 s    13.47 Mpx/s    49.7 %
    to 2048x1280 lzs    0.37300 s    10.98 Mpx/s    41.7 %
    to 5478x3424 bil    1.06362 s     3.85 Mpx/s    40.1 %
    to 5478x3424 bic    1.32330 s     3.10 Mpx/s    39.4 %
    to 5478x3424 lzs    1.56232 s     2.62 Mpx/s    31.2 %
Result for commit d35755c.
The fourth column shows the speedup relative to the previous version, and the commit linked below the table shows exactly what was changed.
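For readers who want to check the columns, here is a small sketch of the arithmetic. The formulas are my assumption, chosen because they reproduce the published numbers: throughput is taken as source megapixels per second, and the speedup as the ratio of the old time to the new time minus one.

```c
/* Throughput in megapixels of the source image per second. */
double mpx_per_second(int width, int height, double seconds) {
    return (double) width * height / 1e6 / seconds;
}

/* Speedup of the new timing over the old one, in percent. */
double speedup_percent(double old_seconds, double new_seconds) {
    return (old_seconds / new_seconds - 1.0) * 100.0;
}
```

For the 320x200 bilinear row, mpx_per_second(2560, 1600, 0.08927) gives about 45.88, and speedup_percent(0.08927, 0.04759) gives about 87.6.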
Optimization 2: Limiting the output values
In several places the code contains the following construct:
if (ss < 0.5)
    imOut->image[yy][xx*4+b] = (UINT8) 0;
else if (ss >= 255.0)
    imOut->image[yy][xx*4+b] = (UINT8) 255;
else
    imOut->image[yy][xx*4+b] = (UINT8) ss;
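This limits the pixel value to the range [0, 255] in case the result of the computation does not fit into 8 bits. Indeed, the sum of all the positive convolution coefficients can be greater than one, and the sum of all the negative ones can be less than zero, so with certain source images the result can overflow. The overflow is a consequence of compensating for sharp changes in brightness and is not an error.
Look at the code. There is one input variable, ss, and one output, imOut->image[yy], whose value is assigned in several places. The bad part is that floating-point numbers are being compared. It is faster to convert everything to an integer first and compare afterwards, since an integer result is what we need in the end anyway. All together, we get this function: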
static inline UINT8
clip8(float in)
{
    int out = (int) in;
    if (out >= 255)
        return 255;
    if (out <= 0)
        return 0;
    return (UINT8) out;
}
Usage:
imOut->image[yy][xx*4+b] = clip8(ss);
This also improves performance, though not by much.
Scale 2560×1600 RGB image
    to 320x200 bil      0.04644 s    88.20 Mpx/s     2.5 %
    to 320x200 bic      0.08157 s    50.21 Mpx/s    10.0 %
    to 320x200 lzs      0.11131 s    36.80 Mpx/s     4.2 %
    to 2048x1280 bil    0.22348 s    18.33 Mpx/s     9.6 %
    to 2048x1280 bic    0.28599 s    14.32 Mpx/s     6.3 %
    to 2048x1280 lzs    0.35462 s    11.55 Mpx/s     5.2 %
    to 5478x3424 bil    0.94587 s     4.33 Mpx/s    12.4 %
    to 5478x3424 bic    1.18599 s     3.45 Mpx/s    11.6 %
    to 5478x3424 lzs    1.45088 s     2.82 Mpx/s     7.7 %
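Result for commit 54d3b9d.
As you can see, this optimization had a bigger effect for filters with a smaller window and for larger output resolutions (the only exception is 320x200 bilinear, and I won't try to explain why). Indeed, the smaller the filter window and the larger the final resolution, the larger the share of the total time taken by the value clipping that we have just optimized.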
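As a sanity check, the sketch below compares the original float-comparison clamp with clip8 over a range of inputs. The names clip_float_compare and count_mismatches are hypothetical, and the UINT8 typedef is a stand-in for the one Pillow defines. The check covers only finite values that fit in an int, because converting a larger float to int is undefined behaviour in C.

```c
#include <stdint.h>

typedef uint8_t UINT8;   /* stand-in for Pillow's UINT8 typedef */

/* The original clamp: three floating-point comparisons. */
static UINT8 clip_float_compare(float ss) {
    if (ss < 0.5)
        return (UINT8) 0;
    else if (ss >= 255.0)
        return (UINT8) 255;
    else
        return (UINT8) ss;
}

/* The replacement: one conversion to int, then integer comparisons. */
static inline UINT8 clip8(float in) {
    int out = (int) in;
    if (out >= 255)
        return 255;
    if (out <= 0)
        return 0;
    return (UINT8) out;
}

/* Counts inputs in [-500, 500], in steps of 1/8, where the two disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int i = -4000; i <= 4000; i++) {
        float v = i / 8.0f;
        if (clip_float_compare(v) != clip8(v))
            mismatches++;
    }
    return mismatches;
}
```

Both versions truncate toward zero inside the valid range and saturate outside it, so on this range they are expected to agree everywhere.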
Optimization 3: Unrolling loops with a constant number of iterations
If we look at the horizontal step again, we can count as many as four nested loops:
for (yy = 0; yy < imOut->ysize; yy++) {
We iterate over every row and every column of the output image (that is, over every pixel), and inside that over every pixel of the source image that has to be convolved. But what is b? b is the iteration over the image channels. Obviously the number of channels does not change within the function and never exceeds 4 (because of the way Pillow stores images). So there are only four possible cases, and given that single-channel 8-bit images are stored differently, only three. We can therefore write three separate inner loops, for 2, 3 and 4 channels, and branch to the one that matches the number of channels. To save space I will show only the code for the three-channel case:
for (xx = 0; xx < imOut->xsize; xx++) { if (imIn->bands == 4) {
We don't have to stop there: the branch can be moved one level up, above the xx loop:
if (imIn->bands == 4) { for (xx = 0; xx < imOut->xsize; xx++) {
Scale 2560×1600 RGB image
    to 320x200 bil      0.03885 s   105.43 Mpx/s    19.5 %
    to 320x200 bic      0.05923 s    69.15 Mpx/s    37.7 %
    to 320x200 lzs      0.09176 s    44.64 Mpx/s    21.3 %
    to 2048x1280 bil    0.19679 s    20.81 Mpx/s    13.6 %
    to 2048x1280 bic    0.24257 s    16.89 Mpx/s    17.9 %
    to 2048x1280 lzs    0.30501 s    13.43 Mpx/s    16.3 %
    to 5478x3424 bil    0.88552 s     4.63 Mpx/s     6.8 %
    to 5478x3424 bic    1.08753 s     3.77 Mpx/s     9.1 %
    to 5478x3424 lzs    1.32788 s     3.08 Mpx/s     9.3 %
Result for commit 95a9e30.
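The idea of replacing the loop over b with a branch taken once, outside the pixel loop, can be sketched as follows. This is an illustration rather than the code from the commit: convolve_pixel is a hypothetical helper, and it assumes pixels packed as 4 bytes each, with the accumulators named ss0 to ss3 as in the article's snippets.

```c
#include <stdint.h>

/* Convolves the pixels [xmin, xmax) of one packed row (4 bytes per pixel)
   with the coefficients k. The branch on `bands` is taken once, and each
   branch has a loop body with a fixed number of channels. */
void convolve_pixel(const uint8_t *row, const float *k, int xmin, int xmax,
                    int bands, float out[4]) {
    float ss0 = 0.0f, ss1 = 0.0f, ss2 = 0.0f, ss3 = 0.0f;
    if (bands == 4) {
        for (int x = xmin; x < xmax; x++) {
            float w = k[x - xmin];
            ss0 += row[x * 4 + 0] * w;
            ss1 += row[x * 4 + 1] * w;
            ss2 += row[x * 4 + 2] * w;
            ss3 += row[x * 4 + 3] * w;
        }
    } else if (bands == 3) {
        for (int x = xmin; x < xmax; x++) {
            float w = k[x - xmin];
            ss0 += row[x * 4 + 0] * w;
            ss1 += row[x * 4 + 1] * w;
            ss2 += row[x * 4 + 2] * w;
        }
    } else {
        /* Generic fallback with a channel loop, kept only for completeness. */
        float ss[4] = {0.0f, 0.0f, 0.0f, 0.0f};
        for (int x = xmin; x < xmax; x++)
            for (int b = 0; b < bands && b < 4; b++)
                ss[b] += row[x * 4 + b] * k[x - xmin];
        ss0 = ss[0]; ss1 = ss[1]; ss2 = ss[2]; ss3 = ss[3];
    }
    out[0] = ss0; out[1] = ss1; out[2] = ss2; out[3] = ss3;
}
```

With the channel count fixed inside each loop body, the compiler can keep the accumulators in registers and has no inner loop counter to maintain.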
Something similar can be done for the vertical pass. At the moment the code there looks like this:
for (xx = 0; xx < imOut->xsize*4; xx++) {
    ss = 0.0;
    for (y = (int) ymin; y < (int) ymax; y++)
        ss = ss + (UINT8) imIn->image[y][xx] * k[y-(int) ymin];
    ss = ss * ww + 0.5;
    imOut->image[yy][xx] = clip8(ss);
}
There is no separate iteration over channels here. Instead, xx iterates over the width multiplied by 4, so xx visits every channel regardless of how many the image actually has. The FIXME in the comment says exactly that this needs to be fixed. It is fixed in the same way, by branching the code on the number of channels in the source image. I won't show the code here; the link to the commit is below.
Scale 2560×1600 RGB image
    to 320x200 bil      0.03336 s   122.80 Mpx/s    16.5 %
    to 320x200 bic      0.05439 s    75.31 Mpx/s     8.9 %
    to 320x200 lzs      0.08317 s    49.25 Mpx/s    10.3 %
    to 2048x1280 bil    0.16310 s    25.11 Mpx/s    20.7 %
    to 2048x1280 bic    0.19669 s    20.82 Mpx/s    23.3 %
    to 2048x1280 lzs    0.24614 s    16.64 Mpx/s    23.9 %
    to 5478x3424 bil    0.65588 s     6.25 Mpx/s    35.0 %
    to 5478x3424 bic    0.80276 s     5.10 Mpx/s    35.5 %
    to 5478x3424 lzs    0.96007 s     4.27 Mpx/s    38.3 %
Result for commit f227c35.
As you can see, the change to the horizontal pass helped more when downscaling, and the change to the vertical pass helped more when upscaling.
Optimization 4: Integer counters
for (y = (int) ymin; y < (int) ymax; y++) {
    ss0 = ss0 + (UINT8) imIn->image[y][xx*4+0] * k[y-(int) ymin];
    ss1 = ss1 + (UINT8) imIn->image[y][xx*4+1] * k[y-(int) ymin];
    ss2 = ss2 + (UINT8) imIn->image[y][xx*4+2] * k[y-(int) ymin];
}
If you look at the innermost loop, you will see that the variables ymin and ymax are declared as float but are converted to integers at every step. What is more, outside the loop their values are assigned using the floor and ceil functions. So in practice the variables always hold integers, yet for some reason they are declared as floating point. The same applies to xmin and xmax. Let's change the types and measure.
Scale 2560×1600 RGB image
    to 320x200 bil      0.03009 s   136.10 Mpx/s    10.9 %
    to 320x200 bic      0.05187 s    78.97 Mpx/s     4.9 %
    to 320x200 lzs      0.08113 s    50.49 Mpx/s     2.5 %
    to 2048x1280 bil    0.14017 s    29.22 Mpx/s    16.4 %
    to 2048x1280 bic    0.17750 s    23.08 Mpx/s    10.8 %
    to 2048x1280 lzs    0.22597 s    18.13 Mpx/s     8.9 %
    to 5478x3424 bil    0.58726 s     6.97 Mpx/s    11.7 %
    to 5478x3424 bic    0.74648 s     5.49 Mpx/s     7.5 %
    to 5478x3424 lzs    0.90867 s     4.51 Mpx/s     5.7 %
Result for commit 57e8925.
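A minimal sketch of the integer-bounds idea, assuming a single-channel column for brevity: the window bounds are converted to int once, and the inner loop then runs with no casts at all. The names window_bounds and convolve_column are hypothetical, and the truncating casts stand in for floor and ceil, which is valid here only because the values are clamped to be non-negative first.

```c
#include <stdint.h>

/* Computes the half-open window [*ymin, *ymax) once, as integers. */
void window_bounds(float center, float support, int size, int *ymin, int *ymax) {
    float lo = center - support;
    float hi = center + support;
    if (lo < 0.0f) lo = 0.0f;
    if (hi < 0.0f) hi = 0.0f;                 /* keeps the cast below valid */
    *ymin = (int) lo;                         /* floor for non-negative lo */
    *ymax = (int) hi + (hi > (float) (int) hi ? 1 : 0);   /* ceil */
    if (*ymax > size) *ymax = size;
}

/* The inner loop uses int bounds directly: no per-iteration conversions. */
float convolve_column(const uint8_t *col, const float *k, int ymin, int ymax) {
    float ss = 0.0f;
    for (int y = ymin; y < ymax; y++)
        ss += col[y] * k[y - ymin];
    return ss;
}
```

The float-to-int conversions that previously ran on every iteration now run twice per window.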
The final act and the boss
I admit I was very pleased with the results. The code had become 2.5 times faster on average. And to get this speedup, users of the library did not need to install any extra hardware: the resize still ran on the same single core of the same processor as before. All that was needed was to upgrade Pillow to version 2.7.
But there was still some time left before the 2.7 release, and I was impatient to try the new code on the server where it was going to run. I ported the code, compiled it, and at first thought I had messed something up:
Scale 2560×1600 RGB image
    320x200 bil      0.08056 s    50.84 Mpx/s
    320x200 bic      0.16054 s    25.51 Mpx/s
    320x200 lzs      0.24116 s    16.98 Mpx/s
    2048x1280 bil    0.18300 s    22.38 Mpx/s
    2048x1280 bic    0.31103 s    13.17 Mpx/s
    2048x1280 lzs    0.43999 s     9.31 Mpx/s
    5478x3424 bil    0.75046 s     5.46 Mpx/s
    5478x3424 bic    1.22468 s     3.34 Mpx/s
    5478x3424 lzs    1.70451 s     2.40 Mpx/s
Result for commit 57e8925. Obtained on a different machine and not included in the comparison.
What?! The results were almost the same as before the optimizations. I rechecked everything ten times and added prints to make sure the right code was running. It was not a side effect of Pillow or of the environment: the difference reproduced even in a minimal 30-line example. I asked a question on Stack Overflow and eventually found a clear pattern: the code ran slowly when it was compiled with GCC for a 64-bit platform. And that was exactly the difference between my local Ubuntu and the server: the local system was 32-bit.
Well, glory to Moore, I was not going crazy: it was a real bug in the compiler. The bug had been fixed in GCC 4.9, but Ubuntu 14.04 LTS, which was current at the time, shipped GCC 4.8, so that was the version most users of the library were likely to have. It could not be ignored: what good is an optimization if it does not work for most people, including in the production environment it was made for? I updated the question on SO and called for help on Twitter. That is how Vyacheslav Egorov, one of the developers of the V8 engine and a genius of optimization, came across it, got to the bottom of the problem and found a solution.
To understand the essence of the problem, we need to look into the history of processors and their current architecture. Once upon a time x86 processors could not work with floating-point numbers, so a coprocessor with the x87 instruction set was invented for them. It executed instructions from the same stream as the central processor but was installed on the motherboard as a separate device. Quite soon the coprocessor began to be built into the central processor, and physically they became one device. Some time later, in the Pentium III, the SSE (Streaming SIMD Extensions) instruction set appeared. The second part of this series, by the way, will be about SIMD instructions. Despite its name, SSE contained not only SIMD commands for floating-point numbers but also their equivalents for scalar computations. In other words, SSE included a set of instructions that duplicated the x87 set but were encoded differently and behaved slightly differently.
However, compilers were in no hurry to generate SSE code for floating-point computations and kept using the old x87 set. After all, nobody guaranteed that a processor had SSE, whereas x87 had been built in for ages. Everything changed with the arrival of the 64-bit processor mode, in which the SSE2 instruction set became mandatory. So if you write a 64-bit program for x86, you have at least the SSE2 instructions available. Compilers take advantage of this and generate SSE instructions for floating-point computations in 64-bit mode. Let me remind you that this has nothing to do with vectorization; we are talking about ordinary scalar computations.
This is exactly what happens in our case: different instruction sets are used in 32-bit and 64-bit mode. But so far this does not explain why the modern SSE code is several times slower than the legacy x87 set. To explain that, we need to understand precisely how the processor executes instructions.
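Once upon a time processors really did just execute instructions. They took an instruction, decoded it, executed it completely, and put the result where they were told. Processors were rather dumb. Modern processors are much smarter and more complex and consist of dozens of different subsystems. Even on a single core, with no parallel processing involved, a processor works on several instructions at once within one clock cycle. The instructions are at different stages: one is still being decoded, another is requesting something from the cache, a third has been passed to the arithmetic unit. Each subsystem of the processor is busy with its own part of the work. This is called a pipeline.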

In the figure, different subsystems are shown in different colours. Although a command needs 4 to 5 clock cycles to execute, thanks to the pipeline one new command is fetched and one command completes on every clock cycle.
The more evenly the pipeline is filled and the fewer subsystems sit idle, the more efficiently it works. Processors even have a subsystem that plans the optimal filling of the pipeline: it reorders instructions, splits one instruction into several, and combines several into one.
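One of the things that keeps the pipeline from being filled is a data dependency: an instruction that needs the result of an earlier instruction cannot start until that result is ready.

In the figure, instruction 2 waits for instruction 1 to complete, and most of the subsystems sit idle in the meantime.
With that in mind, look at the pseudocode of the instruction that converts an integer to a floating-point number: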
Instruction: cvtsi2ss xmm, r32
dst[31:0] := Convert_Int32_To_FP32(b[31:0])
dst[127:32] := a[127:32]
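Only the lower 32 bits are written. Although the description says that a is read from an xmm register, in practice dst and a are the same register, so the upper 96 bits are simply left as they were. That is a data dependency on the previous contents of the register, even though all we want is a single 32-bit float and the upper bits do not matter to us.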
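It turns out that this dependency can be removed. Before the cvtsi2ss conversion, the register has to be cleared with xorps. The xorps + cvtsi2ss pair then behaves like this: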
dst[31:0] := Convert_Int32_To_FP32(b[31:0])
dst[127:32] := 0
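GCC 4.8 does not do this on its own, so the register has to be cleared by hand. With that workaround in place, the results on the 64-bit machine are: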
Scale 2560×1600 RGB image
    320x200 bil      0.02447 s   167.42 Mpx/s
    320x200 bic      0.04624 s    88.58 Mpx/s
    320x200 lzs      0.07142 s    57.35 Mpx/s
    2048x1280 bil    0.08656 s    47.32 Mpx/s
    2048x1280 bic    0.12079 s    33.91 Mpx/s
    2048x1280 lzs    0.16484 s    24.85 Mpx/s
    5478x3424 bil    0.38566 s    10.62 Mpx/s
    5478x3424 bic    0.52408 s     7.82 Mpx/s
    5478x3424 lzs    0.65726 s     6.23 Mpx/s
Result for commit 81fc88e. Obtained on a different machine and not included in the comparison.
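One way the xorps + cvtsi2ss pair could be forced from C is a small helper with GCC-style inline assembly. This is a sketch under assumptions: the helper name i2f and the wrapper sum_bytes are hypothetical, the asm path is compiled only for GCC-compatible compilers on x86-64, and other targets fall back to a plain cast.

```c
/* Converts int to float. On x86-64 with a GCC-compatible compiler the
   destination register is zeroed first, so cvtsi2ss does not depend on
   whatever the register held before. */
static inline float i2f(int v) {
#if defined(__GNUC__) && defined(__x86_64__)
    float x;
    __asm__("xorps %0, %0\n\t"
            "cvtsi2ss %1, %0"
            : "=x"(x)      /* any SSE register */
            : "r"(v));     /* any general-purpose register */
    return x;
#else
    return (float) v;      /* portable fallback: the compiler chooses */
#endif
}

/* Example use: accumulate bytes as floats through the helper. */
float sum_bytes(const unsigned char *data, int n) {
    float total = 0.0f;
    for (int i = 0; i < n; i++)
        total += i2f(data[i]);
    return total;
}
```

Whether this helps depends on the compiler: newer versions may already emit the zeroing instruction themselves, in which case the plain cast is just as good.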
, , . , , , . ImageMagick : 64- , GCC 4.9 40% . , SSE.
Next: Part 2, SIMD