OpenCL 2.0ã®æ°æ©èœã®äžã«ã¯ãããã€ãã®äŸ¿å©ãªæ°ããçµã¿èŸŒã¿é¢æ°ãããããã¯ãŒã¯ã°ã«ãŒãé¢æ°ãç»å ŽããŸããã ãããã®çµã¿èŸŒã¿é¢æ°ã¯ãã¯ãŒã¯ã°ã«ãŒãã¬ãã«ã§åäœããåºã䜿çšãããŠãã䞊åããªããã£ããæäŸããŸãã ãã®èšäºã§ã¯ãã¯ãŒã¯ã°ã«ãŒãã®æ©èœã«ã€ããŠç°¡åã«èª¬æããOpenCL Intel HDã°ã©ãã£ãã¯ã¹ããã€ã¹ã®ããã©ãŒãã³ã¹ããŒã¿ãæäŸããç°çš®ã¯ãŒã¯ã°ã«ãŒãã®äœ¿çšæ¹æ³ã«ã€ããŠèª¬æããŸãã
ã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã®èª¬æ
ã¯ãŒã¯ã°ã«ãŒãã®æ©èœã«ã¯ãã¯ãŒã¯ã°ã«ãŒãã®ã¬ãã«ã®3ã€ã®å€å
žçãªã¢ã«ãŽãªãºã ïŒ
å€ãããŒããã£ã¹ãããªãã¥ãŒã¹ãã¹ãã£ã³ ïŒãšãã¯ãŒã¯ã°ã«ãŒãå
šäœã«å¯ŸããŠå®è¡ãããæäœã®è«ççµæããã§ãã¯ãã2ã€ã®çµã¿èŸŒã¿é¢æ°ãå«ãŸããŸãã
åæžããã³
ã¹ãã£ã³ã¢ã«ãŽãªãºã
ã¯ãå ç®ãæå°ãããã³
æ倧ã®æäœããµããŒãããŠããŸãã
ã¯ãŒã¯ã°ã«ãŒãã®çµã¿èŸŒã¿é¢æ°ã®æ©èœã¯ãååããæããã§ãã
- work_group_broadcastïŒïŒã¯ãéžæããã¯ãŒã¯ã¢ã€ãã ã®å€ããã¹ãŠã®ã¯ãŒã¯ã°ã«ãŒãã¢ã€ãã ã«æ¡åŒµããŸãã
- work_group_reduceïŒïŒã¯ãã¯ãŒã¯ã°ã«ãŒãã®ãã¹ãŠã®èŠçŽ ã®åèšãæå°ããŸãã¯æ倧å€ãèšç®ããçµæã®å€ãã¯ãŒã¯ã°ã«ãŒãã®ãã¹ãŠã®èŠçŽ ã«æ¡åŒµããŸãã
- work_group_scanïŒïŒã¯ã以åã®ãã¹ãŠã®äœæ¥é
ç®ã®åèšãæå°ããŸãã¯æ倧å€ãèšç®ããŸãïŒçŸåšã®äœæ¥é
ç®ãå«ããããšãã§ããŸãïŒã
- work_group_allïŒïŒã¯ãåäœæ¥é
ç®ã«å¯ŸããŠèšç®ãããåãè«çåŒã®è«çANDãè¿ããŸãã
- work_group_anyïŒïŒã¯work_group_allïŒïŒãšåæ§ã«æ©èœããŸãããè«çORã䜿çšããŸãã
ãªã¹ããããŠããçµã¿èŸŒã¿é¢æ°ã«é¢ããéèŠãªå¶éïŒã¹ã«ã©ãŒããŒã¿åã«ã®ã¿é©çšãããŸãïŒããšãã°ãäžè¬çãªåint4ãšfloat4ã¯ãµããŒããããŠããŸããïŒã ãŸããcharãucharãªã©ã®8ãããããŒã¿åã¯ãµããŒããããŠããŸããã
ã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã¯ããã®ååã瀺ããšãããã¯ãŒãã³ã°ã°ã«ãŒãå
šäœã§åžžã«äžŠè¡ããŠæ©èœããŸãã ããããæé»ã®çµæãçããŸããã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã«å¯ŸãããããªãææŠãéå£ãšããŠæ©èœããŸãã
ã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã䜿çšããã«ã¯ã2ã€ã®äž»ãªã¢ã€ãã¢ããããŸãã 第äžã«ãã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã¯äŸ¿å©ã§ãã OpenCL 1.2ã§åãæ©èœãå®è£
ããããã«å¿
èŠãªååã«å€§ããªã³ãŒããèšè¿°ãã代ããã«ã1ã€ã®çµã¿èŸŒã¿é¢æ°ã䜿çšããæ¹ãã¯ããã«ç°¡åã§ãã 第äºã«ãã¯ãŒãã³ã°ã°ã«ãŒãã®æ©èœã¯ãæ©åšã®æé©åã䜿çšãããããçç£æ§ã®ç¹ã§ããå¹æçã§ãã
äŸãšããŠã次ã®ã¿ã¹ã¯ïŒã¢ã«ãŽãªãºã ã®äžéšã§ããå¯èœæ§ããããŸãïŒãèããŠã¿ãŸãããïŒãã倧ããªé
åãšåããµã€ãºã®åŸå±é
åã®æ¥é èŸã®åèšãèšç®ããŸãã ãã®ãããåã¹ã¬ãŒãé
åã®åèŠçŽ ã®ãã¬ãã£ãã¯ã¹ã®åèšãèšç®ããåãããŒã¯ã¢ããã§ã¿ãŒã²ããã¡ã¢ãªé åã«ä¿åããå¿
èŠããããŸãã 次ã®å³ã«ããœãŒã¹ãšã¿ãŒã²ããã®ããŒã¿ã¬ã€ã¢ãŠãã瀺ããŸãã
ãã®ã¿ã¹ã¯ã®ã·ã³ãã«ãªOpenCLã³ã¢ã¯æ¬¡ã®ããã«ãªããŸãã
- åé
åïŒå³ã®ç·ïŒã¯1ã€ã®ã¯ãŒãã³ã°ã°ã«ãŒãã«ãã£ãŠåŠçãããŸãã
- åäœæ¥é
ç®ã«ã€ããŠãåã®é
ç®ã®åçŽãªforïŒïŒã«ãŒãã䜿çšããŠã¹ãã£ã³ãå®è¡ããã环ç©ãã¬ãã£ãã¯ã¹å€ãè¿œå ãããçµæãå®å
ã«ä¿åãããŸãã
- ã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºãå
¥åé
åãããå°ããå ŽåããœãŒã¹ã€ã³ããã¯ã¹ãšãšã³ãã€ã³ããã¯ã¹ã¯ã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºã ãã·ãããããéçŽãã¬ãã£ãã¯ã¹ãæŽæ°ããããã®ããã»ã¹ããœãŒã¹è¡ã®æåŸãŸã§ç¹°ãè¿ãããŸãã
察å¿ããã³ãŒãã以äžã«ç€ºããŸãã
ã³ãŒã__kernel void Calc_wg_offsets_naive( __global const uint* gHistArray, __global uint* gPrefixsumArray, uint bin_size ) { uint lid = get_local_id(0); uint binId = get_group_id(0);
ãã®ãããªåå§çãªã¢ãããŒãã¯ãã»ãšãã©ã®å Žåéåžžã«å¹æçã§ã¯ãããŸããïŒéåžžã«å°ããªã¯ãŒã¯ã°ã«ãŒããé€ãïŒã æããã«ãå
éšã®forïŒïŒã«ãŒãã¯åé·ãªããŒããšè¿œå ã®æäœãå€ãããŸãã ãã®æé ã¯æããã«æé©åã§ããŸãã ããã«ãã¯ãŒãã³ã°ã°ã«ãŒãã®ãµã€ãºãå¢å ãããšãåé·æ§ãå¢å ããŸãã ããŒããŠã§ã¢ãªãœãŒã¹ãããå¹ççã«äœ¿çšããã«ã¯ãIntel HD Graphicsã«ã¯Blellochãªã©ã®ããå¹ççãªã¢ã«ãŽãªãºã ãå¿
èŠã§ãã 詳现ã«ã€ããŠã¯æ€èšããŸãããããã¯ãå€å
žçãª
GPU Gemsã®èšäºã§é¡èã«èª¬æãããŠããŸãã
䞊åã¹ãã£ã³ã䜿çšããOpenCL 1.2ã³ãŒãã¯æ¬¡ã®ããã«ãªããŸãã
ã³ãŒã #define WARP_SHIFT 4 #define GRP_SHIFT 8 #define BANK_OFFSET(n) ((n) >> WARP_SHIFT + (n) >> GRP_SHIFT) __kernel void Calc_wg_offsets_Blelloch(__global const uint* gHistArray, __global uint* gPrefixsumArray, uint bin_size ,__local uint* temp ) { int lid = get_local_id(0); uint binId = get_group_id(0); int n = get_local_size(0) * 2; uint group_offset = binId * bin_size; uint maxval = 0; do {
ååãšããŠããã®ãããªã³ãŒãã¯ããå¹ççã«æ©èœããããŒããŠã§ã¢ãªãœãŒã¹ã«ããã»ã©é«ãè² è·ããããŸããããããã€ãã®æ³šæäºé
ããããŸãã
ãã®ã³ãŒãã«ã¯ãããŒã«ã«ã¡ã¢ãªãšã°ããŒãã«ã¡ã¢ãªéã§ããŒã¿ã移åããããã®ãªãŒããŒããããšãããã€ãã®çŠæ¢äºé
ããããŸãã æ¬åœã«é«ãå¹çãéæããã«ã¯ãã¢ã«ãŽãªãºã ã«ååã«å€§ããªã¯ãŒã¯ã°ã«ãŒããµã€ãºãå¿
èŠã§ãã å°ããªã¯ãŒã¯ã°ã«ãŒãïŒ<16ïŒã§ã¯ãçç£æ§ãåçŽãªãµã€ã¯ã«ã®çç£æ§ãããé«ããªãå¯èœæ§ã¯äœãã§ãã
ããã«ãã³ãŒãã®è€éããšãå
±æããŒã«ã«ã¡ã¢ãªïŒ
BANK_OFFSETãã¯ããªã©ïŒã®ç«¶åãæé€ããããã«èšèšãããè¿œå ã®ããžãã¯ã«æ³šæããŠãã ããã
ã¯ãŒãã³ã°ã°ã«ãŒãã®äœ¿çšã¯ãèšåããããã¹ãŠã®åé¡ãåé¿ããŸãã æé©åãããOpenCLã³ãŒãã®å¯Ÿå¿ããããŒãžã§ã³ã以äžã«ç€ºããŸãã
ã³ãŒã __kernel void Calc_wg_offsets_wgf( __global const uint* gHistArray, __global uint* gPrefixsumArray, uint bin_size ) { uint lid = get_local_id(0); uint binId = get_group_id(0); uint group_offset = binId * bin_size; uint maxval = 0; do { uint binValue = gHistArray[group_offset + lid]; uint prefix_sum = work_group_scan_exclusive_add( binValue ); gPrefixsumArray[group_offset + lid] = prefix_sum + maxval; maxval += work_group_broadcast( prefix_sum + binValue, get_local_size(0)-1 ); group_offset += get_local_size(0); } while(group_offset < (binId+1) * bin_size); }
äž¡æ¹ã®æé©åãããã¢ã«ãŽãªãºã ã®ããã©ãŒãã³ã¹çµæã¯ãååãªéã®å
¥åããŒã¿ã§æž¬å®ãããŸããïŒåã¯ãŒãã³ã°ã°ã«ãŒãã¯ãããŒã«ã«ãµã€ãºã«å¿ããŠãå€éšãµã€ã¯ã«ã®8192 ... 2048åã®ç¹°ãè¿ãã«å¯Ÿå¿ãã65 536èŠçŽ ãã¹ãã£ã³ããŸãïŒã
äºæ³ã©ãããããŒã«ã«ãµã€ãºã倧ãããªããšåçŽãªã«ãŒãã®åäœãéåžžã«é
ããªããæé©åãããäž¡æ¹ã®ãªãã·ã§ã³ã®ããã©ãŒãã³ã¹ãåäžããŸãã
ç¹å®ã®ã¢ã«ãŽãªãºã ã«å¯ŸããŠã¯ãŒãã³ã°ã°ã«ãŒãã®æé©ãªãµã€ãºãèšå®ããå Žåãã³ã¢ã®æ¯èŒã¯æ¬¡ã®ããã«ãªããŸãã
work_group_scan_exclusive_addïŒïŒã䜿çšãããšããããããµã€ãºã®ã¯ãŒã¯ã°ã«ãŒãã®ããã©ãŒãã³ã¹ã倧å¹
ã«åäžãããšåæã«ãã³ãŒããç°¡çŽ åãããããšã«æ³šæããŠãã ããã
ç°çš®OpenCL 2.0ã¯ãŒã¯ã°ã«ãŒã
OpenCLå®è¡ã¢ãã«ã«ã¯ãNDRangeã®åã
ã®ã¯ãŒã¯ã¢ã€ãã ã®ã°ã«ãŒãã§ããã¯ãŒã¯ã°ã«ãŒãã®æŠå¿µãå«ãŸããŠããŸãã ã¢ããªã±ãŒã·ã§ã³ãOpenCL 1.xã䜿çšããå ŽåãNDRangeã®ãµã€ãºã¯å®å
šã«ïŒãã¬ãŒã¹ãªãã§ïŒã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºã§é€ç®ããå¿
èŠããããŸãã
clEnqueueNDRangeKernelåŒã³åºãã«ãå®å
šã«åå²ãããŠããªã
global_sizeãã©ã¡ãŒã¿ãŒãš
local_sizeãã©ã¡ãŒã¿ãŒãå«ãŸããŠããå ŽåãåŒã³åºãã¯ãšã©ãŒã³ãŒã
CL_INVALID_WORK_GROUP_SIZEãè¿ããŸãã
clEnqueueNDRangeKernelåŒã³åºãã
local_sizeãã©ã¡ãŒã¿ãŒã«NULLå€ãæå®ããå®è¡å¯èœã¢ãžã¥ãŒã«ãã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºãéžæã§ããå Žåãå®è¡å¯èœã¢ãžã¥ãŒã«ã¯ãã°ããŒãã«NDRangeãµã€ãºãå®å
šã«åå²ã§ãããµã€ãºãéžæããå¿
èŠããããŸãã
NDRangeã®ãµã€ãºãå®å
šã«åå²ãããããã«ãã¯ãŒãã³ã°ã°ã«ãŒãã®ãã®ãããªãµã€ãºãéžæããå¿
èŠæ§ã¯ãéçºè
ã«ãšã£ãŠå°é£ãåŒãèµ·ããå¯èœæ§ããããŸãã åçŽãª3x3ç»åãŒããã¢ã«ãŽãªãºã ãæ€èšããŠãã ããã ãã®ã¢ã«ãŽãªãºã ã§ã¯ãååºåãã¯ã»ã«ã¯ãé£æ¥ãã3x3é åã®å
¥åãã¯ã»ã«ã®å€ã®å¹³åå€ãšããŠèšç®ãããŸãã ç»åãã¬ãŒã ã«ããåºåãã¯ã»ã«ãåŠçããå Žåããããã®ãã¯ã»ã«ã¯å
¥åç»åã®å¢çã®å€åŽã®ãã¯ã»ã«ã«äŸåãããããåé¡ãçºçããŸãã
äžéšã®ã¢ããªã±ãŒã·ã§ã³ã§ã¯ããã¬ãŒã ã®å
¥åå€ã¯éèŠã§ã¯ãªããã¹ãããããããšãã§ããŸãã ãã®å ŽåãNDRangeã®ãµã€ãºã¯ãåºåã€ã¡ãŒãžã®ãµã€ãºãããã¬ãŒã ã®é åãåŒãããµã€ãºãšåãã§ãã å€ãã®å Žåãå®å
šã«åé¢ããã®ãé£ããNDRangeãµã€ãºã«ãªããŸãã ããšãã°ã3x3ãã£ã«ã¿ãŒã1920x1080ç»åã«é©çšããã«ã¯ãäž¡åŽã«1ãã¯ã»ã«ã®åãã®ãã¬ãŒã ãå¿
èŠã§ãã ãããè¡ãæãç°¡åãªæ¹æ³ã¯ã1918x1078ã³ã¢ã䜿çšããããšã§ãã ãããã1918幎ã1078幎ããæé©ãªãµã€ãºã®ã¯ãŒãã³ã°ã°ã«ãŒããæäŸããå€ã«å®å
šã«åå²ãããŠããŸããã
OpenCL 2.0ã«ã¯ãåã®ã»ã¯ã·ã§ã³ã§èª¬æããåé¡ãä¿®æ£ããæ°ããæ©èœããããŸãã ããããç°çš®ã¯ãŒã¯ã°ã«ãŒãã«ã€ããŠèª¬æããŠããŸããOpenCL2.0å®è¡å¯èœã¢ãžã¥ãŒã«ã¯ãNDRangeãä»»æã®æ¬¡å
ã®ç°çš®ãµã€ãºã®ã¯ãŒã¯ã°ã«ãŒãã«åå²ã§ããŸãã éçºè
ãNDRangeãµã€ãºãå®å
šã«åå²ããªãã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºãæå®ãããšãå®è¡å¯èœã¢ãžã¥ãŒã«ã¯NDRangeãåå²ããŠãæå®ããããµã€ãºã®ã¯ãŒã¯ã°ã«ãŒããã§ããã ãå€ãäœæããæ®ãã®ã¯ãŒã¯ã°ã«ãŒãã¯ç°ãªããµã€ãºã«ãªããŸãã
ããã«ãããéçºè
ã
local_sizeãã©ã¡ãŒã¿ãŒã®NULLå€ã
clEnqueueNDRangeKernelã«æž¡ããšãOpenCLã¯ä»»æã®NDRangeãµã€ãºã«å¯ŸããŠä»»æã®ãµã€ãºã®ã¯ãŒã¯ã°ã«ãŒãã䜿çšã§ããŸãã äžè¬ã«ãã¢ããªã±ãŒã·ã§ã³ããžãã¯ãç¹å®ã®ã¯ãŒã¯ã°ã«ãŒããµã€ãºãå¿
èŠãšããªãå Žåã
local_sizeãã©ã¡ãŒã¿ãŒã§NULLå€ã䜿çšããããšã¯ãã«ãŒãã«ãå®è¡ããããã®åªå
ãããæ¹æ³ã®ãŸãŸã§ãã
ã«ãŒãã«ã³ãŒãå
ã§ãçµã¿èŸŒã¿ã®
get_local_sizeïŒïŒé¢æ°ã¯ãåŒã³åºãå
ã®ã¯ãŒã¯ã°ã«ãŒãã®å®éã®ãµã€ãºãè¿ããŸãã ã«ãŒãã«ã
clEnqueueNDRangeKernelã®local_sizeãã©ã¡ãŒã¿ãŒã«æå®ãããæ£ç¢ºãªãµã€ãºãå¿
èŠãšããå Žåã
get_get_enqueued_local_sizeïŒïŒçµã¿èŸŒã¿é¢æ°ã¯ãããã®å€ãè¿ããŸãã
ç°çš®ã¯ãŒã¯ã°ã«ãŒãã®äœ¿çšãæå¹ã«ããã«ã¯ãOpenCL 2.0ã®ãã®æ©èœãšä»ã®æ©èœãå«ã
-cl-std = CL2.0ãã©ã°ã䜿çšããŠã«ãŒãã«ãã³ã³ãã€ã«ããå¿
èŠããããŸãã ãã®ãã©ã°ããªããšãããã€ã¹ãOpenCL 2.0ããµããŒãããŠããå Žåã§ããã³ã³ãã€ã©ã¯OpenCL 1.2ã䜿çšããŸãã ããã«ã
-cl-uniform-work-group-sizeãã©ã°ã䜿çšããŠã
-cl-std = CL2.0ãã©ã°çšã«ã³ã³ãã€ã«ãããã«ãŒãã«ã®ç°çš®ã¯ãŒã¯ã°ã«ãŒããç¡å¹ã«ããããšãã§ããŸãã ããã¯ãOpenCL 2.0ã«å®å
šã«ç§»è¡ãããŸã§ãã¬ã¬ã·ãŒã«ãŒãã«ã³ãŒãã«åœ¹ç«ã€ããšããããŸãã
OpenCL 2.0ã®ç°çš®ã¯ãŒã¯ã°ã«ãŒãæ©èœã«ãããOpenCLã®äœ¿ãããããåäžããäžéšã®ã³ã¢ã®ããã©ãŒãã³ã¹ãåäžããŸãã éçºè
ã¯ãå®å
šã«å
±æãããŠããªãNDRangeãµã€ãºãæäœããããã®ã·ã¹ãã ããã³ã«ãŒãã«ã³ãŒããè¿œå ããªããªããŸããã ãã®æ©èœãå©çšããããã«äœæãããã³ãŒãã¯ãSIMDãšã¡ã¢ãªã¢ã¯ã»ã¹ã®åçåã掻çšã§ããŸãããããã®å©ç¹ã¯ãã¯ãŒã¯ã°ã«ãŒãã«é©åãªãµã€ãºãéžæããããšã«ãã£ãŠæäŸãããŸãã
ã«ãªãã¥ã©ã ã³ãŒãã¯ãäžèšã®3x3ãŒããã¢ã«ãŽãªãºã ãå®è£
ããŠããŸãã ã³ãŒãã®æãèå³æ·±ãéšåã¯ãmain.cppãã¡ã€ã«ã«ãããŸãã
段èœã®åãªãã·ã§ã³ã 5-8ã§ã¯ãNDRangeã®åé
ã®ããããã§get_local_size
ïŒïŒããã³
get_get_enqueued_local_sizeïŒïŒãåŒã³åºããçµæãç»é¢ã«è¡šç€ºãããŸãã ãããã£ãŠãNDRangeãã¯ãŒãã³ã°ã°ã«ãŒãã«åå²ãããæ§åãããããŸãã ãŒããã¢ã«ãŽãªãºã ãå®è£
ããã«ãŒãã«ã¯ãBoxBlur.clã«ä¿åãããŸãã éåžžã«åçŽãªå®è£
ãå«ãŸããŠããŸããããŒãããé©çšããæãå¹æçãªæ¹æ³ã§ã¯ãããŸããã
ãã®ãã¥ãŒããªã¢ã«ããã«ãããŠå®è¡ããã«ã¯ã次ã®èŠä»¶ãæºããPCãå¿
èŠã§ãã
- Broadwellãšããã³ãŒãããŒã ã®Intel®Coreâ¢ããã»ããµã·ãªãŒãºã
- Microsoft Windows * 8ãŸãã¯8.1ã
- Intel®SDK for OpenCLâ¢ã¢ããªã±ãŒã·ã§ã³ããŒãžã§ã³2014 R2以éã
- Microsoft Visual Studio * 2012以éã
ã«ãªãã¥ã©ã ã¯ãäžèšã®ã»ã¯ã·ã§ã³ã§èª¬æããåNDRangeããªã¢ã³ãã®å
¥åãããããããèªã¿åããåºåãããããããæžã蟌ãã³ã³ãœãŒã«ã¢ããªã±ãŒã·ã§ã³ã§ãã ãã®ãã¥ãŒããªã¢ã«ã§ã¯ãããã€ãã®ã³ãã³ãã©ã€ã³ãªãã·ã§ã³ããµããŒãããŠããŸãã-hã-ïŒ ïŒãã«ãããã¹ãã衚瀺ããŠçµäºïŒã-i <å
¥åãã¬ãã£ãã¯ã¹>ïŒå
¥åããããããã®ãã¬ãã£ãã¯ã¹ïŒã-o <åºåãã¬ãã£ãã¯ã¹>ïŒåºåããããããã®ãã¬ãã£ãã¯ã¹ïŒã
æäŸãããå³é¢ã®ãã¬ãŒãã³ã°ããã°ã©ã ãéå§ãããšãçµæã¯æ¬¡ã®ããã«ãªããŸãã
é衚瀺ã®ããã¹ã Input file: input.bmp Output files: output_0.bmp, output_1.bmp, output_2.bmp, output_3.bmp Device: Intel(R) HD Graphics 5500 Vendor: Intel(R) Corporation Compiled with OpenCL 1.2 and using a NULL local size: Work Item get_global_id() get_local_size() get_enqueued_local_size() ------------------------------------------------------------------------- Top left (0,0) (1,239) undefined Top right (637,0) (1,239) undefined Bottom left (0,477) (1,239) undefined Bottom right (637,477) (1,239) undefined Compiled with OpenCL 1.2 and using an even local size: Trying to launch a non-uniform workgroup with a kernel compiled using OpenCL 1.2 failed (as expected.) Compiled with OpenCL 2.0 and using a NULL local size: Work Item get_global_id() get_local_size() get_enqueued_local_size() Top left (0,0) (1,239) (1,239) Top right (637, 0) (1,239) (1,239) Bottom left (0,477) (1,239) (1,239) Bottom right (637,477) (1,239) (1,239) Compiled with OpenCL 2.0 and using an even local size: Work Item get_global_id() get_local_size() get_enqueued_local_size() Top left (0,0) (16,16) (16,16) Top right (637,0) (14,16) (16,16) Bottom left (0,477) (16,14) (16,16) Bottom right (637,477) (14,14) (16,16) Done!
å
¥åç»åã®ãµã€ãºã¯640x480ã§ãããããããããã®å Žåã®NDRangeã®ãµã€ãºã¯638x478ã§ãã äžèšã®çµæã¯ã
local_sizeãã©ã¡ãŒã¿ãŒã®NULLå€ã䜿çšããŠOpenCL 1.2ã«ãŒãã«ãèµ·åãããšãåã¯ãŒã¯ã°ã«ãŒãïŒ
1ã239ïŒã«å¥æ°ãµã€ãºã®äœ¿çšã匷å¶ãããããšã
瀺ããŠããŸãã 2ã®ã¹ãä¹ã§ã¯ãªãã¯ãŒã¯ã°ã«ãŒããµã€ãºã¯ãäžéšã®ã³ã¢ã§éåžžã«é
ãåäœããå ŽåããããŸãã SIMDãã€ãã©ã€ã³ã¯ã¢ã€ãã«ç¶æ
ã§ããå¯èœæ§ããããåæã¡ã¢ãªã¢ã¯ã»ã¹ãæãªãããå¯èœæ§ããããŸãã
æå®ãããã¯ãŒã¯ã°ã«ãŒããµã€ãºïŒ16x16ïŒã§OpenCL 1.2ã«ãŒãã«ãå®è¡ãããšã648ã478ã16ã§å²ãåããªãããããšã©ãŒãã¹ããŒãããŸãã
NULLå€ã®
local_sizeãã©ã¡ãŒã¿ãŒã§OpenCL 2.0ã«ãŒãã«ãèµ·åãããšãOpenCLå®è¡å¯èœãã¡ã€ã«ãNDRangeãä»»æã®ãµã€ãºã®ã¯ãŒã¯ã°ã«ãŒãã«åå²ã§ããŸãã çµæãäžã«ç€ºããŸããå®è¡å¯èœã¢ãžã¥ãŒã«ã¯ãOpenCL 1.2ã«ãŒãã«ã®å Žåãšåãæ¹æ³ã§ãã¯ãŒã¯ã°ã«ãŒãã®åäžãªãµã€ãºãåŒãç¶ã䜿çšããŠããããšãããããŸãã
ç¹å®ã®ã¯ãŒã¯ã°ã«ãŒããµã€ãºïŒ16x16ïŒã§OpenCL 2.0ã«ãŒãã«ãå®è¡ãããšãNDRangeãµã€ãºãç°çš®ã®ã¯ãŒã¯ã°ã«ãŒãã«åå²ãããŸãã å·Šäžã®ã¯ãŒãã³ã°ã°ã«ãŒãã¯16x16ãå³äžã¯14x16ãå·Šäžã¯16x14ãå³äžã¯14x14ã§ãã ã»ãšãã©ã®å Žåãã¯ãŒã¯ã°ã«ãŒãã®ãµã€ãºã¯16x16ã§ããããããã®ã³ã¢ã¯SIMDãã€ãã©ã€ã³ãéåžžã«å¹ççã«äœ¿çšããã¡ã¢ãªã¢ã¯ã»ã¹ã¯éåžžã«é«éã«ãªããŸãã
IDZ Webãµã€ãã®èšäºã®ãã«ããŒãžã§ã³ïŒè±èªã®ãªãªãžãã«èšäºïŒ