This article describes how to program multicore digital signal processors using OpenMP. It reviews the OpenMP directives and analyzes their meaning and use cases, with the emphasis on digital signal processors. The examples of applying OpenMP directives are chosen to be close to digital signal processing tasks. The implementation runs on the Texas Instruments TMS320C6678 processor, which contains eight DSP cores. Part I of the article covers the basic OpenMP directives. Part II is planned to complete the list of directives and to examine the internal organization of OpenMP and software optimization issues.
This article reflects the lecture and practical material offered to students in the continuing-education program "Texas Instruments C66x multicore digital signal processors", held annually at Ryazan State Radio Engineering University. The article was planned for publication in one of the scientific and technical journals, but because of the specifics of the subject it was decided to accumulate the material for a training manual on multicore DSP processors. In the meantime, the material is being collected and can be made freely available on the Internet. Feedback and suggestions are welcome.
Introduction
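The modern industry of high-performance processor elements is going through a characteristic phase associated with the transition to multicore architectures [1, 2]. This transition is a forced measure rather than a natural step in processor evolution. Further development of semiconductor technology along the path of miniaturization and higher clock frequencies, with a corresponding growth in computing performance, became impossible because of the sharp drop in energy efficiency. Processor manufacturers saw the move to multicore architectures as the logical way out of this situation, since it made it possible to keep increasing the processing power of processors. This phase is typical of processor technology in general and of digital signal processors in particular, with their specific application areas and their special requirements for computational efficiency, internal and external data transfer efficiency, low power consumption, size, and price.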
From the viewpoint of a developer of real-time signal processing systems, the transition to multicore digital signal processor (DSP) architectures comes down to three main problems. The first is mastering the hardware platform: its capabilities, the purpose of particular blocks and their operating modes, as defined by the manufacturer [1]. The second is adapting the processing algorithms and the principles of system organization for implementation on a multicore DSP [3]. The third is developing the digital signal processing software that runs on the multicore DSP. Developing software for a multicore DSP differs fundamentally from developing traditional single-core applications in several respects: distributing particular code fragments among cores, partitioning data, synchronizing cores, exchanging data and service information between cores, synchronizing caches, and so on.
One of the most attractive solutions for porting existing "single-core" software to a multicore platform, or for developing new "parallel" software products, is the Open Multi-Processing (OpenMP) toolkit. OpenMP is a set of compiler directives, functions, and environment variables that can be embedded into standard programming languages, primarily the widely used C language, extending them with the means to organize parallel computing. This is the main advantage of the OpenMP approach: there is no need to invent or learn a new parallel programming language. A single-core program becomes a multicore one by adding simple, clear compiler directives to standard code. All that is required is that the compiler for the processor supports OpenMP. In other words, the processor manufacturer must make sure that its compiler "understands" the directives of the OpenMP standard and translates them into the corresponding assembly code.
The OpenMP standard is developed by an association of several major computer manufacturers and is governed by the OpenMP Architecture Review Board (ARB) [4]. It is universal and is not tied to a particular hardware platform of a particular manufacturer. The ARB publishes the specifications of successive versions of the standard [5]. The OpenMP quick reference guide [6] is also worth consulting.
Recently a large number of works have been devoted to using OpenMP in various applications and on various platforms [7-12]. Of particular interest are books that give a thorough grounding in the use of OpenMP. In the domestic literature these are sources [13-16].
This paper describes the directives, functions, and environment variables of OpenMP. Its distinguishing feature is the focus on digital signal processing tasks. The examples that illustrate the meaning of particular directives are oriented toward implementation on a multicore DSP. The hardware platform chosen is the Texas Instruments TMS320C6678 multicore DSP [17], which contains eight DSP cores. This platform is one of the most advanced and most in demand on the domestic market. In addition, the paper considers the internal organization of OpenMP mechanisms as it relates to real-time signal processing, as well as optimization issues.
Problem statement
Suppose the processing task is to generate an output signal as the sum of two input signals of equal length:
z(n) = x(n) + y(n), n = 0, 1, ..., N-1
A "single-core" implementation of this task in standard C/C++ looks like this:
void vecsum(float *x, float *y, float *z, int N)
{
    for (int i = 0; i < N; i++)
        z[i] = x[i] + y[i];
}
Now suppose we have the eight-core TMS320C6678 processor. The question is how to implement this program so that it uses the capabilities of the multicore architecture.
One solution is to develop eight separate programs and load each onto its own core. This means eight separate projects that must follow common rules: where the arrays are located in memory, how portions of the arrays are divided among the cores, and so on. In addition, an extra program is needed to synchronize the cores: when one core finishes forming its part of the array, that does not mean the whole array is ready. Either the completion of all cores has to be checked manually, or every core has to send a completion flag to one "master" core, which then issues the appropriate message that the output array is ready.
The approach described is correct and efficient, but it is quite hard to implement, and in any case the developer has to rework the existing software substantially. What we want is to move from a single-core to a multicore implementation with minimal changes to the source code. This is exactly the problem OpenMP solves.
Initial OpenMP setup
Before OpenMP can be used in a program, its functionality has to be connected to the project. For the TMS320C6678 this means changing the project configuration file and the platform used, and adding references to the OpenMP components in the project properties. This article does not cover such platform-specific settings. We consider only the more general initial OpenMP setup.
Because OpenMP is an extension of the C language, using its directives and functions in a program requires including the header file that declares them:
#include <ti/omp/omp.h>
Next, the compiler (and the OpenMP runtime) must be told how many cores we are working with. Note that OpenMP works not with cores but with parallel threads. A parallel thread is a logical concept, while a core is physical hardware. In particular, several parallel threads can run on one core. Truly parallel execution of code, however, implies that the number of parallel threads equals the number of cores and that each thread runs on its own core. From here on we assume this is the case. Keep in mind, though, that the number of a parallel thread and the number of the core executing it do not have to coincide!
In the initial OpenMP setup, the number of parallel threads is set with the following OpenMP function:
omp_set_num_threads(8)
This sets the number of cores (threads) to 8.
The parallel directive
So, we need the code of the program above to run on eight cores. With OpenMP, it is enough to add the parallel directive to the code as follows:
#include <ti/omp/omp.h>

void vecsum(float *x, float *y, float *z, int N)
{
    omp_set_num_threads(8);
    #pragma omp parallel
    {
        for (int i = 0; i < N; i++)
            z[i] = x[i] + y[i];
    }
}
All OpenMP directives are written as constructs of the following form:
#pragma omp <directive_name> [clause[(,)][clause[(,)]] ...]
In our case no clauses (options) are used. The parallel directive means that the code fragment that follows, enclosed in braces, forms a parallel region and must be executed not on one core but on all the specified cores.
We obtain a program that runs on one main core, the master core, while the fragment marked by the parallel directive runs on the specified number of cores, which includes both the master core and the slave cores. In the resulting implementation, the same vector-addition loop runs on all eight cores at once.
The typical organization of parallel computing in OpenMP is shown in Figure 1.
Figure 1. The principle of parallel computing in OpenMP
Program execution always starts with a sequential region, executed on one core in the master thread. At the start of a parallel region, marked by the corresponding OpenMP directive, a team of threads is created and the code that follows the directive is executed in parallel. For simplicity, the figure shows only four parallel threads. At the end of the parallel region the threads join, waiting for one another to finish, and the sequential region continues.
So we managed to use eight cores to run our program, but this kind of parallelization is pointless, because all the cores do the same work. Eight cores formed the same output array eight times. The processing time did not decrease. Clearly, the work has to be divided among the cores.
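The redundancy described above can be observed on a host compiler. The following is a minimal sketch, not the article's TI code: it assumes a standard `<omp.h>` (for example GCC with `-fopenmp`) instead of `<ti/omp/omp.h>`, and the function name `count_region_executions` is hypothetical. Each thread that executes the body of a bare parallel region marks its own slot, so the returned count shows how many times the same body ran.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* assumption: host header; the TI toolchain uses <ti/omp/omp.h> */
#endif

#define MAX_THREADS 8

/* Returns how many threads executed the body of the parallel region. */
int count_region_executions(void)
{
    int ran[MAX_THREADS] = {0};
    int count = 0;

#ifdef _OPENMP
    omp_set_num_threads(MAX_THREADS);
#endif
    #pragma omp parallel
    {
        int tid = 0;               /* declared in the region: one copy per thread */
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        ran[tid] = 1;              /* every thread runs this same statement */
    }

    for (int t = 0; t < MAX_THREADS; t++)
        count += ran[t];
    return count;
}
```

When the file is compiled without OpenMP support the pragma is ignored and the function reports a single execution, which makes the difference easy to compare.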
Consider an analogy. Suppose there is a team of eight people. One of them is the leader; the others are his assistants. They receive requests for various kinds of work. The leader accepts and carries out the orders, bringing in the assistants whenever possible. The first job the team takes on is translating a text from English into Russian. The team leader starts the work: he takes the source text, prepares dictionaries, makes a copy of the text for each assistant, and hands the same text to everyone. The translation gets done, and the task is solved correctly. But there is no benefit from having seven assistants. Quite the opposite: if they have to share the same dictionary, computer, or source text, the task may take even longer. OpenMP works the same way in our first example. The work has to be divided: each worker must be told which part of the common text to translate.
In the context of the array-summation problem, the obvious way to divide the work is to distribute the loop iterations among the cores according to core number. It is enough to find out, inside the parallel region, on which core the code is running and to set the range of loop iterations according to that number:
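#include <ti/omp/omp.h>

void vecsum(float *x, float *y, float *z, int N)
{
    omp_set_num_threads(8);
    #pragma omp parallel
    {
        /* Declared inside the region, so each thread has its own copy. */
        int core_num = omp_get_thread_num();
        int a = (N / 8) * core_num;   /* first iteration for this thread */
        int b = a + N / 8;            /* one past the last iteration */
        for (int i = a; i < b; i++)
            z[i] = x[i] + y[i];
    }
}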
The core number is read with the OpenMP function omp_get_thread_num(). Inside a parallel region this function is executed identically on all cores but returns a different result on each. This is what makes it possible to divide the work further inside the parallel region. For simplicity we assume that the number of loop iterations N is a multiple of the number of cores. The core number could also be read in hardware, since each core has a special core-number register (the DNUM register on the TMS320C6678). It can be accessed in various ways, for example with assembler instructions or with functions of the CSL chip support library. However, one can simply use the facility that OpenMP provides. Here again we must note that the core number and the parallel thread number in OpenMP are different concepts. For example, the third parallel thread may well run on, say, the fifth core. Moreover, in the next parallel region, or on re-entering the same parallel region, the third thread may run on, say, the fourth core, and so on.
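The manual split assumes that N is a multiple of the number of cores. As a hedged sketch of how that assumption could be lifted, the helper below (the name `chunk_bounds` and its signature are hypothetical, not part of OpenMP) computes a half-open iteration range for each thread and gives the first `n % nthreads` threads one extra iteration.

```c
/* Computes the half-open range [*begin, *end) of loop iterations owned by
 * thread `tid` out of `nthreads`, for a loop of `n` iterations.
 * The first (n % nthreads) threads receive one extra iteration. */
void chunk_bounds(int n, int nthreads, int tid, int *begin, int *end)
{
    int base = n / nthreads;   /* iterations every thread gets */
    int rem  = n % nthreads;   /* leftover iterations to spread out */

    *begin = tid * base + (tid < rem ? tid : rem);
    *end   = *begin + base + (tid < rem ? 1 : 0);
}
```

Inside the parallel region, a thread would call it with the value returned by omp_get_thread_num() and loop from `begin` to `end`; the ranges of consecutive threads are contiguous and together cover all n iterations.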
We now have a program that runs on eight cores. Each core processes its own part of the input arrays and forms the corresponding part of the output array. Each of our workers translates one eighth of the text, and ideally we get an eightfold speedup.
The for and parallel for directives
We have looked at the simplest directive, parallel, which marks the code fragments that must be executed on several cores in parallel. However, this directive implies that all cores execute the same code and provides no division of work. We had to do that ourselves.
To specify automatically how the work inside a parallel region is divided among the cores, the additional for directive is used. This directive is placed inside a parallel region immediately before a for loop and indicates that the loop iterations must be distributed among the cores. The parallel and for directives can be used separately:
#pragma omp parallel
#pragma omp for
They can also be combined into a single directive to shorten the notation:
#pragma omp parallel for
Applying the parallel for directive to our array example gives the following program code:
#include <ti/omp/omp.h>

void vecsum(float *x, float *y, float *z, int N)
{
    int i;
    omp_set_num_threads(8);
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        z[i] = x[i] + y[i];
}
Comparing this program with the original single-core implementation shows that the differences are minimal. We included the omp.h header file, set the number of parallel threads, and added one line: the parallel for directive.
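Remark 1. Another difference, deliberately passed over in the discussion, is that the declaration of the variable i was moved from the loop to the section where the function's variables are declared, or more precisely, from the parallel region of the code to the sequential one. It is too early to explain this step, but it is a matter of principle and will be explained later, in the section on the private and shared clauses.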
Remark 2. We say that the loop iterations are divided among the cores, but we do not say exactly how. Which particular iterations are executed on which core? OpenMP has facilities for setting the rules by which iterations are distributed among parallel threads, and we will look at them later. Binding a particular core to particular iterations, however, is possible only manually, in the way considered earlier. Admittedly, such binding is usually unnecessary. If the number of loop iterations is not a multiple of the number of cores, the iterations are distributed among the cores so that the load is balanced as evenly as possible.
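To see how iterations end up distributed among threads, one can record which thread executed each iteration. This is a small sketch under stated assumptions: it uses a standard `<omp.h>` on a host compiler, the function name `record_owners` is hypothetical, and the `schedule(static)` clause (a standard OpenMP clause) is spelled out explicitly to request roughly equal blocks of iterations per thread.

```c
#ifdef _OPENMP
#include <omp.h>   /* assumption: host header; the TI toolchain uses <ti/omp/omp.h> */
#endif

/* Fills owner[i] with the number of the thread that executed iteration i. */
void record_owners(int *owner, int n)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        owner[i] = tid;   /* each iteration writes only its own element */
    }
}
```

Printing the `owner` array for an n that is not a multiple of the thread count shows how the runtime balances the leftover iterations. Note that it reports thread numbers, not core numbers.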
The sections and parallel sections directives
Work can be divided among cores either by data or by task. Recall the analogy. If all the workers do the same thing (translate text) but each translates a different piece of the text, this is the first kind of division, division by data. If the workers perform different actions, for example one translates the whole text, another looks up words in the dictionary for him, and a third types the translation, this is the second kind, division by task. The parallel and for directives we have studied divide the work by data. Division by task among cores is done with the sections directive, which, like the for directive, can be used either separately from the parallel directive or together with it to shorten the notation:
#pragma omp parallel
#pragma omp sections
and
#pragma omp parallel sections
As an example, here is a program that uses three processor cores, each running its own algorithm to process the input signal x:
#include <ti/omp/omp.h>

void sect_example(float *x)
{
    omp_set_num_threads(3);
    #pragma omp parallel sections
    {
        #pragma omp section
        Algorithm1(x);
        #pragma omp section
        Algorithm2(x);
        #pragma omp section
        Algorithm3(x);
    }
}
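A self-contained variant of the same pattern may help when experimenting on a host compiler. In the sketch below the three processing steps (`scale2`, `add1`, `square`) and the wrapper `run_sections` are hypothetical stand-ins for Algorithm1 to Algorithm3; each one reads the shared input and writes to its own output array, so the sections do not conflict.

```c
/* Three hypothetical, independent processing steps on the same input. */
static void scale2(const float *x, float *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = 2.0f * x[i];
}

static void add1(const float *x, float *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = x[i] + 1.0f;
}

static void square(const float *x, float *out, int n)
{
    for (int i = 0; i < n; i++) out[i] = x[i] * x[i];
}

/* Runs the three steps as separate sections; each section is executed
 * by exactly one thread of the team. */
void run_sections(const float *x, float *a, float *b, float *c, int n)
{
    #pragma omp parallel sections
    {
        #pragma omp section
        scale2(x, a, n);
        #pragma omp section
        add1(x, b, n);
        #pragma omp section
        square(x, c, n);
    }
}
```

Because every section writes to a different array, no synchronization beyond the implicit one at the end of the construct is needed here.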
The shared, private, and default clauses
Let us choose a new example for discussion: computing the dot product of two vectors. A simple C program that implements this procedure looks like this:
float x[N];
float y[N];

void dotp(void)
{
    int i;
    float sum;
    sum = 0;
    for (i = 0; i < N; i++)
        sum = sum + x[i] * y[i];
}
The result of execution (for 16-element test arrays) was:
[TMS320C66x_0] sum = 331.0
Let us move on to a parallel implementation of this program using the parallel for directive:
float x[N];
float y[N];

void dotp(void)
{
    int i;
    float sum;
    sum = 0;
    /* sum is shared by default here, which is the bug discussed below */
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        sum = sum + x[i] * y[i];
}
Result of execution:
[TMS320C66x_0] sum= 6.0
The program gives the wrong result! Why?
To answer this question, we need to understand how the values of variables are related between the sequential and parallel regions. Let us describe the logic of OpenMP in more detail.
The dotp() function starts executing as a sequential region on core 0 of the processor. The arrays x and y, as well as the variables i and sum, are laid out in processor memory. On reaching the parallel directive, the OpenMP service functions come into play and organize the subsequent parallel operation of the cores. The cores are initialized, synchronized, supplied with data, and started together. What happens to the variables and arrays?
All objects in OpenMP (variables and arrays) can be divided into shared and private. Shared objects are placed in shared memory and are used on an equal footing by all cores inside the parallel region. Shared objects coincide with the objects of the same name in the sequential region. They pass from the sequential region into the parallel one and back unchanged, keeping their values. Access to such objects inside the parallel region is equal for all cores, so conflicts over shared access are possible. In our example the arrays x and y, as well as the variable sum, turned out to be shared by default. It follows that all cores use the same variable sum as the accumulator. As a result, situations arise from time to time in which several cores simultaneously read the same current value of the accumulator, add their partial contribution to it, and write the new value back. The core that writes last overwrites the results of the others. This is why our example gave the wrong result.
The principle of working with shared and private variables is illustrated in Figure 2.
Figure 2. How OpenMP works with shared and private variables
Private objects are copies of the original objects created separately for each core. These copies are created dynamically when the parallel region is initialized. In our example the variable i, as the loop iteration counter, is considered private by default. On reaching the parallel directive, eight copies of this variable (one per parallel thread) are created in processor memory. Private variables are placed in the private memory of each core (they may be placed in local memory or, in general, elsewhere, depending on how they are declared and how memory is configured). Private copies are by default in no way connected with the original objects of the sequential region. By default the values of the original objects are not passed into the parallel region. What the private copies contain at the start of the parallel region is unknown. At the end of the parallel region the values of the private copies are simply lost unless special measures are taken to carry them over into the sequential region, which we will discuss later.
To tell the compiler explicitly which objects are to be treated as private and which as shared, the shared and private clauses are used together with OpenMP directives. The list of objects that are shared or private is given in parentheses after the corresponding clause, separated by commas. In our case the variables i and sum must be private, and the arrays x and y must be shared. We therefore use a construct of the following form:
#pragma omp parallel for private(i, sum) shared(x, y)
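on opening the parallel region. Now each core has its own accumulator, and accumulation proceeds independently on each core. In addition, the accumulators now have to be zeroed, since their initial values are unknown. There is also the question of how to combine the partial results obtained on each core. One option is to use a special shared array of eight cells: each core places its result in it inside the parallel region, and after leaving the parallel region the main core sums the elements of this array to form the final result. We obtain the following program code: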
float x[N];
float y[N];
float z[8];

void dotp(void)
{
    int i, core_num;
    float sum;
    sum = 0;
    #pragma omp parallel private(i, sum, core_num) shared(x, y, z)
    {
        core_num = omp_get_thread_num();
        sum = 0;
        #pragma omp for
        for (i = 0; i < N; i++)
            sum = sum + x[i] * y[i];
        z[core_num] = sum;
    }
    for (i = 0; i < 8; i++)
        sum = sum + z[i];
}
Result of execution:
[TMS320C66x_0] sum= 331.0
The program with the array of partial sums works correctly, although it is somewhat cumbersome. We will discuss how to simplify it below.
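For experiments outside the TI toolchain, the partial-sum technique can be written as a function that returns its result. This is a sketch under stated assumptions: a standard `<omp.h>` on a host compiler, at most eight threads, and a hypothetical function name `dotp_partial`. Variables declared inside the parallel region are private to each thread, which avoids the need for a private clause.

```c
#ifdef _OPENMP
#include <omp.h>   /* assumption: host header; the TI toolchain uses <ti/omp/omp.h> */
#endif

#define MAX_THREADS 8

/* Dot product with one partial sum per thread, combined sequentially. */
float dotp_partial(const float *x, const float *y, int n)
{
    float partial[MAX_THREADS] = {0.0f};   /* shared: one cell per thread */
    float sum = 0.0f;

#ifdef _OPENMP
    omp_set_num_threads(MAX_THREADS);
#endif
    #pragma omp parallel
    {
        int tid = 0;
        float local = 0.0f;                /* private accumulator, zeroed */
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        #pragma omp for
        for (int i = 0; i < n; i++)
            local += x[i] * y[i];
        partial[tid] = local;              /* each thread writes its own cell */
    }

    for (int t = 0; t < MAX_THREADS; t++)  /* sequential region: combine */
        sum += partial[t];
    return sum;
}
```

With floating-point data the result can differ slightly from the sequential loop, because the additions are performed in a different order.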
Interestingly, when arrays are specified as private objects, OpenMP treats them at the initialization of the parallel region in the same way as variables: private copies of these arrays are created dynamically. This can be verified with a simple experiment: declare an array with the private clause and print the value of the pointer to it in the sequential and parallel regions. We will see nine different addresses (with eight cores).
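One can then verify that the values of the array elements are in no way related to one another. Moreover, when the same parallel region is entered again, the addresses of the private copies of the arrays may differ, and the element values are not preserved by default. All this leads to the conclusion that the OpenMP directives that open and close a parallel region can be quite heavyweight and take a noticeable amount of execution time.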
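The address experiment can be sketched as follows. The function name `any_copy_aliases_original` is hypothetical, and the code assumes a host compiler; when OpenMP is enabled, each thread works on its own copy of `buf`, so no copy should share the address of the original array and the original contents should survive the region.

```c
#include <stdint.h>
#include <stdio.h>

#define BUF_LEN 4

/* Returns 1 if some thread saw its copy of buf at the address of the
 * original array, 0 otherwise (0 is expected when OpenMP is enabled). */
int any_copy_aliases_original(void)
{
    int buf[BUF_LEN] = {10, 20, 30, 40};
    uintptr_t original = (uintptr_t)buf;
    int aliases = 0;

    #pragma omp parallel private(buf)
    {
        buf[0] = -1;                       /* goes to the thread's own copy */
        if ((uintptr_t)buf == original)
            aliases = 1;                   /* not reached with private copies */
    }

    printf("aliases = %d, original buf[0] = %d\n", aliases, buf[0]);
    return aliases;
}
```

If the file is compiled without OpenMP support, the pragma is ignored, the write lands in the original array, and the function returns 1, which makes the effect of the private clause visible by comparison.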
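If the type of an object (shared or private) is not stated explicitly in the directive that opens a parallel region, OpenMP acts according to certain rules described in [5]. Objects that are not listed explicitly receive a default type. In C and C++, a variable that is declared before the parallel region and referenced inside it is shared by default, while a variable declared inside the region is private.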
The exception is variables used as loop iteration counters, which are private by default. Admittedly, this rule applies only to directives such as for and parallel for, so such variables deserve special attention. In this respect the default clause is useful. It specifies the rule to be applied to objects whose type is not stated (the default type). In OpenMP 4.5 for C/C++ its parameter can be only shared or none. If none is chosen, no default type is assumed, and the type of every object that appears in the parallel region must be stated explicitly.
#pragma omp parallel private(i, sum, core_num) default(shared)
or:
#pragma omp parallel private(i, sum, core_num) shared(x, y, z) default(none)
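As a brief illustration of default(none), the vector sum from the beginning of the article can be written so that the compiler rejects any variable whose sharing type was forgotten. The function name `vecsum_strict` is hypothetical; the loop counter is declared in the for statement, so it is private without being listed.

```c
/* Vector sum with default(none): every variable referenced in the region
 * must be listed explicitly, otherwise compilation fails. */
void vecsum_strict(const float *x, const float *y, float *z, int n)
{
    #pragma omp parallel for default(none) shared(x, y, z, n)
    for (int i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}
```

Removing, say, `n` from the shared list should produce a compile-time error with an OpenMP-enabled compiler, which is the main practical benefit of this clause.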
The reduction clause
In the example of implementing the dot product on eight cores we noted one drawback: combining the partial results of the cores requires substantial changes to the code, which makes it cumbersome and inconvenient. At the same time, the OpenMP concept implies maximum transparency in moving from a single-core to a multicore implementation and back. The program from the previous section can be simplified with the reduction clause. The reduction clause tells the compiler that the results of the cores have to be combined and sets the rule for combining them. It covers a number of the most common situations. The syntax of the clause is:
reduction(identifier : list)
identifier: determines which operation is performed to combine the private results and sets the initial value of the variables that hold the partial results.
list: the names of the variables used to form the partial results of the cores.
All the ways of using the reduction clause currently provided by the OpenMP standard are listed in Table 1.
Table 1. Operation identifiers of the reduction clause and the corresponding initial values
Identifier    Initial value
+             0
*             1
-             0
&             ~0
|             0
^             0
&&            1
||            0
max           smallest value of the type
min           largest value of the type
In the dot-product program we use the reduction clause with the identifier "+" for the variable sum:
float x[N];
float y[N];

void dotp(void)
{
    int i;
    float sum = 0;   /* the original value takes part in the final combination */
    #pragma omp parallel for private(i) shared(x, y) reduction(+:sum)
    for (i = 0; i < N; i++)
        sum += x[i] * y[i];
}
Result of execution:
[TMS320C66x_0] sum= 331.0
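The other identifiers in the table are used in the same way. As a hedged sketch, the function below finds the peak magnitude of a signal with the max identifier; the name `peak_abs` is hypothetical, and the max and min identifiers for C/C++ assume a compiler that supports OpenMP 3.1 or later.

```c
/* Peak magnitude of a signal using a max reduction: each thread tracks
 * its own maximum, and the runtime combines the per-thread maxima. */
float peak_abs(const float *x, int n)
{
    float peak = 0.0f;

    #pragma omp parallel for reduction(max:peak)
    for (int i = 0; i < n; i++) {
        float a = (x[i] < 0.0f) ? -x[i] : x[i];
        if (a > peak)
            peak = a;
    }
    return peak;
}
```

The initial value 0.0f is safe here because magnitudes are never negative; for signed data the starting value would have to be chosen more carefully.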
The dot-product program with the reduction clause gives the correct result, looks very compact, and differs only minimally from the original "sequential" code!
Synchronization in OpenMP
One of the main problems that arise on multicore processors is the synchronization of the cores. When several cores solve one common problem at the same time, they generally have to coordinate their actions. If one core starts some operation earlier than another, the result of the common work may be wrong. We partly ran into this problem already when all the cores tried to work with a single shared variable; the conflict led to a wrong result.
In the general case, core synchronization means that at a certain point in the program all the cores, or the required subset of them, stop, report to the others that they have reached this point (the synchronization point), and do not continue until all the other cores have reached it. Having finished one parallel fragment, the cores wait for one another and move on to the next fragment with their work coordinated.
It is important to note that synchronization of cores (or parallel threads) implies synchronization not only of the executable code but also of the data. There is also cache synchronization: writing back to main memory the data modified in the cache. This is a very important point. In the OpenMP concept the cores work mainly with shared memory, fragments of which are cached in the local memory of each core. As a result, a shared variable modified by one core may be read incorrectly by another core because the cache of the first core and the shared (main) memory are out of sync.
OpenMP has two kinds of synchronization: implicit and explicit. Implicit synchronization takes place automatically at the end of parallel regions and at the end of a number of directives that can be used inside parallel regions, including omp for, omp sections, and so on. Cache synchronization is performed automatically in these cases as well.
If the algorithm requires the cores to be synchronized at points inside a parallel region where automatic synchronization is not provided, the developer can use explicit synchronization: a special directive tells the OpenMP compiler explicitly that synchronization is needed at this point in the program. Let us consider the main directives of this kind.
The barrier directive
The barrier directive is written as follows:
#pragma omp barrier
It explicitly sets a synchronization point for the parallel OpenMP threads inside a parallel region. Here is an example of using the directive:
#include <stdio.h>
#include <ti/omp/omp.h>

#define CORE_NUM 8

float z[CORE_NUM];

void arr_proc(void)
{
    omp_set_num_threads(CORE_NUM);
    int i, core_num;
    float sum;
    #pragma omp parallel private(core_num, i, sum)
    {
        core_num = omp_get_thread_num();
        z[core_num] = core_num;          /* stage 1: generate data */
        #pragma omp barrier
        sum = 0;
        for (i = 0; i < CORE_NUM; i++)   /* stage 2: process data */
            sum = sum + z[i];
        #pragma omp barrier
        z[core_num] = sum;               /* stage 3: write results */
    }
    for (i = 0; i < CORE_NUM; i++)
        printf("z[%d] = %f\n", i, z[i]);
}
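In this program we model the following situation. Suppose signal processing includes a stage that generates data in the array z, a stage that processes the data in z, and a stage that writes the processing results back into z. In our program, at the first stage each core writes its number into the corresponding cell of the array z, which resides in shared memory. Then all cores perform the same processing of the input array: they compute the sum of its elements. Finally, every core writes the result into the cell of z that corresponds to its core number. As a result, all cells of the array should be equal. Without the barrier directives, however, this does not happen. The cells of z end up different and, generally speaking, arbitrary. In the transition from the first stage to the second, cores do not wait for one another and start processing data that is not yet ready. In the transition from the second stage to the third, cores start writing their results into z while other cores may still be reading values from it for processing. Only the presence of both barrier directives guarantees that the program executes correctly and that the same computed result is written to all elements of z. Synchronization of the executable code also implies synchronization of the data, that is, cache synchronization.
The critical directive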
The critical directive is written as follows:
#pragma omp critical [(name)]
It marks a code fragment inside a parallel region that can be executed by only one core at a time. In signal processing, if the processing algorithm implies that a certain code fragment cannot be executed by several cores at once, that fragment can be marked with the critical directive. An example of applying this directive looks like this:
#define CORE_NUM 8
#define N 1000
#define M 80

void crit_ex(void)
{
    int i, j;
    int A[N];
    int Z[N] = {0};
    omp_set_num_threads(CORE_NUM);
    /* j is the inner loop counter, so it must be private as well */
    #pragma omp parallel for private(A, j)
    for (i = 0; i < M; i++)
    {
        poc_A(A, N);
        #pragma omp critical
        for (j = 0; j < N; j++)
            Z[j] = Z[j] + A[j];
    }
}
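In this program, the processing of the array A (the routine poc_A) and the accumulation of the results in the array Z are repeated M times in a loop. In the multicore implementation the iterations of the processing loop are distributed over eight cores. The array A is processed as a private array, that is, independently on each core. These procedures do not depend on one another, so the processing can run in parallel on all cores. In the accumulation step, the results of all the cores are combined in the shared array Z. If no special measures are taken to synchronize the cores, the parallel threads will access one shared resource and introduce errors into each other's work. To prevent errors, we can forbid the parallel threads to execute at this place simultaneously. While the first core that has captured the resource (here, the section of code) completes all its steps, the remaining cores wait at the start of the critical section for the resource to be released. In effect, we fall back to sequential processing inside the parallel region.
Now let us replace the critical section in the code with the following constructs:
#pragma omp critical (Z1add)
for (j = 0; j < N; j++)
    Z1[j] = Z1[j] + A[j];
#pragma omp critical (Z2mult)
for (j = 0; j < N; j++)
    Z2[j] = Z2[j] * A[j];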
Now there are two critical sections. In one, the results of the cores are combined by summation; in the other, by multiplication. Each section can be executed by only one core at a time, but different sections can be executed by different cores simultaneously. If a region name is added to the critical directive, a core is denied access to the code only if another core is working in that particular region. If the regions are not named, a core cannot enter any critical region while another core is working in any of them, even if the regions are unrelated.
The atomic directive
The atomic directive is written as follows:
#pragma omp atomic [read | write | update | capture]
In the previous example, different cores were forbidden to execute code from the same region at the same time. On closer inspection, however, this may seem wasteful. After all, a conflict over a shared resource arises only when different cores access the same memory cell at the same time. Accessing different memory cells from within the same code does not corrupt the result. The atomic directive binds core synchronization to memory elements. It indicates that the memory operation in the following line is atomic, that is, indivisible: once a core starts an operation on some memory cell, access to that cell is closed to all other cores until the first core finishes. The atomic directive takes a clause that specifies the kind of memory operation: read, write, update, or capture. With the atomic directive, the example above looks like this:
#define CORE_NUM 8
#define N 1000
#define M 80

void crit_ex(void)
{
    int i, j;
    int A[N];
    int Z[N] = {0};
    omp_set_num_threads(CORE_NUM);
    /* j is the inner loop counter, so it must be private as well */
    #pragma omp parallel for private(A, j)
    for (i = 0; i < M; i++)
    {
        poc_A(A, N);
        for (j = 0; j < N; j++)
        {
            #pragma omp atomic update
            Z[j] = Z[j] + A[j];
        }
    }
}
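A compact, self-contained case where atomic updates fit naturally is a histogram, in which different iterations may hit the same counter. This is only a sketch: the function name `count_bins`, the bin count, and the assumption of non-negative samples are illustrative choices, not part of the article's program.

```c
#define BINS 4

/* Counts samples per bin. Several threads may increment the same cell,
 * so each increment is protected individually with an atomic update. */
void count_bins(const int *samples, int n, int *hist)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int b = samples[i] % BINS;   /* assumes non-negative samples */
        #pragma omp atomic update
        hist[b] += 1;
    }
}
```

The caller is expected to zero `hist` beforehand. Only the single increment is serialized; computing the bin index still proceeds in parallel on all threads.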
In theory, using the atomic directive should shorten the processing time considerably, because instead of fully sequential execution of the loop we get sequential execution of individual memory accesses only, and only when the indices of the requested array elements coincide on different cores. In practice, the effectiveness of this idea depends on how it is implemented. If, for example, core synchronization with the atomic directive boils down to reading flags in shared memory on every loop iteration, the loop execution time may increase significantly. In other words, with the critical directive the loop takes M x T1 processor cycles, where M is the number of cores and T1 is the loop time on one core. With the atomic directive the loop takes T2 processor cycles, but because a loop with the atomic directive contains additional synchronization code, T2 may turn out to be more than M times T1.
This article has described the main components of OpenMP, an extension of high-level programming languages (C/C++) used for compiler-driven automatic parallelization of software for multicore processors. What distinguishes the article is its orientation toward digital signal processing systems and its illustration with sample programs run on the eight-core Texas Instruments TMS320C6678 DSP.
The main advantage of OpenMP is the ease of moving from a single-core to a multicore implementation. All the tasks of core interaction, including data exchange and synchronization, are handled by standard OpenMP functions linked in at compile time. However, convenience of development usually comes at the cost of lower efficiency of the resulting solution. This article does not discuss the overhead of OpenMP; a separate work is planned for that. Nevertheless, we note that the cost of OpenMP directives is quite high and can run to tens of thousands of clock cycles. Parallelization therefore makes sense only at a relatively high level, when the computational load inside the parallel region is large and the cores spend most of the time on their tasks without interacting.
It should also be noted that the OpenMP standard defines only the general ideology. The efficiency of OpenMP depends on how the OpenMP functions are implemented for a particular processor platform. Thus versions 1 and 2 of the OpenMP implementation developed by Texas Instruments for the TMS320C6678 differ considerably: the second version uses a number of hardware mechanisms to speed up core interaction and is far more efficient than the first.
In subsequent work we plan to describe the main mechanisms used to implement OpenMP functions, analyze the costs associated with them, obtain estimates of the execution time of OpenMP directives, and formulate recommendations for using this mechanism more efficiently.
References
1. G. Blake, RG Dreslinski, T. Mudge, «A survey of multicore processors,» Signal Processing Magazine, vol. 26, no. 6, pp. 26-37, Nov. 2009.
2. LJ Karam, I. AlKamal, A. Gatherer, GA Frantz, «Trends in multicore DSP platforms,» Signal Processing Magazine, vol. 26, no. 6, pp. 38-49, 2009.
3. A. Jain, R. Shankar. Software Decomposition for Multicore Architectures, Dept. of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431.
4. OpenMP Architecture Review Board (ARB) website: openmp.org.
5. OpenMP Application Programming Interface. Version 4.5 November 2015. OpenMP Architecture Review Board. P. 368.
6. OpenMP 4.5 API C/C++ Syntax Reference Guide. OpenMP Architecture Review Board. 2015.
7. J. Diaz, C. Muñoz-Caro, A. Niño. A Survey of Parallel Programming Models and Tools in the Multi and Many-Core Era. IEEE Transactions on Parallel and Distributed Systems. 2012. Vol. 23, Is. 8, pp. 1369-1386.
8. A. Cilardo, L. Gallo, A. Mazzeo, N. Mazzocca. Efficient and scalable OpenMP-based system-level design. Design, Automation & Test in Europe Conference & Exhibition (DATE). 2013, pp. 988-991.
9. M. Chavarrías, F. Pescador, M. Garrido, A. Sanchez, C. Sanz. Design of multicore HEVC decoders using actor-based dataflow models and OpenMP. IEEE Transactions on Consumer Electronics. 2016. Vol. 62, Is. 3, pp. 325-333.
10. M. Sever, E. Çavus. Parallelizing LDPC Decoding Using OpenMP on Multicore Digital Signal Processors. 45th International Conference on Parallel Processing Workshops (ICPPW). 2016, pp. 46-51.
11. A. Kharin, S. Vityazev, V. Vityazev, N. Dahnoun. Parallel FFT implementation on TMS320c66x multicore DSP. 6th European Embedded Design in Education and Research Conference (EDERC). 2014, pp. 46-49.
12. D. Wang, M. Ali, "Synthetic Aperture Radar on Low Power Multi-Core Digital Signal Processor," High Performance Extreme Computing (HPEC), IEEE Conference on, pp. 1-6, 2012.
13. . . , . . . - . ., 2007, 138 .
14. . . . . . , 2006, 90 .
15. .. . OpenMP. . 2009 , 78 .
16. .. . OpenMP. .: 2012, 121 .
17. TMS320C6678 Multicore Fixed and Floating-Point Digital Signal Processor, Datasheet, SPRS691E, Texas Instruments, p. 248, 2014.