AMD OpenCL University Course (7)

Translator: 迈克老狼2012, 火龙果软件    Published: 2014-11-11

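Performance Optimization

1. Thread Mapping

Thread mapping describes which part of the data a given thread accesses. In other words, it is the correspondence between thread IDs and the data they touch.

A suitable thread mapping takes full advantage of the hardware and improves program performance; a poor one degrades it.

See the paper "Static Memory Access Pattern Analysis on a Massively Parallel GPU", which describes how an algorithm can make full use of thread mapping. This is the download link I found through Google: http://www.ece.neu.edu/~bjang/patternAnalysis.pdf

With different thread mappings, the same thread may access data at different locations. Here are a few examples of thread mapping:

Consider a simple serial matrix multiplication. This algorithm is well suited to decomposition over the output data: by creating N*M threads we remove the two outer loops, so that each thread performs P multiply-add operations. The question is then whether the thread index space should be M*N or N*M.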
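For reference, the serial algorithm being discussed can be sketched roughly as below. This is a minimal sketch, not the course's own listing: the function name `matmul_serial`, the flat row-major layout, and the dimension convention (A is M x P, B is P x N, C is M x N, with loop indices i1, i2, i3) are assumptions chosen to match the M, N, P and i3 used in the text.

```c
#include <assert.h>
#include <stddef.h>

/* C (M x N) = A (M x P) * B (P x N); all matrices are row-major flat arrays.
 * The two outer loops are the ones a GPU version replaces with N*M threads. */
void matmul_serial(size_t M, size_t N, size_t P,
                   const float *A, const float *B, float *C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i1 = 0; i1 < M; i1++) {           /* row of C    */
        for (size_t i2 = 0; i2 < N; i2++) {       /* column of C */
            float sum = 0.0f;
            for (size_t i3 = 0; i3 < P; i3++)     /* P multiply-adds per element */
                sum += A[i1 * P + i3] * B[i3 * N + i2];
            C[i1 * N + i2] = sum;
        }
    }
}
```

In a kernel, each work-item would keep only the innermost i3 loop and take i1 and i2 from its global IDs.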

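When we use an M*N thread index space, the kernel is as shown in the following figure:

When we use an N*M thread index space, the kernel is as shown in the following figure:

Both mappings produce the same result. Below are the results on NVIDIA GeForce 285 and 8800 GPUs. Mapping 2 (that is, the N*M thread index space) gives higher performance.

The performance difference arises mainly because the two mappings access global memory differently. In a row-major buffer the data is stored row by row, so to obtain coalesced access, consecutive threads in a wave should map to the columns of the matrix (the second dimension). For A*B=C this makes the reads of B and the writes of C coalesced, while the choice of mapping has no effect on A (the order of accesses to A is determined by i3).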
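The coalescing argument can be checked with a little index arithmetic. The sketch below assumes (the kernel figures are not reproduced here) that mapping 1 uses global ID 0 as the row index of C and mapping 2 uses it as the column index; the helper names are hypothetical. Work-items with consecutive global ID 0 are the ones packed into the same wave, so the distance between their addresses decides whether the access is coalesced.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col) in a row-major matrix with ncols columns. */
static size_t rm_index(size_t row, size_t col, size_t ncols)
{
    return row * ncols + col;
}

/* Mapping 1 (M*N index space): id0 selects the row of C, id1 the column. */
size_t c_offset_mapping1(size_t id0, size_t id1, size_t N)
{
    return rm_index(id0, id1, N);
}

/* Mapping 2 (N*M index space): id0 selects the column of C, id1 the row. */
size_t c_offset_mapping2(size_t id0, size_t id1, size_t N)
{
    return rm_index(id1, id0, N);
}

/* Distance, in elements, between the writes of two neighbouring work-items. */
size_t stride_mapping1(size_t N)
{
    assert(N > 0);
    return c_offset_mapping1(1, 0, N) - c_offset_mapping1(0, 0, N);
}

size_t stride_mapping2(size_t N)
{
    assert(N > 0);
    return c_offset_mapping2(1, 0, N) - c_offset_mapping2(0, 0, N);
}
```

For a 1024-column C, mapping 1 gives a stride of 1024 elements between neighbouring work-items, while mapping 2 gives a stride of 1, which is the coalesced pattern.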

The complete source code can be downloaded from: http://code.google.com/p/imagefilter-opencl/downloads/detail?name=amduniCourseCode4.zip&can=2&q=#makechanges. The program implements both mappings for comparison, and the second one is indeed somewhat faster.

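Next, let us look at a matrix transpose example, in which changing the mapping improves the efficiency of global memory access.

The matrix transpose formula is: Out(x,y) = In(y,x)

As the figure above shows, whichever mapping is chosen, one of the two buffers is always accessed in a non-coalesced way (note: when transposing a matrix, an element of the input matrix must first be copied to a temporary location, such as a register, before it can be copied to the output matrix). We can change the thread mapping and use local memory as the intermediate storage, so that both the input and the output matrix are accessed with coalesced global memory accesses.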
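The local-memory idea can be sketched on the host as follows. This is an emulation in plain C rather than the course's kernel: the nested `ly`/`lx` loops stand in for the work-items of one workgroup, the `tile` array stands in for `__local` memory, and `TILE` and `transpose_tiled` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 4   /* one workgroup is TILE x TILE work-items in this sketch */

/* in is rows x cols, out is cols x rows, both row-major. */
void transpose_tiled(const float *in, float *out, size_t rows, size_t cols)
{
    assert(in != NULL && out != NULL && in != out);
    const size_t tiles_y = (rows + TILE - 1) / TILE;
    const size_t tiles_x = (cols + TILE - 1) / TILE;

    for (size_t by = 0; by < tiles_y; by++) {
        for (size_t bx = 0; bx < tiles_x; bx++) {     /* one workgroup */
            float tile[TILE][TILE];                   /* stands in for __local memory */

            /* Phase 1: neighbouring lx read neighbouring input addresses. */
            for (size_t ly = 0; ly < TILE; ly++)
                for (size_t lx = 0; lx < TILE; lx++) {
                    const size_t r = by * TILE + ly, c = bx * TILE + lx;
                    if (r < rows && c < cols)
                        tile[ly][lx] = in[r * cols + c];
                }

            /* A real kernel needs barrier(CLK_LOCAL_MEM_FENCE) here. */

            /* Phase 2: neighbouring lx write neighbouring output addresses;
             * the transposition happens in the tile index swap. */
            for (size_t ly = 0; ly < TILE; ly++)
                for (size_t lx = 0; lx < TILE; lx++) {
                    const size_t r = bx * TILE + ly, c = by * TILE + lx;
                    if (r < cols && c < rows)
                        out[r * rows + c] = tile[lx][ly];
                }
        }
    }
}
```

Both global accesses now walk along a row with the fastest-varying local index, and the strided access is confined to the tile in local memory.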

Below is a performance comparison of the matrix transpose implemented with the two thread mappings on an AMD 5870 GPU:

Complete code: http://code.google.com/p/imagefilter-opencl/downloads/detail?name=amduniCourseCode5.zip&can=2&q=#makechanges

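2. Occupancy

Earlier tutorials introduced the concept of occupancy, which describes how well the resources of a CU are utilized.

In OpenCL a workgroup is mapped to a hardware CU for execution, and the workgroup finishes only after all of its threads have finished. For a given CU the resources (number of registers, local memory size, maximum number of threads, and so on) are fixed, and they limit how many workgroups can be scheduled on the CU at the same time. If the CU has enough resources, several workgroups mapped to the same CU can be scheduled simultaneously, with the waves of one workgroup executing. When all waves of the executing workgroup switch to a waiting state because they are waiting on a resource, a different workgroup can switch from the ready state to executing on the ALUs, which hides memory access latency. This is somewhat similar to the way an operating system schedules processes. I drew a simple diagram for reference:

1. For a fairly long kernel, registers are the main resource bottleneck. Suppose the kernel needs at most 35 registers; then every thread in the workgroup uses 35 registers. If a CU (say, on a 5870) has at most 16384 registers, the CU can hold at most 16384/35=468 threads, so the number of threads (workitems) in a workgroup cannot exceed 468.

2. Consider the converse: a CU has 16384 registers in total and the workgroup size is fixed at 256 threads, so each thread can use up to 64 registers.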
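The two calculations above are plain integer divisions, and a local-memory budget can be treated the same way. The helper names below are hypothetical and the sketch ignores allocation granularity, which real hardware imposes on registers and local memory.

```c
#include <assert.h>

/* Upper bound on resident threads when each thread needs regs_per_thread registers. */
unsigned max_threads_by_registers(unsigned regs_per_cu, unsigned regs_per_thread)
{
    assert(regs_per_thread > 0);
    return regs_per_cu / regs_per_thread;
}

/* Registers available to each thread when the workgroup size is fixed. */
unsigned max_registers_per_thread(unsigned regs_per_cu, unsigned workgroup_size)
{
    assert(workgroup_size > 0);
    return regs_per_cu / workgroup_size;
}

/* Workgroups that fit on one CU when each uses local_mem_per_workgroup bytes. */
unsigned max_workgroups_by_local_mem(unsigned local_mem_per_cu,
                                     unsigned local_mem_per_workgroup)
{
    assert(local_mem_per_workgroup > 0);
    return local_mem_per_cu / local_mem_per_workgroup;
}
```

With the numbers from the text, `max_threads_by_registers(16384, 35)` gives 468 and `max_registers_per_thread(16384, 256)` gives 64.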

The local memory of each CU is also limited: 32K on AMD HD 5XXX GPUs and 32-48K on NVIDIA GPUs (depending on the model). As with registers, if the kernel uses too much local memory, the number of threads in a workgroup is limited as well.

The GPU hardware also imposes a maximum number of threads within a CU: 256 on AMD GPUs and 512 on NVIDIA GPUs.

NVIDIA GPUs limit the number of active threads in each CU: 8 or 16 warps per CU, or 768 or 1024 threads.

AMD GPUs limit the number of waves in each CU; for the 5870 the maximum is 496 waves.

All of these limits stem from contention for finite resources. In NVIDIA CUDA the resource limits can be inspected visually.

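3. Vectorization

Vectorization lets a single thread perform several operations at once. In kernel code we can use vector data types such as float4 to gain speed. Vectorization is more effective on AMD GPUs, because the stream core of an AMD GPU is an (x,y,z,w) vector arithmetic unit.

The figure below compares the performance of float and float4 in a simple vector assignment.

The kernel code is: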
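The original kernel listing is not available in this copy, so here is a host-side sketch of what a scalar and a float4 version of a simple copy might look like. Everything in it is an assumption for illustration: `float4_t` stands in for OpenCL's built-in float4, the `*_item` functions play the role of the kernel body for one work-item, and the outer loops emulate the NDRange.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, w; } float4_t;   /* stand-in for OpenCL float4 */

/* Scalar kernel body: work-item gid copies one float. */
static void copy_float_item(const float *in, float *out, size_t gid)
{
    out[gid] = in[gid];
}

/* Vector kernel body: work-item gid copies four floats in one assignment. */
static void copy_float4_item(const float4_t *in, float4_t *out, size_t gid)
{
    out[gid] = in[gid];
}

/* Emulated launches; each returns the number of work-items it needed. */
size_t copy_scalar(const float *in, float *out, size_t n_floats)
{
    assert(in != NULL && out != NULL);
    for (size_t gid = 0; gid < n_floats; gid++)
        copy_float_item(in, out, gid);
    return n_floats;
}

size_t copy_vec4(const float4_t *in, float4_t *out, size_t n_vec4)
{
    assert(in != NULL && out != NULL);
    for (size_t gid = 0; gid < n_vec4; gid++)
        copy_float4_item(in, out, gid);
    return n_vec4;
}
```

Copying the same eight floats takes eight work-items in the scalar version and two in the float4 version, which is where the extra work per thread comes from.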

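This section covers OpenCL performance optimization of the NBody algorithm.

1. NBody

An NBody system is mainly used to simulate a galaxy through the physical forces acting between particles. Each particle represents a star, and the interactions among many particles produce the appearance of a galaxy.

The figure above shows a particle simulation of a galaxy. Source: THE GALAXY-CLUSTER-SUPERCLUSTER CONNECTION, http://www.casca.ca/ecass/issues/1997-DS/West/west-bil.html

Because every pair of particles attracts each other gravitationally, the complexity of this algorithm is O(N^2). Below we discuss how to optimize the algorithm itself and how to optimize it on top of OpenCL.

2. The NBody Algorithm

Assuming two particles interact through universal gravitation, the force F between any two particles is given by the following formula: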
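The formula itself was an image and is missing from this copy. Assuming it is the standard Newtonian law of gravitation, the force exerted on particle i by particle j can be written as follows, where G is the gravitational constant, m_i and m_j are the masses, and r_ij is the displacement from i to j (the symbol names are my own choice):

```latex
\mathbf{F}_{ij} = G \, \frac{m_i m_j}{\lVert \mathbf{r}_{ij} \rVert^{2}}
                  \cdot \frac{\mathbf{r}_{ij}}{\lVert \mathbf{r}_{ij} \rVert},
\qquad \mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i
```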

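The most direct method is to compute, for each particle, the sum of the forces exerted on it by all the other particles. This is usually called the N-Pair NBody simulation.

The gravitational force between particles falls off with the square of the distance between them, so for a given particle (assuming all particles have the same mass) the force from distant particles is sometimes very small and can even be neglected. Barnes Hut partitions 3D space with an octree: forces are computed directly only between particles in neighbouring cells, while the particles in a distant cell are treated as a single body when computing the force.

3. Optimizing NBody with OpenCL

In this section we do not consider optimizing the algorithm itself; we only use OpenCL mechanisms to optimize the N-Pair NBody simulation.

The simplest implementation sums the forces acting on each particle. The code is as follows: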

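for (i = 0; i < n; i++)
{
    ax = ay = az = 0;
    // Loop over all particles "j"
    for (j = 0; j < n; j++) {

        // Calculate displacement
        dx = x[j] - x[i];
        dy = y[j] - y[i];
        dz = z[j] - z[i];

        // eps is a small softening term that keeps the denominator
        // nonzero when dx = dy = dz = 0 (for example when j == i)
        invr = 1.0 / sqrt(dx*dx + dy*dy + dz*dz + eps);
        // ... (the rest of the loop body is truncated in the source)
    }
}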

For each particle we compute the net force acting on it, then find the particle's new position after a time step delta under that force, and use the new position as the input to the next step.
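A complete version of one such step might look like the sketch below. It is an illustration under stated assumptions, not the course's code: the gravitational constant is folded into the units, the position update is x + v*dt + 0.5*a*dt^2, and `body_t`, `vec3_t`, `nbody_step` and `inv_sqrt` are hypothetical names. `inv_sqrt` stands in for OpenCL's built-in rsqrt() so that the sketch needs no math library.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, m; } body_t;   /* position and mass */
typedef struct { float x, y, z; } vec3_t;

/* Stand-in for OpenCL's rsqrt(): 1/sqrt(a) via Heron's iteration.
 * Assumes a > 0 and of moderate magnitude. */
static float inv_sqrt(float a)
{
    float s = a > 1.0f ? a : 1.0f;             /* start at or above sqrt(a) */
    for (int k = 0; k < 40; k++)
        s = 0.5f * (s + a / s);
    return 1.0f / s;
}

void nbody_step(const body_t *pos, const vec3_t *vel,
                body_t *new_pos, vec3_t *new_vel,
                size_t n, float dt, float eps_sqr)
{
    assert(eps_sqr > 0.0f);                    /* softening keeps j == i finite */
    for (size_t i = 0; i < n; i++) {
        float ax = 0.0f, ay = 0.0f, az = 0.0f;
        for (size_t j = 0; j < n; j++) {       /* j == i contributes zero */
            const float dx = pos[j].x - pos[i].x;
            const float dy = pos[j].y - pos[i].y;
            const float dz = pos[j].z - pos[i].z;
            const float invr = inv_sqrt(dx * dx + dy * dy + dz * dz + eps_sqr);
            const float s = pos[j].m * invr * invr * invr;   /* m_j / r^3 */
            ax += s * dx;
            ay += s * dy;
            az += s * dz;
        }
        /* Advance position and velocity by one time step. */
        new_pos[i].x = pos[i].x + vel[i].x * dt + 0.5f * ax * dt * dt;
        new_pos[i].y = pos[i].y + vel[i].y * dt + 0.5f * ay * dt * dt;
        new_pos[i].z = pos[i].z + vel[i].z * dt + 0.5f * az * dt * dt;
        new_pos[i].m = pos[i].m;
        new_vel[i].x = vel[i].x + ax * dt;
        new_vel[i].y = vel[i].y + ay * dt;
        new_vel[i].z = vel[i].z + az * dt;
    }
}
```

The output arrays are separate from the inputs so that every particle sees the positions from the start of the step, which is also why the kernels take both pos and newPosition.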

The unoptimized OpenCL kernel code is as follows:

    __global float4* pos,
    __global float4* vel,
    int numBodies,
    float deltaTime,
    float epsSqr,
    __local float4* localPos,
    __global float4* newPosition,
    __global float4* newVelocity)

{
    unsigned int tid = get_local_id(0);
    unsigned int gid = get_global_id(0);
    unsigned int localSize = get_local_size(0);

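In this implementation the positions and velocities of the other particles are read from global memory every time, so the number of memory accesses = N reads * N threads = N^2.

We can optimize this with local memory: once a particle's data has been read in, it can be shared by p*p threads, where p*p is the workgroup size. For each particle, we iterate over tiles of p*p and accumulate the final result.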
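The tiling scheme can be emulated on the host to make the data flow concrete. In this sketch (my own, with hypothetical names) `TILE` plays the role of the workgroup size, `local_pos` stands in for the `__local` buffer, and the loops over `tid` stand in for the work-items of one workgroup; a direct version is included for comparison.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 4   /* workgroup size in this sketch */

typedef struct { float x, y, z, m; } body_t;
typedef struct { float x, y, z; } vec3_t;

/* Stand-in for OpenCL's rsqrt(); assumes a > 0 and of moderate magnitude. */
static float inv_sqrt(float a)
{
    float s = a > 1.0f ? a : 1.0f;
    for (int k = 0; k < 40; k++)
        s = 0.5f * (s + a / s);
    return 1.0f / s;
}

/* Add the acceleration that body `other` causes on body `me`. */
static void accumulate(vec3_t *acc, body_t me, body_t other, float eps_sqr)
{
    const float dx = other.x - me.x, dy = other.y - me.y, dz = other.z - me.z;
    const float invr = inv_sqrt(dx * dx + dy * dy + dz * dz + eps_sqr);
    const float s = other.m * invr * invr * invr;
    acc->x += s * dx;
    acc->y += s * dy;
    acc->z += s * dz;
}

/* Direct version: every work-item reads all n bodies from global memory. */
void accel_direct(const body_t *pos, vec3_t *acc, size_t n, float eps_sqr)
{
    for (size_t i = 0; i < n; i++) {
        vec3_t sum = {0.0f, 0.0f, 0.0f};
        for (size_t j = 0; j < n; j++)
            accumulate(&sum, pos[i], pos[j], eps_sqr);
        acc[i] = sum;
    }
}

/* Tiled version: a workgroup loads TILE bodies once, then all its work-items reuse them. */
void accel_tiled(const body_t *pos, vec3_t *acc, size_t n, float eps_sqr)
{
    assert(n % TILE == 0);
    for (size_t group = 0; group < n / TILE; group++) {
        vec3_t sum[TILE] = {{0.0f, 0.0f, 0.0f}};        /* per-work-item accumulators */
        for (size_t tile = 0; tile < n / TILE; tile++) {
            body_t local_pos[TILE];                     /* stands in for __local memory */
            for (size_t tid = 0; tid < TILE; tid++)     /* one global read per work-item */
                local_pos[tid] = pos[tile * TILE + tid];
            /* A real kernel needs barrier(CLK_LOCAL_MEM_FENCE) here. */
            for (size_t tid = 0; tid < TILE; tid++)
                for (size_t j = 0; j < TILE; j++)
                    accumulate(&sum[tid], pos[group * TILE + tid], local_pos[j], eps_sqr);
            /* A second barrier is needed before the tile is overwritten. */
        }
        for (size_t tid = 0; tid < TILE; tid++)
            acc[group * TILE + tid] = sum[tid];
    }
}
```

Per workgroup, each body is now fetched from global memory once instead of once per work-item, so global reads drop by a factor equal to the workgroup size.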

The optimized kernel code is as follows:

    int numBodies,
    float deltaTime,
    float epsSqr,
    __local float4* localPos,
    __global float4* newPosition,
    __global float4* newVelocity)

{
    unsigned int tid = get_local_id(0);

Below are the performance test results on the AMD and NVIDIA platforms:

AMD GPU = 5870 Stream SDK 2.2

Nvidia GPU = GTX 480 with CUDA 3.1

ÁíÍ⣬ÔÚ³ÌÐòÖУ¬Ò²³¢ÊÔÁËÑ­»·Õ¹¿ª£¬Í¨¹ýÕ¹¿ªÄÚÑ­»·£¬´Ó¶ø¼õÉÙGPUÖ´ÐзÖÖ§Ö¸ÁÎҵIJâÊÔÖУ¬Ê¹ÓÃÕ¹¿ªËĴΣ¬µÃµ½µÄFPS±Èûչ¿ªÇ°¿ìÁË30%¡££¨AMD 5670ÏÔ¿¨£©¡£¾ßÌåʵÏÖ¿ÉÒÔ¿´kernel´úÂëÖеÄ__kernel void nbody_sim_unrollº¯Êý¡£ÔÚAMDƽ̨ÉÏ£¬Ê¹ÓÃÏòÁ¿»¯Ò²¿ÉÒÔÌá¸ß10%×óÓÒµÄÐÔÄÜ¡£

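1. OpenCL Extensions

An OpenCL extension is a feature that a device supports but that is not part of the OpenCL standard. Through extensions, vendors can add new capabilities to a device without having to worry about compatibility. Today every vendor uses its own extensions to some degree in its OpenCL implementation.

There are three types of extension:

1. Extensions approved by the Khronos OpenCL working group. These must pass conformance tests and may be added to a later version of the OpenCL specification. Their names all start with cl_khr.

2. External extensions, whose names start with cl_ext. These are proposed by two or more vendors and do not require conformance testing. An example is the cl_ext_device_fission extension.

3. A single vendor's own extensions, such as AMD's printf extension.

2. Using Extensions

To use an extension in OpenCL we must enable it; by default, all extensions are disabled.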

#pragma OPENCL EXTENSION extension_name : enable

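In OpenCL, whether a function is available is known only at run time, so it is the programmer's responsibility to determine whether an extension is available; we must query its status before using it. The following code queries whether an extension is available: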
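One way to do the check is sketched below. On the host, clGetDeviceInfo with CL_DEVICE_EXTENSIONS returns a space-separated list of extension names; the helper `has_extension` (a hypothetical name, not part of the OpenCL API) then looks for a whole-name match in that string. The OpenCL call itself is left out so that the sketch compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a complete, space-delimited entry in
 * `ext_list`, the string obtained from
 * clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). */
bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);
    const size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        const bool starts = (p == ext_list) || (p[-1] == ' ');
        const bool ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return true;       /* whole-name match, not just a prefix */
        p++;                   /* keep searching after a partial match */
    }
    return false;
}
```

Matching whole names matters because one extension name can be a prefix of another, so a plain substring search could report an extension that the device does not actually list.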

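3. Some Khronos-Approved Extensions

Atomic operations, which guarantee that a function is performed atomically only within a single device, for example:

- cl_khr_{global | local}_int32_base_atomics

- cl_khr_{global | local}_int32_extended_atomics

- cl_khr_int64_base_atomics

- cl_khr_int64_extended_atomics

Note: atomic operations guarantee a correct result, but they do not guarantee the order of the operations.

Double- and half-precision extensions: cl_khr_fp64 provides the double-precision support that some physics simulations and scientific computing need. AMD's 64-bit extension is cl_amd_fp64, which partially supports cl_khr_fp64; NVIDIA supports the cl_khr_fp64 extension. The half-precision extension cl_khr_fp16, however, is not yet supported by either vendor.

In OpenCL, byte addressable store (cl_khr_byte_addressable_store) is also an extension: writes narrower than 32 bits, such as char, require it. For example, in the AMD histogram sample each bin is stored in one byte.

3D Image Write Extensions: the OpenCL standard supports reading and writing 2D images, whereas writing 3D images requires an extension.

The cl_khr_gl_sharing extension allows an application to use OpenGL buffers, textures, and so on.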

4. AMD Extensions

The cl_ext_device_fission extension splits a device into several sub-devices, each with its own queue. It is mainly used on multi-core CPUs and the Cell Broadband Engine, and was proposed jointly by AMD, Apple, Intel, and IBM.

Possible uses of device fission include:

1. Reserving part of the device for high-priority, low-latency tasks.

2. Controlling the assignment of work to individual compute units.

3. Subdividing compute devices along a shared hardware feature such as a cache.

Each sub-device has its own queue; in the figure below, for example, we send different tasks to two sub-devices. Note that to split a device into sub-devices we first need to understand its architecture, and then split it according to the tasks and the device architecture.

The GPU printf extension is mainly used to debug kernel code. The cl_amd_media_ops extension is mainly used for some multimedia operations. The AMD device query extension is mainly used for queries and event handling.

5. NVIDIA Extensions

1. Compiler Options

2. Interoperability Extensions

3. Device Query Extension

6. Cell Broadband Engine Extensions

The Cell processor is not widely used, so it is not covered in detail here; those who use it can consult the relevant manuals.

   