

bmerry
Journeyman III

Errors in OpenCL developers' guide

Hopefully someone from AMD techpubs will see this. I was trying to look up the bandwidths of the various memory systems in rev 2.7 of the AMD APP OpenCL Programming Guide and found contradictory figures for the register file and LDS. These are the claims I found for bandwidth per stream processor per cycle:

Register file: 48B (6-11), 12B (6-15)

LDS: 2B (6-10, based on 14x ratio to global), 8B (6-11 and 6-15), 1/6 of reg (6-11)

The only way the numbers make sense to me is if it is 12B for registers (which makes sense for 2 inputs and 1 output) and 2B for LDS (which makes sense for 32x4B banks shared by 64 processors). It would be great if this could be fixed in future versions of the document.
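For anyone who wants to check the reasoning above, here is a minimal sketch of the arithmetic. The operand and bank counts come from the post; the 4-byte (32-bit) operand width and the function names are assumptions made for illustration, not something the guide states.

```python
# Assumption: each register operand is 32 bits (4 bytes) wide.
BYTES_PER_OPERAND = 4


def register_bytes_per_cycle(inputs=2, outputs=1, operand_bytes=BYTES_PER_OPERAND):
    """Register traffic per processor per cycle: every operand read or written."""
    return (inputs + outputs) * operand_bytes


def lds_bytes_per_cycle(banks=32, bank_width_bytes=4, processors=64):
    """LDS traffic per processor per cycle: total bank width shared by all processors."""
    return banks * bank_width_bytes / processors


print(register_bytes_per_cycle())  # 12
print(lds_bytes_per_cycle())       # 2.0
```

With these inputs the register figure is 12B and the LDS figure is 2B, the pair of values the post argues for.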

2 Replies
dipak
Big Boss

Hi,

Thanks for reporting this. Using the figures in the "Device Parameters" section, the peak read bandwidth per processing element for the register file and LDS on Pitcairn XT (which is based on GCN) works out as follows:

Register Peak Read Bandwidth / Processing Element = 15360 / (1 * 1280) = 12 B/cycle

LDS Peak Read Bandwidth / Processing Element = 2560 / (1 * 1280) = 2 B/cycle

But the numbers seem okay for NI or Evergreen.
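The two Pitcairn XT divisions can be reproduced with a short script. This is only a sketch: the totals and the processing-element count are the ones quoted in this reply, the units of the totals are assumed to be bytes per cycle for the whole device, and the meaning of the factor of 1 in the denominator is not stated here, so it is carried through unchanged as a hypothetical `scale` parameter.

```python
# Figures quoted in the reply for Pitcairn XT.
# Assumption: the totals are device-wide bytes per cycle.
REGISTER_PEAK_READ = 15360
LDS_PEAK_READ = 2560
PROCESSING_ELEMENTS = 1280


def per_element(total, elements, scale=1):
    """Divide a device-wide figure by (scale * elements), as in the reply."""
    return total / (scale * elements)


reg = per_element(REGISTER_PEAK_READ, PROCESSING_ELEMENTS)
lds = per_element(LDS_PEAK_READ, PROCESSING_ELEMENTS)

print(reg)        # 12.0
print(lds)        # 2.0
print(reg / lds)  # 6.0
```

The resulting ratio of 6 lines up with the "1/6 of reg" figure for LDS listed in the original post, which suggests that the 12B and 2B values are the mutually consistent pair.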

So the numbers do indeed look confusing for GCN (at least the term "peak read bandwidth/stream core"). I've asked for clarification and forwarded a request to update the corresponding sections of the guide if required. I'll let you know as soon as I get a reply.

Regards,


We are in the process of updating the user guide, and will ensure that the table listing the bandwidths is updated.
