toddwbrownjr
Journeyman III

OpenCL Coalescing To Global Memory

Hello all,

I have an HD 5870 with the ATI Stream SDK v2.0 installed, and I have a question about coalescing global memory reads/writes in a kernel. The documentation says a wavefront is composed of 64 work-items, and it appears to suggest that 32 work-items are processed at one time. If, in a given half-wavefront, the global memory addresses are not aligned and/or not completely sequential across increasing work-item IDs, will the hardware make 32 individual global accesses (horrible bandwidth), or will it issue as few coalesced global accesses as are needed to satisfy the half-wavefront's request (better bandwidth)?
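To make the question concrete, here is a rough sketch in plain host-side C (not kernel code) that counts how many aligned memory segments one half-wavefront would touch for a given base address and stride. The 128-byte segment size, the 4-byte element size, and the function name `segments_touched` are all assumptions I picked for illustration; none of them come from the SDK documentation, and the count only describes the "as few accesses as necessary" case, not what the hardware actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_WAVEFRONT 32  /* work-items per half-wavefront, as in the question */
#define SEGMENT_BYTES 128  /* ASSUMED aligned segment size, illustration only */
#define ELEM_BYTES 4       /* ASSUMED element size (one 32-bit value) */

/* Hypothetical helper: work-item i accesses the byte address
 * base + i * stride_elems * ELEM_BYTES. Returns the number of distinct
 * aligned segments those 32 addresses fall into. */
size_t segments_touched(uint64_t base, size_t stride_elems)
{
    uint64_t seen[HALF_WAVEFRONT];
    size_t count = 0;

    for (size_t i = 0; i < HALF_WAVEFRONT; ++i) {
        uint64_t addr = base + (uint64_t)i * stride_elems * ELEM_BYTES;
        uint64_t seg = addr / SEGMENT_BYTES;
        int found = 0;

        /* Linear search is fine for at most 32 entries. */
        for (size_t j = 0; j < count; ++j) {
            if (seen[j] == seg) {
                found = 1;
                break;
            }
        }
        if (!found) {
            seen[count++] = seg;
        }
    }

    assert(count >= 1 && count <= HALF_WAVEFRONT);
    return count;
}
```

Under those assumptions, `segments_touched(0, 1)` (aligned, sequential) gives 1, `segments_touched(4, 1)` (sequential but misaligned by one element) gives 2, and `segments_touched(0, 32)` (each work-item 128 bytes apart) gives 32. What I am asking is whether the misaligned case costs about 2 accesses, as in this model, or all 32.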

20 Replies