Archives Discussions

sayantandatta · ‎06-04-2012

Hello everyone,

I want to write a kernel that uses 4KB of local memory per work item. Since only 32KB of LDS is available per work group, the work group size can be at most 8. If I want to use the total 64KB of LDS, I have to schedule 2 work groups per CU. But if I want to eliminate or reduce LDS bank conflicts, do I have to consider both work groups while writing the code? Or are the work groups' memory accesses independent of each other, i.e. the local memory access pattern of one work group doesn't affect the other work group in the CU?

Also note that the work group will only use 8 of the 32 channels available. Will the second work group in the CU use the same set of 32 channels as the first work group, or an entirely different set of 32 channels?

Please reply.

Thanks,

Sayantan

LeeHowes · ‎06-07-2012

Bank conflicts are a half-wavefront problem, not a workgroup or even a full-wavefront problem. The way it works is that every cycle, 16 lanes of requests are made by one of SIMD 0 and 1 and another 16 lanes by one of SIMD 2 and 3 in the CU. Those requests are serviced as 32 lanes per cycle by the LDS interface, and hence conflicts (I think) can occur across those 32 lanes and 32 banks.

Of course, bank conflicts that delay one wave can affect another wave on another SIMD unit in the CU because it creates a pipeline bubble.
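
To make the bank mapping concrete, here is a minimal sketch in Python. It assumes 32 banks that are each 4 bytes wide, and that lanes reading the same dword are broadcast rather than serialized. The names `bank_of` and `conflict_degree` and the two layouts are hypothetical, purely for illustration. It only models one group of lanes hitting the LDS in the same cycle and says nothing about how lanes from different SIMDs get paired up.

```python
NUM_BANKS = 32   # assumption: 32 LDS banks
BANK_WIDTH = 4   # assumption: each bank is 4 bytes (one dword) wide


def bank_of(byte_addr):
    """Bank that a byte address falls into under the assumed mapping."""
    return (byte_addr // BANK_WIDTH) % NUM_BANKS


def conflict_degree(byte_addrs):
    """Largest number of distinct dwords that land in a single bank.

    1 means conflict-free; N means roughly N serialized passes.
    Lanes touching the same dword are treated as one access (assumed broadcast).
    """
    dwords_per_bank = {}
    for addr in byte_addrs:
        dword = addr // BANK_WIDTH
        dwords_per_bank.setdefault(bank_of(addr), set()).add(dword)
    return max((len(d) for d in dwords_per_bank.values()), default=0)


WORK_ITEMS = 8           # work group size from the question
BYTES_PER_ITEM = 4096    # 4KB of LDS per work item
i = 5                    # every work item touches element i of its own array

# Layout A: each work item owns one contiguous 4KB region.
contiguous = [wi * BYTES_PER_ITEM + i * BANK_WIDTH for wi in range(WORK_ITEMS)]

# Layout B: same 32KB total, but element i of all work items is stored side by side.
interleaved = [(i * WORK_ITEMS + wi) * BANK_WIDTH for wi in range(WORK_ITEMS)]

print(conflict_degree(contiguous))   # 8: all eight accesses map to the same bank
print(conflict_degree(interleaved))  # 1: eight different banks
```

Under these assumptions, the contiguous 4KB-per-work-item layout is the worst case, because a 4096-byte stride is a multiple of 32 banks × 4 bytes, while interleaving the per-work-item arrays spreads the same accesses over different banks.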


HD 7970 LDS Bank Conflicts