Optimization Guide: GCN Channel Conflicts

Question asked by nibal on Oct 12, 2015
Latest reply on Oct 14, 2015 by nibal

From p. 44 of the guide:

"In this example:

for (ptr=base; ptr<max; ptr += 16KB)
     R0 = *ptr ;

where the lower bits are all the same, the memory requests all access the same
bank on the same channel and are processed serially.
This is a low-performance pattern to be avoided. When the stride is a power of
2 (and larger than the channel interleave), the loop above only accesses one
channel of memory."
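
To make the quoted claim concrete, here is a minimal sketch of how a byte address might map to a memory channel. The 256-byte channel interleave, the 8 channels, and the helper names `channel_of` and `channels_touched` are assumptions for illustration only; the real mapping depends on the specific GPU and is not given in the quote.

```python
# Hypothetical address-to-channel model; both constants are assumed values,
# not figures taken from the guide's p. 44 quote.
CHANNEL_INTERLEAVE = 256   # bytes served by one channel before moving to the next
NUM_CHANNELS = 8           # number of memory channels (power of two)


def channel_of(addr):
    """Return the channel index that a byte address falls into."""
    return (addr // CHANNEL_INTERLEAVE) % NUM_CHANNELS


def channels_touched(base, stride, count):
    """Return the sorted set of channels hit by `count` accesses of `stride` bytes."""
    return sorted({channel_of(base + i * stride) for i in range(count)})


# Stride of 16 KB: a power of two larger than the interleave, so the
# channel-select bits never change and every access lands on one channel.
big_stride = channels_touched(0, 16 * 1024, 64)

# Unit stride over 4-byte elements: consecutive addresses sweep through
# every channel in turn.
unit_stride = channels_touched(0, 4, 1024)

print("16 KB stride:", big_stride)
print("unit stride: ", unit_stride)
```

Under these assumed parameters, changing `base` only changes which single channel the 16 KB loop hits; it does not spread one loop's accesses across channels.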


I agree with the reasoning, but disagree with the conclusion and the scenario. I think this is
exactly what we want in a kernel. The code in the loop should run serially for any given kernel
(aside from compiler optimizations that may parallelize instructions), so that parallel kernels,
each with a different base offset, have the chance to use different channels. By that logic, unit
strides, mentioned elsewhere on the same page, would be the worst possible scenario.


Also, to my understanding, only memory writes can conflict. I see no reason for memory reads to conflict.

Am I missing something?