

nibal

Optimization Guide: GCN Channel Conflicts

p. 44:

"In this example:


for (ptr=base; ptr<max; ptr += 16KB)
     R0 = *ptr ;


where the lower bits are all the same, the memory requests all access the same
bank on the same channel and are processed serially.
This is a low-performance pattern to be avoided. When the stride is a power of
2 (and larger than the channel interleave), the loop above only accesses one
channel of memory."
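
To make the quoted claim concrete, here is a minimal sketch of how a byte address might map to a memory channel. The 256-byte channel interleave and the 8-channel count are assumptions picked for illustration, not values taken from the guide for any particular GPU, and `channel_of` and `channels_touched` are hypothetical helper names.

```python
# Assumed values for illustration only; real hardware differs by GPU.
CHANNEL_INTERLEAVE = 256  # bytes mapped to one channel before moving to the next
NUM_CHANNELS = 8          # assumed power-of-two channel count


def channel_of(addr):
    """Return the channel index for a byte address under the assumed mapping."""
    return (addr // CHANNEL_INTERLEAVE) % NUM_CHANNELS


def channels_touched(base, stride, count):
    """Return the set of channels hit by `count` accesses starting at `base`."""
    return {channel_of(base + i * stride) for i in range(count)}


if __name__ == "__main__":
    # 16 KB stride, as in the quoted loop: every access maps to one channel.
    print(sorted(channels_touched(0, 16 * 1024, 64)))
    # A stride equal to the interleave walks across all channels.
    print(sorted(channels_touched(0, CHANNEL_INTERLEAVE, 64)))
```

Under these assumptions, the 16 KB stride is a multiple of `CHANNEL_INTERLEAVE * NUM_CHANNELS`, so the channel bits never change from one iteration to the next.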

I agree with the reasoning, but disagree with the conclusion and the scenario. I think this is exactly what we want in a kernel. The code in the loop should run serially within any given kernel (aside from compiler optimizations that may parallelize instructions), so that parallel kernels, each starting from a different base offset, have the chance to use different channels. By that logic, the unit strides mentioned elsewhere on the same page would be the worst possible scenario.
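
One way to check the base-offset idea is to compute which channel each parallel work-item would hit at the same loop iteration. This is only a sketch: the 256-byte interleave, the 8 channels, and the helper `channels_per_iteration` are all assumptions for illustration, and it models address mapping only, not how the hardware actually schedules requests.

```python
# Assumed values for illustration only; real hardware differs by GPU.
CHANNEL_INTERLEAVE = 256  # bytes per channel before switching to the next
NUM_CHANNELS = 8          # assumed channel count


def channel_of(addr):
    """Return the channel index for a byte address under the assumed mapping."""
    return (addr // CHANNEL_INTERLEAVE) % NUM_CHANNELS


def channels_per_iteration(n_items, item_offset, stride, iteration):
    """List the channel each work-item hits at a given loop iteration.

    Work-item `i` is assumed to start at byte offset `i * item_offset`
    and to advance by `stride` bytes per iteration.
    """
    return [
        channel_of(item * item_offset + iteration * stride)
        for item in range(n_items)
    ]


if __name__ == "__main__":
    # Base offsets one interleave apart: the items spread over all channels.
    print(channels_per_iteration(8, 256, 16 * 1024, 3))
    # Base offsets 16 KB apart: every item lands on the same channel.
    print(channels_per_iteration(8, 16 * 1024, 16 * 1024, 3))
```

In this model, whether parallel work-items end up on different channels depends on the spacing of their base offsets relative to the interleave, not on the loop stride alone.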

Also, to my understanding, only memory writes can conflict. I see no reason for memory reads to conflict.

Am I missing something?
