Discussion created by marwen on Apr 24, 2010
Latest reply on Apr 26, 2010 by ryta1203

I'm trying to measure the performance difference between global memory accesses that are aligned to a 128-byte boundary and accesses that are not aligned.

The problem is that I can't find any difference, although I expected an improvement of roughly an order of magnitude from coalesced access.

So what is the smallest kernel that exhibits global memory access coalescing?
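
To make the expectation above concrete, here is a rough host-side sketch (plain C++, not a GPU kernel) that counts how many 128-byte segments one group of threads would touch for a given access pattern. The group size of 32 threads, the 128-byte segment size, and the names `kWarpSize`, `kSegmentBytes` and `segments_touched` are all assumptions made for illustration; real hardware may group and service accesses differently.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Assumed hardware parameters (hypothetical, not taken from the post).
constexpr std::size_t kWarpSize = 32;       // threads issuing one access together
constexpr std::size_t kSegmentBytes = 128;  // size of one aligned memory segment

// Count the distinct aligned segments touched when thread `lane` reads
// `elem_size` bytes at byte address base_offset + lane * stride * elem_size.
std::size_t segments_touched(std::size_t base_offset,
                             std::size_t elem_size,
                             std::size_t stride) {
    assert(elem_size > 0);
    std::set<std::size_t> segments;
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        const std::size_t first = base_offset + lane * stride * elem_size;
        const std::size_t last = first + elem_size - 1;
        // Record every segment the element overlaps.
        for (std::size_t seg = first / kSegmentBytes;
             seg <= last / kSegmentBytes; ++seg) {
            segments.insert(seg);
        }
    }
    return segments.size();
}
```

Under these assumptions, `segments_touched(0, 4, 1)` (aligned, unit stride, 4-byte elements) gives 1, `segments_touched(4, 4, 1)` (same pattern shifted by one element) gives 2, and `segments_touched(0, 4, 32)` (stride of 32 elements) gives 32. If the hardware behaves like this model, misalignment alone would cost about a factor of two, and a large stride would be the pattern that produces a gap closer to an order of magnitude.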