I know this should be documented somewhere, but I can't find it, so I'd appreciate some help. When a kernel writes to a buffer through a (__global uint *) pointer, the memory layout of the buffer is linear:
__kernel void kernel_linear(__global uint *pin, __global uint *pout)
The results of the work-items are contiguous (the result from work-item 'tid' comes just after the result from work-item 'tid-1' and just before the result from work-item 'tid+1').
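To make the linear case concrete, here is a minimal host-side sketch in plain C (not OpenCL itself) of what I mean. The function name `emulate_linear`, the loop standing in for the work-items, and the doubling used as a placeholder result are all my own assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side emulation of the linear kernel: the loop index
   plays the role of get_global_id(0). */
void emulate_linear(const uint32_t *pin, uint32_t *pout, size_t n)
{
    for (size_t tid = 0; tid < n; ++tid) {
        /* Work-item tid writes exactly one element, at index tid.
           Doubling the input is only a placeholder computation. */
        pout[tid] = pin[tid] * 2u;
    }
}
```

Here the result of work-item `tid` sits at index `tid`, directly between the results of `tid-1` and `tid+1`.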
But when we use a (__global uint4 *) pointer, the four components of each result don't end up in contiguous positions in the buffer.
__kernel void kernel_uint4(__global uint *pin, __global uint4 *pout)
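For reference, this is the layout I expected, sketched in plain C on the host rather than in OpenCL. The struct `uint4_emul` and the functions `emulate_uint4` and `read_back_flat` are hypothetical stand-ins for `uint4`, the kernel, and the buffer read-back. I am assuming a struct of four 32-bit fields has no padding, which the static assertion checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for OpenCL's uint4: four 32-bit components. */
typedef struct { uint32_t x, y, z, w; } uint4_emul;

/* Assumption: no padding, so one element occupies exactly 16 bytes. */
_Static_assert(sizeof(uint4_emul) == 4 * sizeof(uint32_t), "unexpected padding");

/* Emulates n work-items, each writing one uint4_emul to pout[tid].
   The values v, v+1, v+2, v+3 are placeholders that make the
   component order visible. */
void emulate_uint4(const uint32_t *pin, uint4_emul *pout, size_t n)
{
    for (size_t tid = 0; tid < n; ++tid) {
        uint32_t v = pin[tid];
        pout[tid] = (uint4_emul){ v, v + 1u, v + 2u, v + 3u };
    }
}

/* Copies the buffer into a flat uint array, the way the host would see it. */
void read_back_flat(const uint4_emul *buf, uint32_t *flat, size_t n)
{
    memcpy(flat, buf, n * sizeof(uint4_emul));
}
```

In this sketch, element `4*tid + 0` of the flat array holds `.x` of work-item `tid`, `4*tid + 1` holds `.y`, and so on. That is the ordering I assumed the real `uint4` buffer would have.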
What I observe in the buffer isn't [... work-item 'n'.x, 'n'.y, 'n'.z, 'n'.w, work-item 'n+1'.x, 'n+1'.y, 'n+1'.z, 'n+1'.w ...], so what is the memory layout in the second case? Thanks in advance for any insight about this.