What is the memory layout when we use uint4?
Dear all,
I know this is probably documented somewhere, but I can't find it, so I'm asking here. When a kernel writes to a buffer through a (__global uint *) pointer, the memory layout of the buffer is linear.
e.g.:
__kernel void kernel_linear(__global uint *pin, __global uint *pout)
{
...
uint tid = get_global_id(0);   /* index of this work-item */
...
pout[tid] = ...                /* one uint per work-item */
}
The results of the work-items are contiguous (the result from work-item 'tid' comes just after the result from work-item 'tid-1' and just before the result from work-item 'tid+1').
But when I use a (__global uint4 *) pointer, the four components don't seem to end up in contiguous positions in the buffer.
e.g.:
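__kernel void kernel_uint4(__global uint *pin, __global uint4 *pout)
{
...
uint tid = get_global_id(0);   /* index of this work-item */
uint4 data;
...
data.x = tid + 1;
data.y = tid + 2;
data.z = tid + 3;
data.w = tid + 4;
...
pout[tid] = data;              /* one uint4 (four uints) per work-item */
}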
The memory layout I observe isn't [... work-item 'n'.x - work-item 'n'.y - work-item 'n'.z - work-item 'n'.w - work-item 'n+1'.x - work-item 'n+1'.y - work-item 'n+1'.z - work-item 'n+1'.w ...]. So what is the memory layout in the second case? Thanks in advance for any insight about this.
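To make the layout I expected concrete, here is a minimal host-side sketch in plain C. It does not use OpenCL at all: my_uint4, fill_like_kernel and flat_at are hypothetical names I made up for illustration, and the sketch assumes that a uint4 behaves like four consecutive 32-bit values with no padding between them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uint4: four consecutive 32-bit values.
   Assumption: no padding between x, y, z and w. */
typedef struct { uint32_t x, y, z, w; } my_uint4;

/* Emulates on the host what each work-item 'tid' of kernel_uint4 writes. */
static void fill_like_kernel(my_uint4 *pout, uint32_t n)
{
    for (uint32_t tid = 0; tid < n; ++tid) {
        my_uint4 data = { tid + 1, tid + 2, tid + 3, tid + 4 };
        pout[tid] = data;
    }
}

/* Reads the buffer as a flat sequence of uint32_t values.
   Under the expected layout, flat index 4*tid + c holds component c
   (0 = x, 1 = y, 2 = z, 3 = w) of work-item 'tid'. */
static uint32_t flat_at(const my_uint4 *buf, uint32_t tid, uint32_t c)
{
    uint32_t value;
    assert(c < 4);
    memcpy(&value,
           (const unsigned char *)buf + (4u * tid + c) * sizeof(uint32_t),
           sizeof value);
    return value;
}
```

With this layout, filling three elements and reading them back flat would give 1 2 3 4, 2 3 4 5, 3 4 5 6, which is the ordering I expected to see in the buffer written by kernel_uint4.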
best regards,
Alfonso