Dear people,
I know this is probably documented somewhere, but I can't find it, so I kindly ask if someone could help me. When a kernel writes to a buffer through a (__global uint *) pointer, the memory layout of the buffer is linear;
e.g.:
__kernel void linear(__global uint *pin, __global uint *pout)
{
...
uint tid=get_global_id(0);
...
*(pout+tid)=...
}
The results of the work-items are contiguous (the result from work-item 'tid' comes just after the result from work-item 'tid-1' and just before the result from work-item 'tid+1').
But when we use (__global uint4 *) pointers, the four elements don't seem to be in contiguous positions in the buffer.
e.g.:
__kernel void kernel_uint4(__global uint *pin, __global uint4 *pout)
{
...
uint tid=get_global_id(0);
uint4 data;
...
data.x=tid+1;
data.y=tid+2;
data.z=tid+3;
data.w=tid+4;
...
*(pout+tid)=data;
}
The memory layout isn't [... work-item'n'.x - work-item'n'.y - work-item'n'.z - work-item'n'.w - work-item'n+1'.x - work-item'n+1'.y - work-item'n+1'.z - work-item'n+1'.w ...], so what is the memory layout in the second case? Thanks in advance for any insight about this.
best regards,
Alfonso
There's a fairly good explanation of this in Appendix B (Portability) of the OpenCL specification.
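For checking a read-back buffer on the host, here is a minimal sketch in plain C of the layout the question expected. It assumes each uint4 occupies 16 contiguous bytes with its components stored in x, y, z, w order. That order is an assumption made for illustration, not something this sketch proves for any device, and Appendix B is the place to check whether it holds. The names host_uint4, fill_like_kernel, flat_index and read_component are hypothetical and are not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for a device uint4: four 32-bit
   components stored back to back. The x, y, z, w order in memory is
   an assumption made for this illustration. */
typedef struct {
    uint32_t x, y, z, w;
} host_uint4;

/* Emulates on the host what each work-item of the uint4 kernel in the
   question stores at pout[tid]. */
void fill_like_kernel(host_uint4 *pout, size_t global_size)
{
    for (size_t tid = 0; tid < global_size; ++tid) {
        pout[tid].x = (uint32_t)tid + 1;
        pout[tid].y = (uint32_t)tid + 2;
        pout[tid].z = (uint32_t)tid + 3;
        pout[tid].w = (uint32_t)tid + 4;
    }
}

/* Position of one component when the same bytes are viewed as a flat
   array of uint32_t (component 0 = x, 1 = y, 2 = z, 3 = w). */
size_t flat_index(size_t tid, size_t component)
{
    assert(component < 4);
    return 4 * tid + component;
}

/* Reads one component through the flat view. memcpy is used instead of
   a pointer cast so the access does not rely on aliasing rules. */
uint32_t read_component(const host_uint4 *buf, size_t tid, size_t component)
{
    uint32_t value;
    const unsigned char *bytes = (const unsigned char *)buf;
    memcpy(&value, bytes + flat_index(tid, component) * sizeof(uint32_t),
           sizeof value);
    return value;
}
```

If the values read back from the device do not match what read_component returns for the same tid and component, then one of the assumptions above (element size, alignment or component order) does not hold on that platform.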
hi afo,
Vectors and arrays are two different things. Vectors are opaque containers, and you cannot access them in any way you want. The operation mentioned is possible on arrays but not on vectors. Use the .xyzw or .s0/.s1/.s2/.s3 component notation to access the elements of a vector.
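To illustrate the two naming schemes on the host side, here is a small sketch in plain C (C11, for the anonymous struct). The vec4u union is a hypothetical type written for this post. It is not the real cl_uint4 from the OpenCL headers, and the assumption that x, y, z, w line up with indices 0 to 3 applies only to this sketch.

```c
#include <stdint.h>

/* Hypothetical 4-component vector: the same four values can be reached
   by name (x, y, z, w) or by index (s[0]..s[3]). */
typedef union {
    uint32_t s[4];
    struct { uint32_t x, y, z, w; };
} vec4u;

/* Builds the value the uint4 kernel in the question would produce for
   a given tid, using the named components. */
vec4u make_vec4u(uint32_t tid)
{
    vec4u v;
    v.x = tid + 1;
    v.y = tid + 2;
    v.z = tid + 3;
    v.w = tid + 4;
    return v;
}

/* Walks the same vector through the indexed components instead. */
uint32_t sum_by_index(vec4u v)
{
    uint32_t total = 0;
    for (int i = 0; i < 4; ++i) {
        total += v.s[i];
    }
    return total;
}
```

Going through named or indexed components like this keeps the code independent of how the elements are arranged in the underlying storage.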
Thanks a lot!
Appendix B is what I was looking for. I changed a kernel from using uint to using uint4, and I wanted to check the output buffer to see what was generated.
best regards,
Alfonso