I have some structures and I don't understand how I should align them.
On the CPU I have an array of structures (voxel *VoxelList) that I want to send to the GPU, where I copy the elements into another array that lives on the GPU (GPUVoxelList). Then I want to access each element from the kernel through GPUVoxelList.
Here are the structures, laid out as well as I could understand:
typedef struct __attribute__ ((aligned(16))) Raza
{
    cl_uint4 pozitie;                // ->x
    cl_float4 directie;              // ->x + 16
    cl_float dimensiune;             // ->x + 32
    cl_float fov;                    // ->x + 36
    cl_uchar4 mediu_parcurs;         // ->x + 40
    cl_uchar4 mediu_curent;          // ->x + 44
    cl_ushort2 pixel;                // ->x + 48
    cl_ushort influenta;             // ->x + 52
    cl_ushort luminozitate_curenta;  // ->x + 54
    cl_uchar padd;                   // ->x + 56
}raza;                               // ->x + 64
typedef struct Material
{
    cl_uchar4 culoare;               // ->x
    cl_char4 normala;                // ->x + 4
    cl_uchar2 reflexivitate;         // ->x + 8
    cl_uchar2 transparenta;          // ->x + 12
    cl_ushort luminozitate;          // ->x + 16
    cl_uchar densitate;              // ->x + 18
}material;                           // ->x + 19
typedef struct __attribute__ ((aligned(16))) Voxel
{
    cl_uint fiu;                     // ->x
    cl_uint parinte;                 // ->x + 32
    material m;                      // ->x + 36
    cl_uchar padd;                   // ->x + 55
}voxel;                              // ->x + 64
I am only sending the raza and voxel structs. The material struct doesn't have to be padded because I pad at the end of the voxel struct and it starts at 36, right?
I compiled the code and it doesn't work: strangely, I get only 0s. I also tried replacing aligned(16) with packed.
Maybe it is because the array of structs on the device isn't aligned? How do I align a cl_mem? And where should the voxel structure start, at a multiple of 4? And the raza structure at a multiple of 16?
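The offsets written next to the fields above can be checked instead of worked out by hand. Below is a minimal sketch, assuming GCC or Clang and using hypothetical stand-in types (uchar4_t, char4_t, uchar2_t) in place of the cl_* vector types from CL/cl.h, each aligned to its own size as OpenCL C requires for vectors on the device. It covers only the material and voxel structs; print_voxel_layout is an illustrative helper, not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for cl_uchar4, cl_char4 and cl_uchar2.
   Assumption: each vector type is aligned to its own size. */
typedef struct { uint8_t s[4]; } __attribute__((aligned(4))) uchar4_t;
typedef struct { int8_t  s[4]; } __attribute__((aligned(4))) char4_t;
typedef struct { uint8_t s[2]; } __attribute__((aligned(2))) uchar2_t;

typedef struct Material {
    uchar4_t culoare;
    char4_t  normala;
    uchar2_t reflexivitate;
    uchar2_t transparenta;
    uint16_t luminozitate;
    uint8_t  densitate;
} material;

typedef struct __attribute__((aligned(16))) Voxel {
    uint32_t fiu;
    uint32_t parinte;
    material m;
    uint8_t  padd;
} voxel;

/* Print the offsets and sizes the host compiler actually assigns. */
void print_voxel_layout(void) {
    printf("material: transparenta=%zu luminozitate=%zu densitate=%zu size=%zu\n",
           offsetof(material, transparenta),
           offsetof(material, luminozitate),
           offsetof(material, densitate),
           sizeof(material));
    printf("voxel: fiu=%zu parinte=%zu m=%zu padd=%zu size=%zu\n",
           offsetof(voxel, fiu),
           offsetof(voxel, parinte),
           offsetof(voxel, m),
           offsetof(voxel, padd),
           sizeof(voxel));
}
```

Calling print_voxel_layout() from main and comparing the numbers with the annotations is a quick sanity check. With these stand-ins, a typical GCC or Clang build places parinte at offset 4 and m at offset 8 and reports sizeof(voxel) as 32, so the hand-written offsets are worth re-checking against the real cl_* types.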
I hope these work: I've tested the 'raza' struct with the pixel field, which seems OK, and the voxel struct with the 'parinte' field, which is 99% OK. I assume the others are wrong because I did something wrong in the code, not because of the alignment.
I think the idea was that material should start at a multiple of 16 (because it is aligned to 16), and the padding helps align parinte to 4 (uint = 4) and makes the size of the structure a multiple of 16 (64). Maybe this helps someone like me:
typedef struct Raza
uint4 pozitie; //x, y, z
float4 directie; //i, j, k (how far it moves at each step)
float dimensiune; //how big the cube is now
float fov; //how much it grows at each step
uchar4 mediu_parcurs; //r, g, b -> how much of each color it filters out
uchar4 mediu_curent; //r, g, b, density
uint pixel; //the pixel that has to be updated (w = pixel % width, h = pixel / width)
ushort influenta; //how much its color influences the pixel
ushort luminozitate_curenta; //brightness of the current medium
typedef struct Material
uchar4 culoare; //r, g, b ( [0, 255] )
char4 normala; //x, y, z ( [-1, 1] * 127 )
uchar2 reflexivitate; //reflection index, specularity ( [0, 255] )
uchar2 transparenta; //transparency index, specularity ( [0, 255] )
ushort luminozitate; //( [0, 65535] )
uchar densitate; //( [0, 255] )
typedef struct Voxel
and on the host:
#pragma pack (push, 16)
...same fields as the kernel source above...
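A note on the host side: #pragma pack(push, 16) only caps member alignment at 16 bytes; it does not raise any alignment, so by itself it leaves a struct of small members laid out as usual. A more predictable approach is to spell the padding out and let the compiler verify it. The sketch below assumes GCC or Clang with C11 _Static_assert, uses plain byte arrays as stand-ins for the cl_* vector types, and picks a 32-byte voxel with m at offset 16 purely as an illustration; the names packed16_probe, pad_m, pad0 and voxel_layout_ok are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* pack(16) is only an upper bound on member alignment, so this struct
   keeps its natural layout: 1 byte, 3 bytes of padding, 4 bytes. */
#pragma pack(push, 16)
typedef struct { uint8_t a; uint32_t b; } packed16_probe;
#pragma pack(pop)
_Static_assert(sizeof(packed16_probe) == 8, "pack(16) changed the layout");
_Static_assert(_Alignof(packed16_probe) == 4, "pack(16) raised the alignment");

typedef struct Material {
    uint8_t  culoare[4];        /* offset 0  */
    int8_t   normala[4];        /* offset 4  */
    uint8_t  reflexivitate[2];  /* offset 8  */
    uint8_t  transparenta[2];   /* offset 10 */
    uint16_t luminozitate;      /* offset 12 */
    uint8_t  densitate;         /* offset 14 */
    uint8_t  pad_m;             /* offset 15, explicit tail padding */
} material;                     /* 16 bytes  */

typedef struct Voxel {
    uint32_t fiu;               /* offset 0  */
    uint32_t parinte;           /* offset 4  */
    uint8_t  pad0[8];           /* offset 8, pushes m to a multiple of 16 */
    material m;                 /* offset 16 */
} voxel;                        /* 32 bytes  */

/* Compile-time checks: the build fails if the layout ever drifts. */
_Static_assert(sizeof(material) == 16, "material must be 16 bytes");
_Static_assert(offsetof(voxel, m) == 16, "m must start at offset 16");
_Static_assert(sizeof(voxel) == 32, "voxel must be 32 bytes");

/* Runtime counterpart of the checks above; returns 1 when the layout matches. */
int voxel_layout_ok(void) {
    return offsetof(voxel, parinte) == 4
        && offsetof(voxel, m) == 16
        && offsetof(voxel, m.densitate) == 30
        && sizeof(voxel) == 32;
}
```

The same member order and explicit padding would then be repeated in the kernel-side definitions, so host and device do not depend on their compilers agreeing about implicit padding.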
Sorry for reviving this thread... I can't understand where the problem is:
the only member which doesn't seem to work is densitate from material:
(voxel.m.densitate: whatever value it has on the host, the GPU says it is 255)
I also tried aligning the material struct to 8 and putting the padding at the end of the material struct instead of in the voxel struct.
Can you see any errors?
culoare, normala, and luminozitate from material are correct, as are fiu and parinte from voxel; I haven't tested the others. (They are correct for every element of the array, since there is an array of structs like this.)
Here is the problem:
SelectieVoxelScena(__global voxel* ListaVoxeli, [...])
// -> val_voxel.m.densitate is 255
// -> ListaVoxeli[radacina].m.densitate is 204
The structures are the ones above. Why is that?
If I copy the members one by one, it is OK.
I don't modify ListaVoxeli anywhere.
Please switch the order of "parinte" and "Material m" in the voxel struct.
This is just a guess, but it might fix inconsistent padding done by the compilers.
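One way to test this guess is to look at where m.densitate lands for each member order on the host and compare that with what the device appears to read. A minimal sketch, assuming a C compiler that accepts nested member designators in offsetof (GCC and Clang do) and using byte arrays as stand-ins for the cl_* vector types; voxel_original, voxel_swapped and print_densitate_offsets are hypothetical names, and the aligned(16) attribute from the earlier definition is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct Material {
    uint8_t  culoare[4];
    int8_t   normala[4];
    uint8_t  reflexivitate[2];
    uint8_t  transparenta[2];
    uint16_t luminozitate;
    uint8_t  densitate;
} material;

/* Member order as posted in the question. */
typedef struct {
    uint32_t fiu;
    uint32_t parinte;
    material m;
    uint8_t  padd;
} voxel_original;

/* Member order suggested in the reply: m before parinte. */
typedef struct {
    uint32_t fiu;
    material m;
    uint32_t parinte;
    uint8_t  padd;
} voxel_swapped;

/* Print where densitate ends up for both orderings. */
void print_densitate_offsets(void) {
    printf("original: m=%zu m.densitate=%zu size=%zu\n",
           offsetof(voxel_original, m),
           offsetof(voxel_original, m.densitate),
           sizeof(voxel_original));
    printf("swapped:  m=%zu m.densitate=%zu size=%zu\n",
           offsetof(voxel_swapped, m),
           offsetof(voxel_swapped, m.densitate),
           sizeof(voxel_swapped));
}
```

If the host offsets differ from the ones the kernel seems to use, implicit padding is a likely culprit, and whichever ordering is chosen has to be identical in the host and kernel definitions.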