gidden
Journeyman III

wrong alignment on amd graphic card

I have this struct:

#pragma pack(push)
#pragma pack(1)
typedef struct
{
  uint a;
  uint b;
  float c;
  float2 d;
  float2 e;
} TMyStruct;
#pragma pack(pop)

When I try to access "c" it is OK. But when I try to access d.x, it returns d.y.

Probably it aligns "c" on float2.

When I put "c" at the end, the struct has the right size.

On nVidia cards it works OK, but on an AMD Radeon HD 7670M it does not work.

gbilotta
Adept III



It's a matter of packed vs unpacked. My bet is that either your device compiler or your host compiler does not know about the #pragma pack directive. Those that understand the pragma will put d right after c, but those that do not will align d and e at their natural alignment, which requires a 4-byte padding after c, so the location of d.x when the pack directive is not obeyed will be the location of d.y when the pack directive is obeyed.
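The offsets involved can be checked with a small layout model. This is only a sketch, not what any particular compiler does internally: it assumes the OpenCL C sizes (4-byte uint and float, 8-byte float2 aligned to 8 bytes), and the names `member_t`, `TMYSTRUCT_MEMBERS` and `member_offset` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size and natural alignment of one struct member, in bytes. */
typedef struct {
    size_t size;
    size_t align;
} member_t;

/* Members of TMyStruct: uint a, uint b, float c, float2 d, float2 e.
   Assumption: 4-byte scalars, float2 is 8 bytes and 8-byte aligned. */
static const member_t TMYSTRUCT_MEMBERS[] = {
    {4, 4}, {4, 4}, {4, 4}, {8, 8}, {8, 8}
};
enum { TMYSTRUCT_COUNT = 5, INDEX_D = 3 };

/* Offset of member `index` when members are laid out in order.
   packed != 0 models pack(1): every member is treated as 1-byte aligned. */
size_t member_offset(const member_t *members, size_t count,
                     size_t index, int packed)
{
    assert(index < count);
    size_t offset = 0;
    for (size_t i = 0; i < index; ++i) {
        size_t align = packed ? 1 : members[i].align;
        offset = (offset + align - 1) / align * align; /* round up */
        offset += members[i].size;
    }
    size_t align = packed ? 1 : members[index].align;
    return (offset + align - 1) / align * align;
}
```

Under these assumptions d sits at offset 12 when packed and at offset 16 when naturally aligned, so the unpacked d.x and the packed d.y share the same address.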

This is a well-known issue when dealing with structs across different devices and compilers. You need to either find a way to enforce the struct to be packed (I don't know if there is a universally-understood attribute for that) or you should rearrange your data so that packing is not necessary.
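One possible rearrangement, as a host-side sketch: make the padding explicit so that packed and unpacked layouts coincide. The names `TMyStructPadded` and `pad` are hypothetical, and the float arrays stand in for float2, whose 8-byte alignment is assumed from the OpenCL C rules. A device-side twin would declare the same members with float2 for d and e.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side mirror of the struct with the gap after c written out.
   Every member already starts at a multiple of its device alignment,
   so no compiler-inserted padding is needed on either side. */
typedef struct {
    uint32_t a;     /* offset 0  */
    uint32_t b;     /* offset 4  */
    float    c;     /* offset 8  */
    uint32_t pad;   /* offset 12, explicit filler (hypothetical name) */
    float    d[2];  /* offset 16, stands in for float2 d */
    float    e[2];  /* offset 24, stands in for float2 e */
} TMyStructPadded;

/* Fail at compile time if the host layout drifts from the intended one. */
_Static_assert(offsetof(TMyStructPadded, d) == 16, "d must start at byte 16");
_Static_assert(sizeof(TMyStructPadded) == 32, "struct must be 32 bytes");

size_t padded_offset_of_d(void) { return offsetof(TMyStructPadded, d); }
size_t padded_size(void)        { return sizeof(TMyStructPadded); }
```

The cost is 4 extra bytes per element compared with the packed 28-byte layout.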

nou
Exemplar

There should be a sizeof(float) gap between c and d, as every OpenCL type must be aligned to its size. So returning d.y is correct behavior.


I'm wondering if (or rather how) the compiler should handle the case of packed structure where vector types end up being misaligned.

On the host, when something like this happens (for example, a float at an address that is not a multiple of 4), the compilers know that they have to issue multiple load instructions and then reassemble the appropriate data type to prevent bus errors. Should something like this be done by the device compilers too? In this case, for example, the misalignment could be handled by using vload2/vstore2 instead of direct loads/stores.
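On the host, the usual way to request such a safe access by hand is a byte-wise copy. A minimal sketch, assuming a two-float stand-in type (`float2_host` and `load_float2_unaligned` are made-up names, not part of any OpenCL header):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host stand-in for float2: two consecutive floats. */
typedef struct {
    float x;
    float y;
} float2_host;

/* Read a float2 from an arbitrary byte address. memcpy makes no
   alignment assumption, so the compiler emits whatever loads are
   safe on the target instead of one aligned 8-byte load. */
float2_host load_float2_unaligned(const unsigned char *bytes)
{
    assert(bytes != NULL);
    float2_host v;
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

For the packed TMyStruct, d would be read with `load_float2_unaligned(buffer + 12)`, where `buffer` holds the raw struct bytes.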

himanshu_gautam
Grandmaster

Check section 6.11.1, "Specifying Attributes of Types", in the OpenCL spec.

gidden
Journeyman III

Now this code works:

typedef struct __attribute__ ((packed))
{
  uint a;
  uint b;
  float c;
  float2 d;
  float2 e;
} TMyStruct;

This code works for nVidia too, but I'll keep "#pragma pack(1)" to be sure.

Can you point me to where I can find information about default padding (for structs)?




The pack pragma might still be necessary for the host if you're using a shared header. As for the default padding, this is actually up to the compiler and is usually machine-dependent. Typically, padding is inserted between members with different alignment requirements so that each subsequent member is aligned correctly. For example, a struct {char c; int i;} would have sizeof(int) - sizeof(char) bytes of padding between c and i (3 bytes on an architecture with 4-byte ints).
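The padding a given host compiler actually inserts can be measured with offsetof. A small sketch (the names `char_then_int` and `padding_before_i` are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* The example from the text: a char followed by an int. */
struct char_then_int {
    char c;
    int  i;
};

/* Bytes the compiler inserted between c and i. The value depends on
   the target's alignment of int, so it is computed, not hard-coded. */
size_t padding_before_i(void)
{
    return offsetof(struct char_then_int, i) - sizeof(char);
}
```

Where int is 4 bytes and 4-byte aligned this gives 3, but other targets may differ.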
