Suppose I have a struct on the host as follows:
typedef struct{
float x;
float y;
float z;
} FloatThree __attribute__ ((aligned (4)));
typedef struct
{
uint u;
int i;
float f;
FloatThree ft;
char c;
}Params __attribute__ ((aligned (16)));
Is it necessary to add padding/alignment? Section 5.1 in the AMD beta4 release notes says that structs must be packed and aligned. Section 6.10.1 of the spec states:
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question and must also be a power of two.
For FloatThree each member is 4 bytes, so the LCM is 4 bytes, which is a power of 2, so an alignment of 4 seems correct.
For Params the first 3 members are 4 bytes each, the struct is 12 bytes (it is aligned to 4) and the char is 1 byte. According to 6.1.5 the char must be aligned to two bytes. So the LCM would seem to be 12 and the alignment is the next highest power of 2, which is 16. But based on the example in section 5.1 of the release notes, it seems that I need to explicitly add padding:
typedef struct
{
uint u;
int i;
float f; // 12 bytes up to here
char pad1[4]; // now 16
FloatThree ft; // another 12 bytes
char c; // 1 byte
char pad[3]; // now 16
}Params __attribute__ ((aligned (16)));
Does this seem correct? I think if it's wrong it could conceivably cause at least some of the issues I'm seeing.
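One way to see what the explicitly padded declaration produces on the host is to let the compiler report the offsets. This is a minimal sketch, assuming a C11 compiler with GCC-style attributes, 4-byte int and float, and a host-side stand-in typedef for uint. The names ParamsPadded and params_padded_size are hypothetical, and the attribute is attached to the struct type itself rather than to the typedef name.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

typedef unsigned int uint;   /* assumption: host stand-in for the OpenCL uint */

typedef struct {
    float x;
    float y;
    float z;
} FloatThree;

typedef struct {
    uint u;
    int i;
    float f;          /* bytes 0..11  */
    char pad1[4];     /* bytes 12..15 */
    FloatThree ft;    /* bytes 16..27 */
    char c;           /* byte  28     */
    char pad[3];      /* bytes 29..31 */
} __attribute__((aligned(16))) ParamsPadded;

/* Compile-time checks of the layout spelled out in the comments above. */
static_assert(offsetof(ParamsPadded, ft) == 16, "ft starts after pad1");
static_assert(offsetof(ParamsPadded, c) == 28, "c follows the 12 bytes of ft");
static_assert(sizeof(ParamsPadded) == 32, "two 16-byte blocks in total");
static_assert(_Alignof(ParamsPadded) == 16, "alignment requested by the attribute");

/* Hypothetical helper so the size can also be inspected at run time. */
size_t params_padded_size(void) {
    return sizeof(ParamsPadded);
}
```

This only shows what the host compiler does with the declaration; whether the device compiler expects the same 32-byte layout is a separate question.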
Originally posted by: david_aiken is it necessary to add padding/alignment?
Padding is required to satisfy alignment requirements of individual elements of a structure.
Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question and must also be a power of two.
It says 'LCM of ALIGNMENTS of all members', not of their sizes.
For FloatThree each member is 4 bytes, so the LCM is 4 bytes, which is a power of 2.. so an alignment of 4 seems correct
This is correct.
For Params the first 3 members are 4 bytes each, the struct is 12 bytes (it is aligned to 4) and the char is 1 byte. According to 6.1.5 the char must be aligned to two bytes. So the LCM would seem to be 12 and the alignment is the next highest power of 2 which is 16.
The first 3 members require an alignment of 4, the 4th member (struct FloatThree) also requires an alignment of 4, and the 5th member requires an alignment of 1. Hence the LCM of all these alignments is 4, so the struct will be aligned to 4.
Hence:
typedef struct <----- aligned to 4 (4n)
{
uint u; ----> aligned to 4 (4n)
int i; -------> aligned to 4 (4n + 4)
float f; -------> aligned to 4 (4n + 8)
FloatThree ft; ------> aligned to 4 (4n + 12)
char pad[3]; -------> padding to make struct size multiple of 4
char c; ---------> aligned to 1 (4n + 27)
};
The formatting is all messed up, seems buggy to me. Also there is no preview post option :(
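The alignment argument above can be checked on the host by asking the compiler directly. This is a minimal sketch, assuming a C11 compiler, 4-byte int and float, and a stand-in typedef for uint. The names ParamsNatural and print_params_layout are hypothetical, and no alignment attributes are used, so the compiler applies its natural layout.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdio.h>

typedef unsigned int uint;   /* assumption: host stand-in for the OpenCL uint */

typedef struct {
    float x;
    float y;
    float z;
} FloatThree;                /* alignment 4: the alignment of float */

typedef struct {
    uint u;                  /* offset 0  */
    int i;                   /* offset 4  */
    float f;                 /* offset 8  */
    FloatThree ft;           /* offset 12 */
    char c;                  /* offset 24, then 3 bytes of tail padding */
} ParamsNatural;

/* The struct alignment follows the member alignments (4, 4, 4, 4, 1). */
static_assert(_Alignof(FloatThree) == 4, "FloatThree is aligned to 4");
static_assert(_Alignof(ParamsNatural) == 4, "alignment is 4, not 16");
static_assert(offsetof(ParamsNatural, ft) == 12, "no padding needed before ft");
static_assert(sizeof(ParamsNatural) == 28, "size rounded up to a multiple of 4");

/* Prints each member offset so the layout can be compared by eye. */
void print_params_layout(void) {
    printf("u=%u i=%u f=%u ft=%u c=%u size=%u\n",
           (unsigned)offsetof(ParamsNatural, u),
           (unsigned)offsetof(ParamsNatural, i),
           (unsigned)offsetof(ParamsNatural, f),
           (unsigned)offsetof(ParamsNatural, ft),
           (unsigned)offsetof(ParamsNatural, c),
           (unsigned)sizeof(ParamsNatural));
}
```

Note that the compiler places c at offset 24 and puts the three padding bytes after it, whereas the hand-written listing places the padding before c. Both give a 28-byte struct.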
Ah, I see. It's the alignment rather than the size that I missed. Thanks!
In section 6.10.1 there is also a "packed" attribute, which apparently minimizes memory requirements. When does it make sense to use this? Does it affect the alignment?
How do these attributes affect the organization of the data on the host? Is there an expectation that the data is aligned/packed exactly to match the kernel declaration? Is there a good way to specify the data layout in a shared include for, say, MSVC 2008 and/or gcc?
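On the packing and shared-include questions, the following is only a sketch of one possible approach, not something taken from the spec or the AMD release notes. It assumes that `#pragma pack(push, 1)` and `#pragma pack(pop)` are accepted by both MSVC and gcc, and that int and float are 4 bytes. The names LAYOUT_CHECK, ParamsShared, PackedPair, NaturalPair and check_shared_layout are hypothetical. Packing to 1 removes compiler-inserted padding and lowers the member alignment to 1, so any padding that is wanted has to be written out as explicit members.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Hypothetical compile-time check for pre-C11 compilers:
   the array size becomes -1 (a compile error) when cond is false. */
#define LAYOUT_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

typedef unsigned int uint;   /* assumption: host stand-in for the OpenCL uint */

#pragma pack(push, 1)        /* no compiler-inserted padding from here on */
typedef struct {
    float x;
    float y;
    float z;
} FloatThree;

typedef struct {
    uint u;
    int i;
    float f;
    FloatThree ft;
    char c;
    char pad[3];             /* explicit padding: total size stays 28 */
} ParamsShared;

typedef struct {
    char c;
    int i;                   /* lands at offset 1 when packed */
} PackedPair;
#pragma pack(pop)            /* restore the default packing */

typedef struct {
    char c;
    int i;                   /* lands at offset 4 with natural alignment */
} NaturalPair;

LAYOUT_CHECK(params_shared_is_28_bytes, sizeof(ParamsShared) == 28);
LAYOUT_CHECK(packed_pair_is_5_bytes, sizeof(PackedPair) == 5);
LAYOUT_CHECK(natural_pair_is_8_bytes, sizeof(NaturalPair) == 8);

/* Run-time version of the same checks, for use in a debug build. */
void check_shared_layout(void) {
    assert(offsetof(ParamsShared, c) == 24);
    assert(offsetof(PackedPair, i) == 1);
    assert(offsetof(NaturalPair, i) == 4);
}
```

Whether the kernel-side declaration ends up with the same layout depends on the OpenCL implementation, so the sizes and offsets checked here would still need to be compared against what the device compiler produces.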