6 Replies Latest reply on Jul 2, 2013 2:27 AM by gbilotta

    wrong alignment on amd graphic card

    gidden

      I have this struct:

       

      #pragma pack(push)
      #pragma pack(1)
      typedef struct
      {
        uint a;
        uint b;
        float c;
        float2 d;
        float2 e;
      } TMyStruct;
      #pragma pack(pop)

       

      When I try to access "c" it is OK, but when I try to access d.x, it returns d.y.

      Probably it aligns "c" on float2.

      When I put "c" at the end, the struct has the right size.

      On nVidia cards it works OK, but on an AMD Radeon HD 7670M it does not.

        • Re: wrong alignment on amd graphic card
          gbilotta

          It's a matter of packed vs. unpacked layout. My bet is that either your device compiler or your host compiler ignores the #pragma pack directive. Compilers that honor the pragma put d right after c (at byte offset 12), while those that do not place d and e at their natural 8-byte alignment, which requires 4 bytes of padding after c (so d lands at offset 16). As a result, the location of d.x when the pack directive is not obeyed is the location of d.y when it is obeyed.
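To make the two layouts concrete, here is a minimal host-side sketch in C, assuming a GCC- or Clang-compatible compiler. The names `float2_t`, `NaturalStruct` and `PackedStruct` are hypothetical stand-ins (host C has no built-in float2); the 8-byte alignment on `float2_t` is there to mimic OpenCL's alignment rule for float2.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */

/* Hypothetical stand-in for OpenCL's float2: two floats, 8-byte aligned.
   (GCC/Clang attribute syntax; an assumption, not part of the original post.) */
typedef struct { float x, y; } __attribute__((aligned(8))) float2_t;

/* Layout when the pack directive is NOT obeyed: natural alignment. */
typedef struct {
    uint32_t a;   /* offset 0 */
    uint32_t b;   /* offset 4 */
    float    c;   /* offset 8, followed by 4 bytes of padding */
    float2_t d;   /* offset 16 */
    float2_t e;   /* offset 24 */
} NaturalStruct;

/* Layout when packing IS obeyed: no padding anywhere. */
typedef struct __attribute__((packed)) {
    uint32_t a;   /* offset 0 */
    uint32_t b;   /* offset 4 */
    float    c;   /* offset 8 */
    float2_t d;   /* offset 12 */
    float2_t e;   /* offset 20 */
} PackedStruct;

_Static_assert(offsetof(NaturalStruct, d) == 16, "natural layout pads after c");
_Static_assert(offsetof(PackedStruct, d) == 12, "packed layout has no padding");

/* Byte offset of d.x as seen by a compiler using the natural layout. */
size_t natural_dx_offset(void)
{
    return offsetof(NaturalStruct, d) + offsetof(float2_t, x);
}

/* Byte offset of d.y as seen by a compiler using the packed layout. */
size_t packed_dy_offset(void)
{
    return offsetof(PackedStruct, d) + offsetof(float2_t, y);
}
```

Under these assumptions both helpers return the same byte offset, which is the d.x / d.y mix-up from the original post: one side writes d.y where the other side reads d.x.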

          This is a well-known issue when sharing structs across different devices and compilers. You need to either find a way to force the struct to be packed (I don't know if there is a universally understood attribute for that) or rearrange your data so that packing is not necessary.
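As a rough sketch of the "rearrange your data" option, assuming a GCC- or Clang-compatible host compiler: putting the most strictly aligned members first and making the tail padding explicit gives a struct with no hidden padding, so packed and unpacked compilers should agree on it. `float2_t`, `ReorderedStruct` and the `pad` member are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */

/* Hypothetical stand-in for OpenCL's float2: two floats, 8-byte aligned. */
typedef struct { float x, y; } __attribute__((aligned(8))) float2_t;

/* Same data as TMyStruct, reordered so every member already sits at its
   natural alignment; the explicit pad replaces the implicit tail padding. */
typedef struct {
    float2_t d;    /* offset 0  */
    float2_t e;    /* offset 8  */
    uint32_t a;    /* offset 16 */
    uint32_t b;    /* offset 20 */
    float    c;    /* offset 24 */
    uint32_t pad;  /* offset 28, unused, keeps sizeof a multiple of 8 */
} ReorderedStruct;

/* Bytes of compiler-inserted padding: struct size minus the sum of members. */
size_t reordered_hidden_padding(void)
{
    size_t members = 2 * sizeof(float2_t) + 3 * sizeof(uint32_t) + sizeof(float);
    return sizeof(ReorderedStruct) - members;
}
```

The cost of this approach is 4 extra bytes per element compared with the packed version, in exchange for not depending on any packing directive.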

          • Re: wrong alignment on amd graphic card
            nou

            There should be a sizeof(float) gap between c and d, as every OpenCL type must be aligned to its size. So returning d.y is correct behavior.

              • Re: wrong alignment on amd graphic card
                gbilotta

                I'm wondering if (or rather how) the compiler should handle the case of a packed structure where vector types end up misaligned.

                 

                On the host, when something like this happens (for example, a float at an address that is not a multiple of 4), compilers know that they have to issue multiple load instructions and then reassemble the appropriate data type to prevent bus errors. Should device compilers do something like this too? In this case, for example, the misalignment could be handled by using vload2/vstore2 instead of direct loads/stores.
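For the host side, one common way to express such a misaligned access portably in C is to go through `memcpy` and let the compiler choose the load sequence. The sketch below is only an illustration of that idea; `load_float_unaligned` and `store_float_unaligned` are hypothetical helper names, not part of any API discussed in this thread.

```c
#include <assert.h>
#include <string.h>   /* memcpy */

/* Read a float from an arbitrary (possibly misaligned) byte address.
   memcpy has no alignment requirement, so the compiler emits whatever
   loads the target needs instead of a single aligned load. */
float load_float_unaligned(const unsigned char *p)
{
    float v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a float to an arbitrary (possibly misaligned) byte address. */
void store_float_unaligned(unsigned char *p, float v)
{
    memcpy(p, &v, sizeof v);
}
```

On the device side, vload2/vstore2 play a comparable role for float2: they only require the address to be aligned to the element type (float), not to the full vector, which is why they could cover a d that starts at byte offset 12.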

              • Re: wrong alignment on amd graphic card
                himanshu.gautam

                Check section 6.11.1, "Specifying Attributes of Types", in the OpenCL spec.

                • Re: wrong alignment on amd graphic card
                  gidden

                  Now this code works:

                   

                  typedef struct __attribute__ ((packed))
                  {
                    uint a;
                    uint b;
                    float c;
                    float2 d;
                    float2 e;
                  } TMyStruct;

                   

                  This code works on nVidia too, but I'll keep "#pragma pack(1)" to be sure.

                  Can you point me to where I can find information about default padding (for structs)?

                    • Re: wrong alignment on amd graphic card
                      gbilotta

                      The pack pragma might still be necessary for the host if you're using a shared header. As for the default padding, this is actually up to the compiler and is usually machine-dependent. Typically, padding is inserted between members with different alignment requirements to make sure that subsequent members are aligned correctly. For example, a struct {char c; int i;} would have sizeof(int) - sizeof(char) bytes of padding between c and i (3 bytes on an architecture with 4-byte ints).
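The char/int example can be checked directly on a given host compiler with `offsetof`. This is a small sketch; `struct CharInt` and `padding_before_i` are hypothetical names used only here, and the actual numbers depend on the platform.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* The example struct: a 1-byte member followed by an int. */
struct CharInt {
    char c;
    int  i;
};

/* Number of padding bytes the compiler inserted between c and i. */
size_t padding_before_i(void)
{
    return offsetof(struct CharInt, i) - sizeof(char);
}
```

On a platform where int is 4 bytes with 4-byte alignment, `padding_before_i()` would typically be 3; printing `offsetof` and `sizeof` for your own structs on both host and device is a quick way to see what each compiler actually does.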