0 Replies Latest reply on Oct 25, 2011 2:58 PM by Marix

    Accessing structs with doubles on APP 2.5

    Marix
      Miscompile?

      The code attached to this message fails for me on APP 2.5 with Catalyst 11.7 on Linux.

      When I run version 1, the memory object `out` is filled with zeros. If I change the code to version 2, the same problem occurs.

      I know this is not optimal, as all threads will access the same memory address, but functionally it _should_ work correctly. Even putting the value into `__constant` memory does not change anything (I assume that's what the first version implicitly does). Both versions work with APP 2.4, though. It also works with APP 2.5 if I simply replace `my_complex` with `double2`. That, however, is not a real solution for the more complex cases where I originally ran into this issue. Another way to work around the problem is to define `my_float` as single precision. It makes no difference whether I use `cl_amd_fp64` or `cl_khr_fp64`. Yet another way to get the second variant working is to use the function in the workaround section to load the value from the pointer.

      Is this a regression in APP, or am I doing something stupid here? I really wanted to avoid typedef'ing `double2` as `my_complex`, as otherwise its multiplication operator could accidentally be invoked.
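      To make the typedef concern concrete, here is a minimal sketch in plain host-side C rather than OpenCL C. The names `vec2`, `vec2_mul` and `complex_mul` are made up for illustration: `vec2` stands in for `double2`, and `vec2_mul` mimics the component-wise product that the built-in `*` performs on OpenCL vector types. If `my_complex` were just a typedef of `double2`, writing `a * b` would silently compute the first function instead of the second.

```c
/* Hypothetical host-side stand-in for OpenCL's double2. */
typedef struct { double x, y; } vec2;

/* The struct used in the kernels. */
typedef struct { double re, im; } my_complex;

/* What `a * b` means for a vector type: component-wise product. */
static vec2 vec2_mul(vec2 a, vec2 b)
{
    vec2 r = { a.x * b.x, a.y * b.y };
    return r;
}

/* What a complex product has to compute instead. */
static my_complex complex_mul(my_complex a, my_complex b)
{
    my_complex r = { a.re * b.re - a.im * b.im,
                     a.re * b.im + a.im * b.re };
    return r;
}
```

      For (1 + 2i) and (3 + 4i), the component-wise version yields (3, 8), while the complex product is -5 + 10i. With a distinct struct type the compiler rejects `a * b` outright, which is the protection I would like to keep.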

       

       

      CODE Version 1:

          #pragma OPENCL EXTENSION cl_amd_fp64 : enable

          #define FIELDSIZE 10024

          typedef double my_float;
          typedef struct {
              my_float re;
              my_float im;
          } my_complex;

          /* The value is passed by value as a kernel argument. */
          __kernel void fillComplex(__global my_complex * out, const my_complex value)
          {
              for(size_t i = get_global_id(0); i < FIELDSIZE; i += get_global_size(0)) {
                  out[i] = value;
              }
          }

      CODE Version 2:

          #pragma OPENCL EXTENSION cl_amd_fp64 : enable

          #define FIELDSIZE 10024

          typedef double my_float;
          typedef struct {
              my_float re;
              my_float im;
          } my_complex;

          /* The value is read from global memory through a pointer. */
          __kernel void fillComplex(__global my_complex * out, __global my_complex * value_p)
          {
              const my_complex value = *value_p;
              for(size_t i = get_global_id(0); i < FIELDSIZE; i += get_global_size(0)) {
                  out[i] = value;
              }
          }

      CODE Workaround:

          /* Load the struct as a double2 and reinterpret it through a union. */
          my_complex complexLoadHack(__global const my_complex * p)
          {
              union {
                  double2 v;
                  my_complex c;
              } tmp;
              tmp.v = *((__global const double2*) p);
              return tmp.c;
          }
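
      The workaround only makes sense if `my_complex` and `double2` have the same size and member layout. Below is a rough host-side C sketch of that assumption, not device code. `vec2`, `layouts_match` and `complex_load_via_vec2` are hypothetical names, with `vec2` standing in for `double2` as two packed doubles.

```c
#include <stddef.h>
#include <string.h>

typedef double my_float;
typedef struct { my_float re; my_float im; } my_complex;

/* Hypothetical host-side stand-in for OpenCL's double2 (two packed doubles). */
typedef struct { double s[2]; } vec2;

/* Returns 1 if the struct has the same size as vec2 and no padding
 * before or between its two members, 0 otherwise. */
static int layouts_match(void)
{
    return sizeof(my_complex) == sizeof(vec2)
        && offsetof(my_complex, re) == 0
        && offsetof(my_complex, im) == sizeof(double);
}

/* Same idea as complexLoadHack: read the bytes as the vector type,
 * then hand them back as the struct via a union. */
static my_complex complex_load_via_vec2(const void *p)
{
    union { vec2 v; my_complex c; } tmp;
    memcpy(&tmp.v, p, sizeof tmp.v);
    return tmp.c;
}
```

      One thing this sketch does not capture: in OpenCL C a `double2` must be aligned to 16 bytes, so the pointer cast in the workaround additionally assumes that every `my_complex` in the buffer sits at a 16-byte boundary.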