Hello everybody,

I'd like to ask whether anyone else has tried to run their host code on a platform other than AMD/ATI-Stream. It seems that typedefing cl_float4 as a union containing f32[] is not done in other vendors' cl_platform.h (namely NVIDIA's), so I had a look at the "official" Khronos cl_platform.h -- it's not in there, either!

Take for example AMD's cl_platform.h, line 244:


typedef union cl_float4   { cl_float  f32[4]; }  cl_float4;

and compare with Khronos' cl_platform.h, line 117:


typedef float           cl_float4[4];

Ergo: host code that accesses cl_float4 through the .f32 member will not compile on platforms other than ATI-Stream, which is regrettable, to say the least.



  • If you encountered this, did you find a workaround? (Maybe some #defines here and there; I tried to #define .f32 away, but that didn't work.)
  • Will AMD change this in future releases?
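
For reference, here is a minimal sketch of one workaround that might be worth trying. It assumes both layouts start with four contiguous floats, which holds for the two typedefs quoted above. The names amd_float4, khr_float4, F4_PTR, F4_SET and f4_sum are hypothetical and not part of any OpenCL header. The two typedefs are renamed stand-ins so that both fit in one file. In real host code there would only be the cl_float4 from whichever cl_platform.h is installed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the two header variants quoted above, renamed so that both
   can coexist in one file. Real host code would only see cl_float4. */
typedef float cl_float;
typedef union amd_float4 { cl_float f32[4]; } amd_float4; /* AMD-style layout */
typedef cl_float khr_float4[4];                           /* Khronos-style layout */

/* Hypothetical helper: view a 4-float vector as a plain cl_float pointer.
   It never names the .f32 member, so it compiles for either layout. */
#define F4_PTR(v) ((cl_float *)&(v))

/* Hypothetical helper: set all four components through F4_PTR. */
#define F4_SET(v, a, b, c, d)          \
    do {                               \
        cl_float *p_ = F4_PTR(v);      \
        p_[0] = (a); p_[1] = (b);      \
        p_[2] = (c); p_[3] = (d);      \
    } while (0)

/* Sum the four components, given the pointer produced by F4_PTR. */
static cl_float f4_sum(const cl_float *p)
{
    assert(p != NULL);
    return p[0] + p[1] + p[2] + p[3];
}
```

With this, host code would write F4_PTR(v)[i] instead of v.f32[i] or v[i], so the same source should build against either header. Whether it also covers other vendors' definitions of cl_float4 is something I have not checked.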

Disclaimer: I'm using the C++ bindings provided by Khronos, and I'm astounded by the significant speedups of even the CPU implementation! Thumbs up for that, AMD/ATI!