e.g. cl_float4.f32[] vs. cl_float4[]
Hello everybody,
I'd like to ask whether anyone else has tried to run their host code on a platform other than AMD/ATI Stream. It seems that typedef'ing cl_float4 as a union containing f32[] is not done in other vendors' cl_platform.h (namely NVIDIA's), so I had a look at the "official" Khronos cl_platform.h, and it's not in there either!
Take for example AMD's cl_platform.h, line 244:
typedef union cl_float4 { cl_float f32[4]; } cl_float4;
and compare with Khronos' cl_platform.h, line 117:
typedef float cl_float4[4];
Ergo: host code that accesses a cl_float4 through .f32 will not compile on platforms other than ATI Stream, which is regrettable, to say the least.
So:
- If you encountered this, did you find a workaround (maybe some #defines here and there; I tried #define .f32, but that didn't work)?
- Will AMD change this in future releases?
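For illustration, here is one possible workaround sketched in C++: hide the element access behind a small overloaded helper instead of writing .f32 directly. This is only a sketch under assumptions. The helper names f4_data and fill are made up, the two namespaces are stand-ins for the two header variants quoted above (real host code would include the vendor's header and see only one cl_float4), and plain float stands in for cl_float.

```cpp
// Stand-ins for the two cl_platform.h variants quoted above.
// Hypothetical: real code would get exactly one cl_float4 from <CL/cl.h>.
namespace amd_style     { union cl_float4 { float f32[4]; }; }
namespace khronos_style { typedef float cl_float4[4]; }

// Union form: any type exposing an .f32 member.
// As a template, its body is only compiled when this overload is chosen.
template <typename T>
inline float* f4_data(T& v) { return v.f32; }

// Array form: a plain float[4]. For array arguments this non-template
// overload is preferred over the template above, so .f32 is never touched.
inline float* f4_data(float (&v)[4]) { return v; }

// Example use: host code written once against the helper.
template <typename Vec>
inline void fill(Vec& v, float start) {
    float* p = f4_data(v);
    for (int i = 0; i < 4; ++i) {
        p[i] = start + static_cast<float>(i);
    }
}
```

With this, fill(v, 1.0f) compiles whether v is the union or the plain array, because overload resolution picks the matching f4_data. It does not help code that passes cl_float4 by value or returns it from functions, since arrays cannot be copied that way.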
Disclaimer: I'm using the C++ bindings provided by Khronos, and I'm astounded by the significant speedups of even the CPU implementation! Thumbs up for that, AMD/ATI!