OpenCL on Windows too slow?

Discussion created by syoyofujita on Sep 11, 2009
I've written a very simple OpenCL kernel that fills pixels based on the work-item ID.

This kernel runs terribly slowly with the ATI Stream SDK 2.0 beta on Windows Vista 64-bit.

It takes about 8 seconds to execute, which I find hard to believe. By contrast, Snow Leopard executes the same kernel within 0.0001 seconds.

Does anyone know why it is so slow on Windows?

More is available at the following site.


__kernel void main(__global uint *out,
                   uint col) // not used.
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    out[x + y * get_global_size(0)] = (uint)(x | (y << 8) | (255 << 16) | (255 << 24));
}
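For anyone who wants to sanity-check the output buffer on either platform, here is a rough plain-C sketch of what the kernel should write, assuming a 2D global range of width x height with one work-item per pixel. The names pack_pixel and fill_reference are made up for illustration and are not part of the original code; the constants use unsigned literals so the shifts are well defined on the host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack one pixel the way the kernel does.
   x is OR-ed into the low bits, y is shifted left by 8,
   and 0xFF is placed in bits 16-23 and bits 24-31. */
static uint32_t pack_pixel(uint32_t x, uint32_t y)
{
    return x | (y << 8) | (255u << 16) | (255u << 24);
}

/* Hypothetical CPU stand-in for the 2D global range:
   one loop iteration per work-item, with width playing
   the role of get_global_size(0). */
static void fill_reference(uint32_t *out, size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            out[x + y * width] = pack_pixel((uint32_t)x, (uint32_t)y);
        }
    }
}
```

Comparing the buffer read back from the device against fill_reference for the same width and height would at least confirm that both platforms compute the same pixels, so the question is purely about where the time goes.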