Originally posted by: bpurnomo
When you use the profiler, do you see one or more CreateBuffer (or CreateImage) API calls with N/A timings?
It depends. If I use clCreateBuffer and copy the pointer, then yes.
If I don't do this and instead use a non-blocking clEnqueueWriteBuffer, then no; I see WriteBufferAsynch.
If I use a blocking clEnqueueWriteBuffer, I see WriteBuffer.
All three of these approaches show the timing variation with the samples at the 2k x 2k problem size: DCT (6.x ms up to 11.x ms), Mersenne Twister (14.x ms up to 18.x ms), Black Scholes (6.x ms up to 11.x ms).
For the DCT, for example, if I go up to a 4k x 4k problem size, the timings become much more stable and I no longer see this fluctuation (the variation is in the .0x ms range, not the x.xx ms range).
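In case it helps to compare runs on the same footing, here is a rough sketch in C of how the run-to-run spread could be summarized from repeated timings. The names timing_summary and summarize_timings are made up for illustration and are not part of any SDK; the sketch assumes the per-run times have already been collected in milliseconds (from the profiler or from any other timer).

```c
#include <stddef.h>

/* Hypothetical helper type: summary of repeated timings, in milliseconds. */
typedef struct {
    double min_ms;    /* fastest run */
    double max_ms;    /* slowest run */
    double mean_ms;   /* arithmetic mean of all runs */
    double spread_ms; /* max_ms - min_ms, i.e. the size of the fluctuation */
} timing_summary;

/* Summarize n timing samples. Returns an all-zero summary when there is
 * nothing to summarize (NULL pointer or n == 0). */
timing_summary summarize_timings(const double *samples_ms, size_t n)
{
    timing_summary s = {0.0, 0.0, 0.0, 0.0};
    double sum = 0.0;
    size_t i;

    if (samples_ms == NULL || n == 0) {
        return s;
    }

    s.min_ms = samples_ms[0];
    s.max_ms = samples_ms[0];
    for (i = 0; i < n; ++i) {
        if (samples_ms[i] < s.min_ms) s.min_ms = samples_ms[i];
        if (samples_ms[i] > s.max_ms) s.max_ms = samples_ms[i];
        sum += samples_ms[i];
    }

    s.mean_ms = sum / (double)n;
    s.spread_ms = s.max_ms - s.min_ms;
    return s;
}
```

Filling an array with the time of each run and passing it to summarize_timings gives one spread_ms value per sample and problem size, which makes it easier to put the 2k x 2k and 4k x 4k cases side by side.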
Originally posted by: bpurnomo
Please post your feedback here.