kernel overhead too high (again)

Discussion created by sourcery on Oct 15, 2011
Latest reply on Oct 19, 2011 by sourcery

I am running a modified templatec.cpp together with the APP profiler.

It calls my own version of runkernels, which writes a small data area, runs a kernel, and reads back 8000 bytes.

All the cl_mem buffers, data areas and kernel arguments are set up beforehand.

This is running on an HD6850 with APP.

Calling the runkernels code 1300 times, the profiler gives me a write buffer time of around 0.08 milliseconds, a kernel time of around 0.3 milliseconds, and a read buffer time of around 0.16 milliseconds (per iteration).

Yet 1300 calls take 54.0 seconds!
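To put numbers on the gap, here is a minimal sketch of the arithmetic using only the figures quoted above. The constant and function names are made up for illustration and are not part of templatec.cpp or the APP SDK.

```cpp
#include <cassert>

// Figures quoted in the post. Names are hypothetical, for illustration only.
constexpr double kWriteMs     = 0.08;  // write buffer time per iteration (ms)
constexpr double kKernelMs    = 0.3;   // kernel time per iteration (ms)
constexpr double kReadMs      = 0.16;  // read buffer time per iteration (ms)
constexpr int    kCalls       = 1300;  // number of runkernels calls
constexpr double kWallSeconds = 54.0;  // observed total wall-clock time (s)

// Profiler-reported time for one iteration, in milliseconds.
constexpr double profiled_ms_per_call() {
    return kWriteMs + kKernelMs + kReadMs;
}

// Profiler-reported time summed over all calls, in seconds.
constexpr double profiled_total_seconds() {
    return profiled_ms_per_call() * kCalls / 1000.0;
}

// Observed wall-clock time per call, in milliseconds.
double wall_ms_per_call(double wall_seconds, int calls) {
    assert(calls > 0);
    return wall_seconds * 1000.0 / calls;
}

// Time per call that the three profiled stages do not account for (ms).
double unaccounted_ms_per_call() {
    return wall_ms_per_call(kWallSeconds, kCalls) - profiled_ms_per_call();
}
```

With these figures the profiled work comes to about 0.54 ms per call, or roughly 0.7 seconds over 1300 calls, while the observed wall-clock time is about 41.5 ms per call. That leaves around 41 ms per call that the profiler timings do not explain.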

Is it not possible to use OpenCL with short-running kernels?