Large submit->start delay in OpenCL kernel on AMD APU (running on GPU)

Question asked by jmaxg3 on Nov 28, 2013
Latest reply on Nov 28, 2013 by jmaxg3

I'm seeing massive delays between a kernel being submitted to an AMD GPU and it actually starting to execute. My program does blocking writes and reads (with blocking=CL_TRUE) to ensure that I/O isn't interfering with the kernel. I then use clGetEventProfilingInfo to get the queued, submit, start and end timestamps for each command. The data (and code) below shows that the kernel spends about 3.3 seconds in the submitted state and then about 3.2 seconds running. In general, it looks like the submitted time scales with the running time. I've looked at a number of forum posts about delays in kernel execution, but there doesn't seem to be a resolution in them. I've checked that the GPU is not in low-power mode. Has anyone else seen this, or does anyone have suggestions for how to diagnose it?

Full code is at [C] #include <stdio.h>  #include <stdlib.h>  #include <CL/cl.h>  #include <sys/timeb -

Sample run:

write: queued 0.000000 submit 0.023312 start 3296.444778 end 3335.371268 | submitted 3296.421466 running 38.926490
exec: queued 0.021067 submit 78.494703 start 3335.371268 end 6529.140138 | submitted 3256.876565 running 3193.768870
read: queued 0.024849 submit 79.085042 start 6529.140158 end 6578.664028 | submitted 6450.055116 running 49.523870
Overall 6583.000000 ms
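
For reference, here is a minimal sketch of how the "submitted" and "running" columns in the sample run relate to the profiling timestamps. It is not the full program: `intervals_t` and `event_intervals` are names made up for this illustration, and it assumes the three timestamps have already been read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START, CL_PROFILING_COMMAND_END), which reports them in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Intervals, in milliseconds, derived from one event's profiling timestamps. */
typedef struct {
    double submitted_ms; /* SUBMIT -> START: handed to the device but not yet running */
    double running_ms;   /* START -> END: actually executing */
} intervals_t;

/* Hypothetical helper (not from the original program).
 * Inputs are nanosecond timestamps such as the cl_ulong values returned by
 * clGetEventProfilingInfo for CL_PROFILING_COMMAND_SUBMIT / _START / _END. */
intervals_t event_intervals(uint64_t submit_ns, uint64_t start_ns, uint64_t end_ns)
{
    /* Assumption: timestamps are ordered; otherwise the unsigned
     * subtractions below would wrap around. */
    assert(submit_ns <= start_ns && start_ns <= end_ns);

    intervals_t iv;
    iv.submitted_ms = (double)(start_ns - submit_ns) / 1e6; /* ns -> ms */
    iv.running_ms   = (double)(end_ns - start_ns) / 1e6;    /* ns -> ms */
    return iv;
}
```

Feeding in the exec row (submit 78.494703 ms, start 3335.371268 ms, end 6529.140138 ms, converted to nanoseconds) gives roughly 3256.88 ms submitted and 3193.77 ms running, which matches the columns after the `|`.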