Heyho,

I need to measure the elapsed time on the GPU so I can terminate a computation.

Is this possible? How?

--

Srdja


If you're talking about measuring time within a kernel, then there is no proper way to measure elapsed time: OpenCL does not provide access to timers. IL/CAL *may* be able to provide some hardware timings (I don't know), but that is not portable and not part of OpenCL per se.

One way you can do it is to have an iteration counter in your kernel that counts up and quits when it reaches a certain limit. This iteration limit can be passed in as a kernel argument. However, to correlate iterations to time, you would have to do a quick test run of your kernel for a predetermined number of iterations and measure the time spent. Then you know that X GPU iterations take Y seconds (say 1000 iterations take 0.1 s), and from that you can extrapolate how many iterations to allow for a given time budget (e.g. ~10000 iterations fit within ~1 s).
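The calibrate-then-extrapolate step above can be sketched like this. This is just an illustration of the arithmetic, not OpenCL code: `fake_kernel` is a hypothetical stand-in for a real kernel launch, and the function names are my own invention.

```python
import time

def calibrate(run_kernel, test_iters=1000):
    """Time a fixed-iteration test run; return seconds per iteration."""
    start = time.perf_counter()
    run_kernel(test_iters)
    elapsed = time.perf_counter() - start
    return elapsed / test_iters

def iterations_for(budget_seconds, secs_per_iter):
    """Extrapolate how many iterations fit in the given time budget."""
    return int(budget_seconds / secs_per_iter)

# Hypothetical stand-in for launching a real kernel:
# pretend each iteration costs 0.1 ms.
def fake_kernel(iters):
    time.sleep(iters * 0.0001)

secs_per_iter = calibrate(fake_kernel)          # ~0.0001 s per iteration
limit = iterations_for(1.0, secs_per_iter)      # iteration limit for ~1 s
```

You would then pass `limit` to the real kernel as its iteration-limit argument.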

Furthermore, you can continuously refine this measurement as you go along for added accuracy. If you do this, remember to account for kernel launch overhead: either run a long kernel to minimise the overhead's contribution, or run the kernel several times with different running times and separate the fixed overhead from the per-iteration cost using the gradient of a linear fit.
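The "extract the overhead from a gradient" idea amounts to fitting a line time = overhead + cost_per_iteration × iterations over several runs: the gradient is the per-iteration cost and the intercept is the launch overhead. A minimal sketch, using synthetic timings in place of real measurements:

```python
def fit_overhead(samples):
    """Least-squares line fit: time = overhead + slope * iterations.

    samples: (iterations, measured_seconds) pairs from runs of the same
    kernel with different iteration limits.  The intercept is the
    per-launch overhead; the slope (gradient) is the cost per iteration.
    """
    n = len(samples)
    sx = sum(x for x, _ in samples)
    sy = sum(y for _, y in samples)
    sxx = sum(x * x for x, _ in samples)
    sxy = sum(x * y for x, y in samples)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    overhead = (sy - slope * sx) / n
    return overhead, slope

# Synthetic timings: 2 ms launch overhead plus 0.1 ms per iteration.
data = [(n, 0.002 + n * 0.0001) for n in (1000, 2000, 5000, 10000)]
overhead, per_iter = fit_overhead(data)
```

With real measurements the points will be noisy, but the fit still averages the overhead out across runs.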
