3 Replies Latest reply on Jun 15, 2012 3:01 PM by ganadineroxint

    Kernel execution time reduced using multiple iterations

    uvedale

      Hi,

       

      I stumbled upon an optimisation that I don't quite understand, and was hoping somebody could shed some light on it.

       

      I have two OpenCL kernels, model_setup and model_run. The model_run kernel runs a Monte Carlo-style simulation over a large number of different parameter sets. One of the selling points of OpenCL, if I'm not mistaken, is that you can hand it a huge amount of work and it will figure out how to schedule it efficiently. However, I measured a sizable performance increase (around 11%) from enqueueing the kernels several times, each with a smaller NDRange, instead of enqueueing them once over all the data. The work-group size was the same in both cases.
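
      To make the comparison concrete, here is a minimal sketch of the kind of batching described above, assuming a 1-D NDRange whose total size is a multiple of the work-group size. The names batch_t, round_up and split_ndrange are hypothetical and only for illustration; they are not taken from the actual host code. The sketch only computes the offset and size of each slice; the OpenCL call that would consume them is shown in a comment.

```c
#include <assert.h>
#include <stddef.h>

/* One slice of a 1-D NDRange: where it starts and how many work-items it covers. */
typedef struct {
    size_t offset;
    size_t size;
} batch_t;

/* Round n up to the next multiple of m (m must be > 0). */
size_t round_up(size_t n, size_t m)
{
    return ((n + m - 1) / m) * m;
}

/*
 * Split `total` work-items into at most `num_batches` slices.
 * Every slice size is a multiple of `local_size`, so the work-group size
 * can stay the same for each enqueue. Assumes `total` is itself a multiple
 * of `local_size` and that `out` has room for `num_batches` entries.
 * Returns the number of slices written to `out`.
 */
size_t split_ndrange(size_t total, size_t local_size,
                     size_t num_batches, batch_t *out)
{
    assert(local_size > 0 && num_batches > 0);
    assert(total % local_size == 0);

    /* Even share per batch, rounded up to a whole number of work-groups. */
    size_t per_batch = round_up((total + num_batches - 1) / num_batches,
                                local_size);
    size_t count = 0;

    for (size_t offset = 0; offset < total; offset += per_batch) {
        size_t remaining = total - offset;
        out[count].offset = offset;
        out[count].size = remaining < per_batch ? remaining : per_batch;
        count++;
    }
    return count;
}

/*
 * Each slice b would then be enqueued separately, for example:
 *
 *   clEnqueueNDRangeKernel(queue, kernel, 1,
 *                          &b.offset, &b.size, &local_size,
 *                          0, NULL, NULL);
 *
 * Passing a non-NULL global_work_offset requires OpenCL 1.1 or later;
 * on OpenCL 1.0 the offset would have to be passed as a kernel argument.
 */
```

      With total = 1024, local_size = 64 and num_batches = 4 this yields four slices of 256 work-items each, versus a single enqueue covering all 1024.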

       

      Any ideas?

       

      Dale