1 Reply Latest reply on Jun 23, 2010 7:03 PM by RyFo18

    Which is faster?


      To multiply 100 matrices together, which is faster? One option is to loop 99 times over a kernel that multiplies two matrices. The other is to execute a single kernel once that handles all 100 matrices.

        • Which is faster?

          My guess is that calling the kernel only once is much faster. I have noticed quite a bit of overhead associated with each clEnqueueNDRangeKernel call, so if you can amortize that one kernel call over all the iterations rather than calling the kernel roughly 100 times, you should see a decent speedup.
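
          To put rough numbers on the amortization argument, here is a minimal cost-model sketch in plain C. It is not OpenCL host code and not a benchmark: the function names, and the assumption that every launch costs a fixed overhead on top of the multiply time, are hypothetical and only meant to illustrate the reasoning. It also assumes the single-kernel version does the same arithmetic as the looped one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost model (an assumption, not a measurement):
 * every kernel launch pays a fixed overhead, and every pairwise
 * matrix multiply takes the same amount of compute time. */

/* One launch per multiply: the overhead is paid `multiplies` times. */
double looped_time_us(size_t multiplies, double launch_overhead_us,
                      double multiply_us)
{
    assert(multiplies >= 1);
    return (double)multiplies * (launch_overhead_us + multiply_us);
}

/* One launch for all multiplies: the overhead is paid once. */
double batched_time_us(size_t multiplies, double launch_overhead_us,
                       double multiply_us)
{
    assert(multiplies >= 1);
    return launch_overhead_us + (double)multiplies * multiply_us;
}

/* Time saved by batching; under this model it equals
 * (multiplies - 1) * launch_overhead_us. */
double overhead_saved_us(size_t multiplies, double launch_overhead_us,
                         double multiply_us)
{
    return looped_time_us(multiplies, launch_overhead_us, multiply_us)
         - batched_time_us(multiplies, launch_overhead_us, multiply_us);
}
```

          With made-up figures of 20 us per launch and 50 us per multiply, 99 multiplies come to 6930 us looped versus 4970 us batched under this model. The saving depends only on the launch overhead, so it matters most when the matrices are small and each multiply is cheap; the real overhead has to be measured on your own device and driver.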