9 Replies Latest reply on Sep 17, 2015 6:23 AM by dipak

    Queuing a large number of kernels eats up a lot of host memory


      When I run my application, I first queue up around 50 sets of kernels, each set containing around 10 kernels. The queued kernels wait for a user event before beginning. I am finding that simply enqueuing the kernels on the OpenCL command queues eats up around 1.5 GB of host memory, and the memory is not released even after the kernels have executed.


      How can I troubleshoot this issue? And why does the queue eat up so much memory? Each set of kernels waits for a host-to-device transfer of a 9 MB buffer before it executes, but I maintain a pool of these buffers, so only a handful are allocated.