I am trying to figure out why the debug and release builds of my app, written in C++ using the OpenCL C++ API, behave differently. Basically, what my app does is:
1. create a few kernels
2. create a set of buffers
3. set kernel arguments
It then executes these kernels in a loop by repeatedly calling clEnqueueNDRangeKernel on a single in-order command queue. (Oddly enough, if I use a new command queue for each command, the outcome changes. The commands are properly synchronized, of course.) There are also some read/write commands interspersed in the queue.
I understand that debug and release builds of a C++ program can behave differently, but the code on the C++ side seems pretty simple. I think I may have missed something in the host code...
Could anyone give me some hints, please? Thanks in advance.