I'm executing a kernel that is invoked multiple times. In Release mode it runs fine, but when I profile it with the Stream Profiler the output is incorrect. Is this a known issue?
If your kernel reads from an input buffer that is also used as an output buffer, the profiler will produce incorrect results (this problem is described in this KB).
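To illustrate why an in-place kernel can be sensitive to this, here is a rough plain-C sketch, not actual profiler or OpenCL code. It assumes (my assumption, not something stated in the KB) that a profiling tool may execute a single dispatch more than once. The function names are hypothetical stand-ins for a kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel that reads and writes the same buffer. */
static void double_in_place(float *buf, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        buf[i] = buf[i] * 2.0f;
    }
}

/* Same computation, but with separate input and output buffers. */
static void double_out_of_place(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] * 2.0f;
    }
}

/* Assumption: models a tool that re-executes one dispatch `passes` times.
   Each pass sees the result of the previous pass, so the value compounds. */
static float replay_in_place(float start, int passes) {
    float buf[1] = { start };
    for (int p = 0; p < passes; ++p) {
        double_in_place(buf, 1);
    }
    return buf[0];
}

/* With a separate output buffer the input is never modified,
   so every pass recomputes the same result. */
static float replay_out_of_place(float start, int passes) {
    float in[1] = { start };
    float out[1] = { 0.0f };
    for (int p = 0; p < passes; ++p) {
        double_out_of_place(in, out, 1);
    }
    return out[0];
}
```

Under that assumption, replay_in_place(3.0f, 3) returns 24.0f while replay_out_of_place(3.0f, 3) returns 6.0f, the same as a single run. If that is what is happening in your case, writing to a separate output buffer while profiling should sidestep it.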
That explains it.