Hi,
we're working on a real-time image processing application where we want to use an Intel Core as the host CPU, its integrated GPU for display rendering, and an AMD GPU (R9 285) for OpenCL processing. The OS is Windows 8.1, 64-bit.
However, we're seeing massive performance problems when the Intel GPU is the primary display adapter and is used for OpenGL rendering. This happens even when no monitor is attached to the R9. CodeXL's application timeline trace shows that our kernels and memory transfers themselves run fast enough, but the R9 and its command queue sit idle for long stretches between kernels. During these gaps, the OpenCL host thread is blocked inside a call to clFinish. None of this occurs when the R9 is the primary display adapter and is used for rendering. There is no explicit OpenCL/OpenGL interop in our code.
Is there some kind of implicit synchronization between OpenCL and OpenGL going on here? If so, how can we detect and disable that?
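To make "idle in between the kernels" concrete, here is a minimal sketch of how the gaps could be quantified from per-command start/end timestamps, for example the values reported for CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END by clGetEventProfilingInfo on a queue created with profiling enabled. The function name idle_gaps_ns and the sample numbers are purely illustrative assumptions and are not taken from our trace.

```python
from typing import List, Tuple


def idle_gaps_ns(events: List[Tuple[int, int]]) -> List[int]:
    """Return the idle time (ns) before each command after the first.

    `events` holds (start_ns, end_ns) pairs, one per kernel or transfer.
    Overlapping commands yield a gap of 0 rather than a negative value.
    """
    ordered = sorted(events)
    if not ordered:
        return []
    gaps = []
    busy_until = ordered[0][1]  # device is busy until the first command ends
    for start, end in ordered[1:]:
        gaps.append(max(0, start - busy_until))
        busy_until = max(busy_until, end)
    return gaps


# Illustrative timestamps only (hypothetical, not measured values).
sample = [(0, 2_000_000), (2_100_000, 4_000_000), (9_000_000, 11_000_000)]
gaps = idle_gaps_ns(sample)
total_idle_ms = sum(gaps) / 1e6
print(gaps, total_idle_ms)
```

Comparing the total of these gaps between the two configurations (Intel GPU as primary adapter versus R9 as primary adapter) would give a single number for how much queue idle time the display setup adds.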