Usually, if I have three GPUs, I'll use two of them to render using CrossFire, and the last one will be used for OpenCL calculations. So it's possible to distribute work so that part of each frame is rendered on each device, and then aggregate the results using CrossFire.
I beg to differ.
In most cases, CrossFire works by distributing whole frames. For example, in a two-GPU scenario, all even-numbered frames are rendered on GPU 0 and all odd-numbered frames are rendered on GPU 1.
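To make the frame distribution concrete, here is a minimal sketch of the even/odd assignment described above. The function name `afr_gpu_for_frame` is hypothetical, and extending the rule round-robin to more than two GPUs is an assumption on my part; the driver decides the real scheduling.

```c
#include <assert.h>

/* Hypothetical helper: which GPU renders a given whole frame when
 * frames are dealt out in turn. With num_gpus == 2 this gives
 * even frames -> GPU 0, odd frames -> GPU 1.
 * Assumption: simple round-robin for more than two GPUs. */
int afr_gpu_for_frame(unsigned frame, unsigned num_gpus)
{
    assert(num_gpus > 0);
    return (int)(frame % num_gpus);
}
```

Note that no single frame is split across devices in this scheme; each GPU gets complete frames.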
Currently, we do not support OpenGL (OGL) interop with multi-device contexts.
However, you do not need OGL to maximize parallelism.
What you need to do is remove CPU-GPU synchronization points. Be sure to enqueue and flush the NDRange commands of frame N+1 before waiting for the results of frame N, so the GPU always has work queued while the host is blocked.
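The ordering can be sketched roughly as follows. This is only a host-side model of the loop, not real OpenCL code: `submit_frame` and `wait_frame` are hypothetical stand-ins, and the comments say which OpenCL calls I assume they would wrap (`clEnqueueNDRangeKernel` plus `clFlush` for the submit, `clWaitForEvents` or a blocking read for the wait). The `trace_t` type exists only so the resulting order can be inspected.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 64

/* Records the order of host-side actions (illustration only). */
typedef struct {
    char kind[MAX_STEPS];   /* 'S' = submit, 'W' = wait */
    int  frame[MAX_STEPS];
    int  count;
} trace_t;

static void record(trace_t *t, char kind, int frame)
{
    assert(t != NULL);
    if (t->count < MAX_STEPS) {
        t->kind[t->count]  = kind;
        t->frame[t->count] = frame;
        t->count++;
    }
}

/* Stand-in for: clEnqueueNDRangeKernel(...) for this frame,
 * followed by clFlush(queue) so the commands are actually issued. */
void submit_frame(trace_t *t, int frame) { record(t, 'S', frame); }

/* Stand-in for: clWaitForEvents(1, &event_of_frame) or a blocking
 * clEnqueueReadBuffer of that frame's results. */
void wait_frame(trace_t *t, int frame) { record(t, 'W', frame); }

/* Pipelined loop: frame n+1 is submitted and flushed before the
 * host blocks on frame n, so the device queue is not left empty
 * during the wait. */
void run_pipelined(trace_t *t, int num_frames)
{
    if (num_frames <= 0)
        return;
    submit_frame(t, 0);                 /* prime the pipeline */
    for (int n = 0; n < num_frames; ++n) {
        if (n + 1 < num_frames)
            submit_frame(t, n + 1);     /* next frame goes out first */
        wait_frame(t, n);               /* only then block on frame n */
    }
}
```

For three frames this model records submit 0, submit 1, wait 0, submit 2, wait 1, wait 2. In a real application each frame in flight would need its own output buffer (or events guarding buffer reuse), which this sketch leaves out.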
You can profile GPU utilization by using AMD APP Profiler or Microsoft GPUView.