My raytracer supports hybrid GPU-CPU execution.
When running on the CPU only, I can vary the number of threads with the CPU_MAX_COMPUTE_UNITS variable. Performance scales reasonably well up to 32 Opteron 6272 (Interlagos) cores: I see a 10x speedup at 16 cores and 19x at 32.
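To put those speedups in perspective, here is a quick sketch that turns them into parallel efficiency (speedup divided by core count). The function name is just for illustration; the only inputs are the two measurements quoted above.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores

# Measurements quoted above: (cores, observed speedup)
measurements = [(16, 10.0), (32, 19.0)]

for cores, speedup in measurements:
    eff = parallel_efficiency(speedup, cores)
    print(f"{cores} cores: {speedup:.0f}x speedup, {eff:.1%} efficiency")
```

Efficiency is about 62% at 16 cores and about 59% at 32, so the CPU-only scaling is sublinear but fairly steady across that range.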
The machine I'm running this on has 32 Opteron 6272 cores and 3 Radeon 7970s. On the 7970s, performance scales virtually linearly with the number of GPUs.
When I hybridize the resources, I see modest performance improvement when using 16 cores + the 3 GPUs over just the 3 GPUs. However, the moment I increase CPU_MAX_COMPUTE_UNITS to 17 or higher, performance drops like a colorful euphemism and the overall algorithm is significantly slower than using just the 3 GPUs.
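To pin down exactly where the drop happens, a small sweep over CPU_MAX_COMPUTE_UNITS can help. This is only a sketch: it assumes the variable is picked up from the environment at process start, and the `./raytracer` command in the usage note is a hypothetical stand-in for the real binary.

```python
import os
import subprocess
import time

def run_with_cpu_units(cmd, units):
    """Run cmd with CPU_MAX_COMPUTE_UNITS set; return wall-clock seconds."""
    env = dict(os.environ, CPU_MAX_COMPUTE_UNITS=str(units))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)  # raises if the command fails
    return time.perf_counter() - start

def sweep(cmd, unit_counts):
    """Time cmd once per setting; returns {units: seconds}."""
    return {units: run_with_cpu_units(cmd, units) for units in unit_counts}
```

For example, `sweep(["./raytracer", "scene.txt"], range(14, 21))` (hypothetical command and arguments) would time the settings on either side of the 16/17 boundary. Repeating each run a few times would give more trustworthy numbers than a single pass.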
I'm trying to account for why this might be the case. It appears that APP spawns 2 threads for every CPU core plus 2 more, and another 2 for every GPU. Thus, with 3 GPUs and 32 cores, I'm seeing 73 threads (including the main thread). I suspect I'm getting contention that delays the GPUs from executing. How would I go about checking this? Also, why are there 2 threads per core instead of 1? I can understand having a few extra threads for asynchronous copies and such, but 2 per core sounds excessive.
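One rough way to look for contention, assuming the box runs Linux (the thread doesn't say), is to read the per-thread status files under /proc and add up the context-switch counters. A large and growing nonvoluntary count would suggest threads are being preempted rather than blocking on their own. The helper names below are made up for this sketch.

```python
from pathlib import Path

def parse_status(text):
    """Parse 'Key:<tab>value' lines from a /proc status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def thread_report(pid):
    """Return (threads, voluntary, nonvoluntary) summed over all threads of pid.

    Linux only: reads /proc/<pid>/task/<tid>/status for every thread.
    """
    threads = voluntary = nonvoluntary = 0
    for tid_dir in Path(f"/proc/{pid}/task").iterdir():
        try:
            fields = parse_status((tid_dir / "status").read_text())
        except OSError:
            continue  # thread exited while we were reading
        threads += 1
        voluntary += int(fields.get("voluntary_ctxt_switches", 0))
        nonvoluntary += int(fields.get("nonvoluntary_ctxt_switches", 0))
    return threads, voluntary, nonvoluntary
```

Calling `thread_report(pid)` on the raytracer's process ID at 16 and again at 17 compute units, and comparing the nonvoluntary totals, would show whether preemption jumps at the same point performance does.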
There are 16 Bulldozer modules in this box, each with two integer cores, so it can run 32 threads concurrently. However, APP spawns 2 * numDevices threads for device management (copying data and such), and when the CPU is used as a compute device it spawns another 2 * numCores threads. With everything in the system enabled, that comes to 73 threads, including the main thread. I'm confused about why it spawns 2 threads per core instead of 1: unless one of each pair is blocked on a condition variable or sleeping, the two will just contend for resources.
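As a sanity check on the arithmetic, here is the thread count written out. It assumes the CPU itself counts as one of the devices when it is used for compute; that is my guess, not something confirmed here, but it is what makes the total come out to 73.

```python
def expected_threads(num_cpu_cores, num_gpus, cpu_is_device=True):
    """Thread count under the 2*numDevices + 2*numCores + main-thread model."""
    # Assumption: the CPU counts as one device when used for compute.
    num_devices = num_gpus + (1 if cpu_is_device else 0)
    management = 2 * num_devices  # device-management threads
    workers = 2 * num_cpu_cores if cpu_is_device else 0  # CPU compute threads
    return management + workers + 1  # +1 for the main thread

print(expected_threads(32, 3))                       # 3 GPUs + 32 cores
print(expected_threads(16, 3))                       # 3 GPUs + 16 cores
print(expected_threads(0, 3, cpu_is_device=False))   # GPUs only
```

Under this model the full system gives 2 * 4 + 2 * 32 + 1 = 73 threads, the 16-core hybrid run gives 41, and the GPU-only run gives 7, so only the full configuration has far more threads than the 32 the hardware can run at once.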
Since you haven't gotten very many responses on this forum, try:
* posting on the technology forum: http://forums.amd.com/game/categories.cfm?catid=433&entercat=y
* contacting AMD support, either by email (http://emailcustomercare.amd.com/) or by phone (http://support.amd.com/us/contacts/Pages/global-technical-support.aspx)
Let me know if you still don't get an answer to your question after trying these two resources.