I am running a Barnes-Hut N-body simulation, and if I increase the body count above a certain threshold while running the kernel on the GPU, my computer becomes unresponsive. Is there any remedy for this? As the body count increases, each kernel execution takes longer, but the increase should be logarithmic, not linear. Even so, it seems as though the GPU devotes all its resources to executing my entire batch of commands before allowing any other work.
Someone on the Khronos forum explained to me that this is a known issue and is being looked into. I tried a couple of approaches to avoid the problem:
I broke my clEnqueueNDRangeKernel call into a few smaller calls manually: this didn't help, and in some cases contention was worse.
I tried to use clCreateSubDevices to select a subset of compute units, so as to leave some cores free for other processing: this isn't possible, as the GPU doesn't support creating sub-devices (even though the query for the maximum number of sub-devices returns the same number as the query for the number of compute units), which is understandable.
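
For illustration, the first approach (splitting one launch into several smaller ones) could look roughly like the sketch below. It is not the code from the simulation: `chunk_t` and `plan_chunks` are hypothetical names, and a 1-D NDRange whose global size is a multiple of the work-group size is assumed. The helper only computes the offset and size of each slice; the OpenCL calls that would consume them are shown in a comment because they need a live context and queue.

```c
#include <assert.h>
#include <stddef.h>

/* One slice of a 1-D NDRange: the values that would be passed as
 * global_work_offset and global_work_size to clEnqueueNDRangeKernel. */
typedef struct {
    size_t offset;
    size_t size;
} chunk_t;

/* Hypothetical helper (not part of OpenCL): split `global` work-items into
 * at most `max_chunks` slices, each a whole number of work-groups of
 * `local` items. Assumes `global` is a multiple of `local` and that `out`
 * has room for `max_chunks` entries. Returns the number of slices written. */
size_t plan_chunks(size_t global, size_t local, size_t max_chunks, chunk_t *out)
{
    assert(local > 0 && max_chunks > 0 && global % local == 0);

    size_t groups = global / local;                            /* total work-groups */
    size_t per_chunk = (groups + max_chunks - 1) / max_chunks; /* ceiling division  */
    size_t n = 0;

    for (size_t first = 0; first < groups; first += per_chunk) {
        size_t left = groups - first;
        size_t count = left < per_chunk ? left : per_chunk;
        out[n].offset = first * local;
        out[n].size = count * local;
        n++;
    }
    return n;
}

/* Intended use on the host (assumes an existing queue, kernel and local size):
 *
 *   for (size_t i = 0; i < n; i++) {
 *       clEnqueueNDRangeKernel(queue, kernel, 1, &chunks[i].offset,
 *                              &chunks[i].size, &local, 0, NULL, NULL);
 *       clFinish(queue);   // block until this slice has completed
 *   }
 */
```

A non-NULL global offset requires OpenCL 1.1 or later, and the kernel has to index its data with get_global_id(0) so that each slice works on its own range. Whether the driver actually lets other work run between slices is up to the implementation, which matches the observation above that splitting did not help here.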