
Different behaviors when device has reached its maximum global memory limit

Question asked by roboto on Mar 20, 2014
Latest reply on Apr 4, 2014 by roboto



I have an HD 7970 with 6 GB of memory, and I have an OpenCL program (.exe) that takes up about 3.3 GB of global memory.


Two odd behaviors:

First, I'll call the OpenCL program I want to start myOpenCLApp.exe.


Behavior 1:

I am able to start two instances of myOpenCLApp.exe and successfully get output from both by spawning each one as a process via CreateProcessA from the Windows API. This is odd, since running two instances of myOpenCLApp.exe exceeds my global memory limit by ~600 MB. I observe the same behavior on other GPU devices with less global memory, e.g. NVIDIA. Using my tools, I see throttling of GPU activity, and the execution time slows down considerably.

I create the processes like this:

CreateProcessA("myOpenCLApp.exe", NULL, NULL, NULL,false, 0, NULL, NULL, &sinfo, &pinfo);

CreateProcessA("myOpenCLApp.exe", NULL, NULL, NULL,false, 0, NULL, NULL, &sinfo, &pinfo);

Total memory for both processes: 6.63 GB. This is puzzling, as I expected the second call to CreateProcessA to fail. However, what seems to happen is that both processes run just fine, only really slowly.
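To make the numbers concrete, here is a minimal sketch of the arithmetic behind the ~600 MB figure. The helper name overcommit_mb is hypothetical, and it assumes decimal units (1 GB = 1000 MB) and that each process needs half of the 6.63 GB total, i.e. about 3315 MB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of any API): returns by how many MB the
// combined working sets exceed the device capacity, or 0 if they fit.
// Assumes every process needs the same amount of global memory.
std::uint64_t overcommit_mb(std::uint64_t per_process_mb,
                            unsigned process_count,
                            std::uint64_t device_mb) {
    assert(process_count > 0);
    const std::uint64_t total_mb = per_process_mb * process_count;
    return total_mb > device_mb ? total_mb - device_mb : 0;
}
```

With the figures above, overcommit_mb(3315, 2, 6000) gives 630, which matches the roughly 600 MB by which the two processes together exceed the 6 GB on the card.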


What is this behavior I'm seeing, and where can I find more info on it? I have not seen much material online about this.


Behavior 2:

For this I used the command line:

start /b myOpenCLApp.exe


and again

start /b myOpenCLApp.exe


This gives me a CL_MEM_OBJECT_ALLOCATION_FAILURE error when I try to start the second process.
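For anyone trying to reproduce this, it may help to log the name of the failing status code at each OpenCL call, so it is clear which call reports the failure. The sketch below is only an illustration: cl_error_name is a hypothetical helper, and the numeric values are written out by hand on the assumption that they match the constants in the Khronos CL/cl.h header (where CL_MEM_OBJECT_ALLOCATION_FAILURE is -4).

```cpp
#include <string>

// Hypothetical logging helper: maps a few OpenCL status codes to their names.
// The numeric values are assumed to mirror those defined in CL/cl.h.
std::string cl_error_name(int code) {
    switch (code) {
        case 0:  return "CL_SUCCESS";
        case -4: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5: return "CL_OUT_OF_RESOURCES";
        case -6: return "CL_OUT_OF_HOST_MEMORY";
        default: return "unknown OpenCL error (" + std::to_string(code) + ")";
    }
}
```

Printing cl_error_name(err) after buffer creation and after each enqueue call would show whether the allocation failure surfaces when the buffer is created or only when it is first used.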

I would expect to see this same error when starting the processes the way I did for Behavior 1.

What's going on here?


Is it that the processes in Behavior 1 share one common parent, while Behavior 2 is somehow different?


I also observe Behavior 1's pattern when using multiple threads under one common parent.


Please let me know if I missed something obvious.