
How much memory can be allocated concurrently on a 4GB GPU by multiple processes?

Question asked by titanius on Oct 28, 2014

I have a 295X2 and, for each GPU, I use multiple processes (let's assume >=2) to allocate and work on large memory buffers.

 

The sum of the buffers is less than either of the following two limits:

1. CL_DEVICE_MAX_MEM_ALLOC_SIZE ~2.4GB

2. CL_DEVICE_GLOBAL_MEM_SIZE ~3.2GB

 

If I allocate buffers that total up to 2.4GB (2 processes, 1.2GB each, split across many buffers), the program finishes 10x faster than when the total buffer size is ~3.4GB (2 processes, 1.7GB each, split across many buffers).

 

My Questions:

1. Is there a reason for this slowdown?

2. Do I have to settle for using at most 2.4GB per GPU, even when using multiple processes?

 

Strangely enough, I didn't run into any limits on a 7870 with 2GB memory, where I was able to allocate buffers whose sum approached CL_DEVICE_GLOBAL_MEM_SIZE.

 

 

I am using Debian with Catalyst 14.9 and SDK 2.9.1. I tried searching for answers but couldn't find any.

 

Thanks for reading.
