I am trying to find out the maximum number of 128MB objects I can create in my GPU's device memory, but at the moment the program gives me no errors when I call cl_mem buffer1 = clCreateBuffer(), cl_mem buffer2 = clCreateBuffer() ... 12 consecutive times. Why is that?
If I understand correctly, there is 1GB of device memory available, so the call should return some kind of error after the 8th call, shouldn't it?
Thank you for any help!
Thank you for the quick reply.
My CL_DEVICE_GLOBAL_MEM_SIZE shows 512MB, which is less than the 1GB available on my card, but even so it shouldn't allow me to create more than four 128MB buffers within the same context...
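For reference, the arithmetic behind the "8th call" and "four buffers" expectations can be sketched in plain C. This is only a back-of-the-envelope calculation: the function name max_resident_buffers is made up for illustration, the sizes are passed in by hand (in a real program they would come from clGetDeviceInfo with CL_DEVICE_GLOBAL_MEM_SIZE), and it ignores whatever memory the driver or other programs already hold.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (UINT64_C(1024) * 1024)

/* Hypothetical helper: how many buffers of `buffer_bytes` could be resident
   at once in `global_mem_bytes`, assuming nothing else uses the memory. */
uint64_t max_resident_buffers(uint64_t global_mem_bytes, uint64_t buffer_bytes)
{
    assert(buffer_bytes > 0);  /* avoid division by zero */
    return global_mem_bytes / buffer_bytes;
}
```

With 1GB this gives 8 buffers of 128MB, and with the reported 512MB it gives 4. It says nothing about when the implementation actually reports a failure.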
Imagine that you create a context with two or more devices.
Then you create a buffer on this context. On which device should it be created? That's why you can create more buffers than there is device memory.
nou is correct. Calling clCreateBuffer does not guarantee that memory is allocated on the GPU. Only the buffers that a given GPU actually needs will be allocated later. I think (not sure) you will get an appropriate error from clEnqueueNDRangeKernel when you try to use those buffers.
Currently you can create an arbitrary number of buffers. I experimented with this: I wrote sample code that creates 10 buffers of 128MB each, then enqueues a simple kernel that writes zeroes into each buffer. The enqueue for the fifth buffer returned an out of resources error.
But IMHO an OpenCL implementation should swap buffers out of device memory when device memory is full, and return an out of resources error only when the buffers needed for a kernel execution do not all fit into device memory.
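The behaviour described in this thread (creation always succeeds, the failure shows up on first use) can be mimicked with a small toy model in plain C. This is not OpenCL code and not how any particular driver is implemented. The toy_device and toy_buffer types and all function names are invented for illustration, and the 512MB capacity is simply the figure reported earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (UINT64_C(1024) * 1024)

/* Hypothetical stand-ins, not OpenCL types. */
typedef struct {
    uint64_t capacity;  /* total device memory in bytes */
    uint64_t used;      /* bytes backed by real allocations so far */
} toy_device;

typedef struct {
    uint64_t size;      /* requested size in bytes */
    bool resident;      /* true once device memory has been committed */
} toy_buffer;

/* "Creating" a buffer only records the request; no device memory is taken. */
toy_buffer toy_create_buffer(uint64_t size)
{
    assert(size > 0);
    toy_buffer b = { size, false };
    return b;
}

/* First use commits memory on the device; fails if the buffer does not fit. */
bool toy_use_buffer(toy_device *dev, toy_buffer *buf)
{
    if (buf->resident)
        return true;
    if (dev->capacity - dev->used < buf->size)
        return false;  /* stands in for an out of resources error */
    dev->used += buf->size;
    buf->resident = true;
    return true;
}

/* Create and use `count` equal-sized buffers one after another. Returns the
   1-based index of the first buffer whose use fails, or 0 if all of them fit. */
size_t toy_first_failing_use(uint64_t capacity, uint64_t size, size_t count)
{
    toy_device dev = { capacity, 0 };
    for (size_t i = 0; i < count; ++i) {
        toy_buffer b = toy_create_buffer(size);
        if (!toy_use_buffer(&dev, &b))
            return i + 1;
    }
    return 0;
}
```

Under these assumptions, toy_first_failing_use(512 * MiB, 128 * MiB, 10) reports the fifth buffer, which matches the experiment above. The model never evicts anything, so it does not cover the swapping behaviour suggested in the last paragraph.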
Thank you himanshu and nou, I totally missed the point that a context can be associated with multiple devices.
So does OpenCL guarantee the execution of clEnqueueWriteBuffer as long as all the requirements are met (i.e. in my case, 128MB max per buffer with a total data input of less than 512MB)? What if some other program is using the GPU memory at the same time?
In order to write a 128MB memory buffer into device memory, there must be 128MB or more of CONTIGUOUS memory space available. And with other programs interfering, the free space may not be evenly distributed (e.g. not enough for 4 x 128MB, but 3 x 128MB + 2 x 64MB is still possible). How does OpenCL behave in this case?
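To make the fragmentation scenario concrete, here is a toy first-fit allocator in plain C. It only illustrates the case described above, where the same total free space holds fewer 128MB blocks once it is split up. Whether a real driver needs contiguous memory at all is implementation-specific, and the function names and the free-region array are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First fit: carve `request` out of the first free region large enough.
   Region sizes are in MiB. Returns 1 on success, 0 if no single region
   can hold the request. */
int first_fit_alloc(uint64_t *free_regions, size_t n, uint64_t request)
{
    assert(request > 0);
    for (size_t i = 0; i < n; ++i) {
        if (free_regions[i] >= request) {
            free_regions[i] -= request;
            return 1;
        }
    }
    return 0;
}

/* Count how many contiguous allocations of `request` MiB succeed.
   Modifies `free_regions` as space is consumed. */
size_t count_contiguous_allocs(uint64_t *free_regions, size_t n, uint64_t request)
{
    size_t count = 0;
    while (first_fit_alloc(free_regions, n, request))
        ++count;
    return count;
}
```

With free regions of {128, 128, 128, 64, 64} MiB only three 128MB requests succeed, while a single 512 MiB region takes four, even though both add up to 512MB.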
Thanks again for your help.