inducer77
Adept II

Incorrect error messages for out-of-memory in overcommitted situation

Hi there,

OpenCL allows buffer allocations to succeed even when nominally all available memory is exhausted. This memory is only required to be available when commands/kernels are enqueued using these buffers. If that is not the case, the correct error to return is CL_MEM_OBJECT_ALLOCATION_FAILURE. It seems that the 13.1 drivers (at least on my "Devastator" APU) return CL_OUT_OF_RESOURCES. This is confusing and appears to be non-compliant. It'd be great if this could be fixed.

Thanks,

Andreas

0 Likes
9 Replies
himanshu_gautam
Grandmaster

Hi inducer77,

Can you please share a test case that can help us reproduce the issue?

Also, I am not sure what you mean by "This memory is only required to be available when commands/kernels are enqueued using these buffers". Can you clarify?

0 Likes

Here's some sample code that triggers the issue for me: (in PyOpenCL--C equivalent is easy to write, but longer)

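import numpy as np
import pyopencl as cl
import pyopencl.array  # makes cl.array available

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# Request 255 arrays of 2**24 float32 values each, far more than the device holds.
# Every allocation (clCreateBuffer) succeeds.
arys = [cl.array.empty(queue, 2**24, np.float32) for _ in range(255)]

# Touching the memory is what triggers the error.
for ary in arys:
    ary.fill(0)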

The point is that the memory allocations (-> clCreateBuffer calls) far exceed the available memory on the device. Yet they all succeed. That may seem strange, but it is correct behavior. However, as soon as you start touching more memory than is present (by way of the "fill" calls), you get an error. This error should be CL_MEM_OBJECT_ALLOCATION_FAILURE, not CL_OUT_OF_RESOURCES. Hope that clarifies things.
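For a sense of scale, the total size requested by the sample can be worked out directly. This is only a small sketch: it assumes a float32 occupies 4 bytes, and the variable names are illustrative, not part of PyOpenCL.

```python
# Back-of-the-envelope size of the allocations in the sample above.
N_ARRAYS = 255           # number of arrays created
N_ELEMENTS = 2**24       # elements per array
BYTES_PER_FLOAT32 = 4    # assumed size of np.float32

bytes_per_array = N_ELEMENTS * BYTES_PER_FLOAT32
total_bytes = N_ARRAYS * bytes_per_array

print(f"per array: {bytes_per_array / 2**20:.0f} MiB")
print(f"total:     {total_bytes / 2**30:.2f} GiB")
```

That is 64 MiB per array and just under 16 GiB overall.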

0 Likes

OK inducer, I will try to reproduce the behavior at my end.

A working test case is always handy, though.

Just to be sure: I should try to allocate more memory with clCreateBuffer than my graphics card has. In such a case, the error returned should be CL_INVALID_BUFFER_SIZE. You may not get CL_MEM_OBJECT_ALLOCATION_FAILURE, as the actual allocation for the buffer may happen at some later time. In your case you get no error from clCreateBuffer, but you get CL_OUT_OF_RESOURCES when trying to use the buffer with, say, clEnqueueWriteBuffer.

Message was edited by: Himanshu Gautam

0 Likes

Buffers are bound to a context, not to a device. Also, the AMD implementation allocates a buffer on a device only when it is needed. Imagine this situation: you have four GPUs in one context, and you create buffers for each of them. OpenCL can't know where exactly you want to place them. Only after some command involving a command queue can OpenCL place a buffer on the appropriate device, and only then can it return an error when you exhaust device memory.
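The deferred placement described here can be illustrated with a toy model. This is a sketch only, not the AMD runtime: `ToyDevice`, `ToyBuffer`, `ToyAllocationError` and `enqueue_fill` are hypothetical names, and sizes are in arbitrary units. The only real value is the numeric status code, taken from the OpenCL headers.

```python
# Toy model (not the AMD runtime): buffers belong to a context and only
# claim device memory when a command first uses them on a device.
CL_MEM_OBJECT_ALLOCATION_FAILURE = -4  # value from the OpenCL headers


class ToyAllocationError(Exception):
    def __init__(self, code):
        super().__init__(f"allocation failed with status {code}")
        self.code = code


class ToyDevice:
    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = set()  # buffers currently backed by device memory
        self.used = 0


class ToyBuffer:
    """Created against a context; no device memory is reserved here."""

    def __init__(self, size):
        self.size = size


def enqueue_fill(device, buf):
    """Stand-in for an enqueued command: place the buffer, then use it."""
    if buf not in device.resident:
        if device.used + buf.size > device.capacity:
            raise ToyAllocationError(CL_MEM_OBJECT_ALLOCATION_FAILURE)
        device.resident.add(buf)
        device.used += buf.size


device = ToyDevice(capacity=10)
# 16 units requested against a capacity of 10; creation never fails.
buffers = [ToyBuffer(size=4) for _ in range(4)]

failed_at = None
for i, buf in enumerate(buffers):
    try:
        enqueue_fill(device, buf)
    except ToyAllocationError as exc:
        failed_at = (i, exc.code)
        break

print(failed_at)
```

In this model the first two buffers are placed, and the third use is where the failure surfaces, which mirrors creation succeeding and the error appearing only at enqueue time.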

0 Likes

Thanks for pointing that out, nou. Of course buffers are tied to the context, and therefore clCreateBuffer should not complain when a buffer larger than the device can hold is created.

From OpenCL spec:

clEnqueueWriteBuffer and clEnqueueReadBuffer should return:

CL_MEM_OBJECT_ALLOCATION_FAILURE if there is a failure to allocate memory for data store associated with buffer.

I will try to reproduce this issue, and report to AMD Engg Team.
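When checking a reproduction, it can help to turn the raw status value into a name and compare it with what the spec calls for. The following is a minimal sketch: the numeric values are those defined in the OpenCL headers, while `status_name` and `is_expected_overcommit_error` are hypothetical helpers, not part of any OpenCL binding.

```python
# Status values as defined in the OpenCL headers (CL/cl.h).
CL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -61: "CL_INVALID_BUFFER_SIZE",
}


def status_name(status):
    """Return the symbolic name for a status value, if it is in the table."""
    return CL_STATUS_NAMES.get(status, f"unknown status {status}")


def is_expected_overcommit_error(status):
    """True if an enqueue failed with the code the spec gives for a
    failure to allocate the buffer's data store."""
    return status_name(status) == "CL_MEM_OBJECT_ALLOCATION_FAILURE"


# The code reported in this thread versus the one the spec describes.
for observed in (-5, -4):
    print(status_name(observed), is_expected_overcommit_error(observed))
```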

0 Likes

I can see this behavior too. What AMD should consider is automatically moving buffers that are not currently needed from device to host memory, and swapping them with the buffers needed for the current execution.

0 Likes

This sounds very similar to what we've been discussing on this other thread.

0 Likes

Yes, it is.

But here the issue was the wrong error code being returned. There, we want the runtime to dynamically move buffers off the device while running many kernels that require different buffers sequentially. I will take up your code and ask about the expected behavior, but most likely the discussion will remain private.

0 Likes

From OpenCL spec:

The clEnqueueWrite/Read Buffer should return:

CL_MEM_OBJECT_ALLOCATION_FAILURE if there is a failure to allocate memory for data store associated with buffer.

I will try to reproduce this issue, and report to AMD Engg Team.

This was reproduced, and you should get CL_MEM_OBJECT_ALLOCATION_FAILURE with the new SDK release.

0 Likes