stuart_rogers
Adept II

OpenCL Maximum Buffer Size of Kernel Argument

I create an OpenCL buffer, using clCreateBuffer with CL_MEM_WRITE_ONLY, that is only written to by an OpenCL kernel. The global_work_size[1] is 2500. The buffer is a large array of doubles, about 655 megabytes in size. When I run the OpenCL program, it hangs the whole computer. But if the buffer is around 330 megabytes or less, the program works fine. I am using an AMD FirePro W8000 with AMD SDK v2.8 on Red Hat 6.2. Is there a maximum buffer size for a kernel argument?



himanshu_gautam
Grandmaster

Kernel arguments cannot have a limit on the size of the buffer. If the buffer is allocated properly, the kernel should run. Some possible causes: there is some other error in the program, so check the preceding API calls, especially those related to buffer creation. Also check that the global size is evenly divisible by the local size in both cases.

You say global_work_size[1] = 2500; what about global_work_size[0]? You can also test your code on another device (another GPU, or at least the CPU). Does it work anywhere else? If the problem persists, attach your code here as a zip file.
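
The divisibility check mentioned above can be sketched in plain C. This is a minimal sketch, not code from the thread: the helper name round_up_global_size and the local size of 64 used in the note below are assumptions for illustration. It pads a work-item count up to the next multiple of the local size, which OpenCL 1.x requires whenever a non-NULL local_work_size is passed to clEnqueueNDRangeKernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of local_size that is >= work_items. */
size_t round_up_global_size(size_t work_items, size_t local_size)
{
    assert(local_size > 0); /* a zero local size would divide by zero below */
    size_t remainder = work_items % local_size;
    if (remainder == 0) {
        return work_items; /* already evenly divisible */
    }
    return work_items + (local_size - remainder);
}
```

With an assumed local size of 64, 2500 work-items would be padded to 2560, and the kernel would then need a bounds check so that the 60 extra work-items do nothing. Passing NULL for local_work_size instead lets the implementation choose a work-group size.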


I changed the program a little bit to simplify things. The code is attached. If the global_work_size[0] = 2500, the program runs fine on the FirePro. But if global_work_size[0] = 5000, the program crashes on the FirePro. The work size is determined by the NO variable on line 65 of Orange.cpp. The program works fine on the CPU. The CPU can be chosen as the platform on line 85 by setting the array argument to 0. When the CPU is selected, are all the cores of a multi-core CPU exploited by AMD APP or does the kernel only execute on a single core? I don't specify the local work size, passing NULL in for that argument of clEnqueueNDRangeKernel.


The context is being created without specifying any platform? How is this code running at all?

You need to check the error codes returned by the OpenCL APIs and, importantly, create the context with clCreateContextFromType. It seems the device selected using clGetDeviceIDs and the device in the context are different.

EDIT: It looks like a context can be created from the device alone, and the platform can be NULL in that case. But the test case always crashes on my machine at the moment. Let me revisit the code.



himanshu.gautam wrote:

Kernel arguments cannot have a limit on the size of buffer. If buffer is allocated properly, kernel should run.

What is the meaning of "Max memory allocation" reported by clinfo, then? I had thought that was the maximum buffer size. It's 512 MB (536870912 bytes) for the Tahiti and FX6300 I'm running. That would fit with Stuart's point that 655 MB fails but 330 MB runs.

Just noticed that Stuart posted the code. Looking at the buffer allocation:

cl_mem outputBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY , (NO*NF) * sizeof(double), NULL, NULL);

That last argument could instead be the address of an error-code variable, which one can examine to detect whether the allocation failed.
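
A minimal sketch of that suggestion follows. The helpers create_buffer_error_name and check_create_buffer are hypothetical, and the numeric values are repeated from the OpenCL headers so that the snippet compiles without the SDK. The clCreateBuffer call itself appears only in a comment because it needs a valid context.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: name the error codes clCreateBuffer can report
 * through errcode_ret. The numbers match the definitions in CL/cl.h. */
const char *create_buffer_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -30: return "CL_INVALID_VALUE";
    case -34: return "CL_INVALID_CONTEXT";
    case -37: return "CL_INVALID_HOST_PTR";
    case -61: return "CL_INVALID_BUFFER_SIZE";
    default:  return "unrecognized error code";
    }
}

/* Hypothetical helper: returns 1 on success; on failure prints a
 * diagnostic to stderr and returns 0. */
int check_create_buffer(int err)
{
    assert(err <= 0); /* OpenCL status codes are zero or negative */
    if (err != 0) {
        fprintf(stderr, "clCreateBuffer failed: %s (%d)\n",
                create_buffer_error_name(err), err);
        return 0;
    }
    return 1;
}

/* Intended use with the allocation quoted above (needs <CL/cl.h>):
 *
 *   cl_int err = CL_SUCCESS;
 *   cl_mem outputBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
 *                                        (NO*NF) * sizeof(double), NULL, &err);
 *   if (!check_create_buffer(err)) { return EXIT_FAILURE; }
 */
```

Stopping at the failed allocation means the kernel is never launched with an invalid buffer.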

Yes, the error code returned through the last argument of clCreateBuffer is -61, which means the requested buffer size exceeds CL_DEVICE_MAX_MEM_ALLOC_SIZE. For the AMD FirePro W8000, CL_DEVICE_MAX_MEM_ALLOC_SIZE is 537 MB. So it does seem that there is a maximum allowable buffer size for OpenCL kernel arguments. This thread also discusses the issue: OpenCL buffer size limited? And this link touches on it as well: https://devtalk.nvidia.com/default/topic/464454/maximum-data-size-in-opencl/.

From OpenCL Error Codes | tersetalk:

#define CL_INVALID_BUFFER_SIZE -61

From clCreateBuffer:

CL_INVALID_BUFFER_SIZE if size is 0 or is greater than CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in table of OpenCL Device Queries for clGetDeviceInfo for all devices in context.
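
That rule can be checked before calling clCreateBuffer. The sketch below is illustrative rather than taken from the attached program: buffer_size_is_valid is a hypothetical helper, and the limit passed to it is assumed to come from clGetDeviceInfo with CL_DEVICE_MAX_MEM_ALLOC_SIZE (shown in a comment, since it needs a real device).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when count elements of elem_size bytes
 * form a size that is nonzero and no larger than max_alloc, else 0. */
int buffer_size_is_valid(uint64_t count, uint64_t elem_size, uint64_t max_alloc)
{
    assert(elem_size > 0); /* element size comes from sizeof, never zero */
    if (count == 0) {
        return 0; /* a size of 0 is itself CL_INVALID_BUFFER_SIZE */
    }
    if (count > UINT64_MAX / elem_size) {
        return 0; /* count * elem_size would overflow 64 bits */
    }
    return (count * elem_size <= max_alloc) ? 1 : 0;
}

/* Obtaining max_alloc from a device (needs <CL/cl.h> and a cl_device_id):
 *
 *   cl_ulong max_alloc = 0;
 *   cl_int err = clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
 *                                sizeof(max_alloc), &max_alloc, NULL);
 */
```

With the 536870912-byte limit reported in this thread, a buffer of roughly 330 MB passes the check and one of roughly 655 MB does not. When a context holds several devices, passing the smallest of their limits is the conservative choice.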


Thanks, void_ptr, for clarifying. I had taken "kernel argument size" to mean something different.

Hi Stuart,

Thanks for confirming; the behavior is explained now.
