roboto
Adept I

What is the true limit of OpenCL 3-D image?

Hi All,

What is the image memory limit for OpenCL 1.2?

On my AMD 7970 6GB, the following:

size_t sx = 0, sy = 0, sz = 0;

clGetDeviceInfo(device_, CL_DEVICE_IMAGE3D_MAX_DEPTH, sizeof(sz), &sz, NULL);

clGetDeviceInfo(device_, CL_DEVICE_IMAGE3D_MAX_HEIGHT, sizeof(sy), &sy, NULL);

clGetDeviceInfo(device_, CL_DEVICE_IMAGE3D_MAX_WIDTH, sizeof(sx), &sx, NULL);

Returns 2048 for sx, sy, sz.

But I can't use an image larger than about 1000 x 1000 x 1000. For example, with a 1024 x 1024 x 1024 volume descriptor such as:

cl_image_desc bigImageDesc = {CL_MEM_OBJECT_IMAGE3D, 1024, 1024, 1024, 0, 0, 0, 0, 0, NULL};

const cl_image_format imageFormat = {CL_R, CL_FLOAT};

bigImage = clCreateImage(context, CL_MEM_READ_WRITE, &imageFormat, &bigImageDesc, NULL, &err);

The behavior is that a clEnqueueReadImage call on this image never returns. With a smaller image size such as 1000x1000x1000, everything is OK. Is the image size limited by 32-bit addressing? Is there a way around this like there is for buffer memory (i.e. GPU_FORCE_64BIT_PTR)?

My rig:

Win7 64

24GB Ram

i7 QuadCore

AMD 7970 6GB

Thanks!

gbilotta
Adept III

Interesting. The clEnqueueReadImage call never returning is most definitely a bug. Does clCreateImage return with no error? Do you have some other buffers allocated on the device?

Concerning the maximum image sizes, I've always read those values as the maximum size in each direction, without any indication of how large the total image can be. This would be similar to the way local work sizes may be limited to, say, 1024 in each direction (CL_DEVICE_MAX_WORK_ITEM_SIZES), while the product of the three dimensions cannot be higher than 1024 altogether (CL_DEVICE_MAX_WORK_GROUP_SIZE). I suspect this might be the case here too.

There is a simple way to check whether the limit is due to you requesting a 4GB image (which would indeed smell like a 32-bit vs 64-bit issue) or something else: try a different layout (e.g. 2048*2048*160) with format CL_R (2.5GB with CL_FLOAT, so it should work), and then try the same layout with a two-channel format (assuming the GPU in question supports something like CL_RA or CL_RG). If the second one bombs out, it's a memory allocation issue (5GB); if the first one already fails, it's an image size (as in: dimensions) issue.
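The arithmetic behind this test can be sketched in plain C. This is only an illustration under stated assumptions: `image_bytes` and `overflows_u32` are hypothetical helpers (not part of OpenCL), the image is assumed to be tightly packed with 4-byte CL_FLOAT channels, and the 32-bit threshold is just the hypothesis under test. On a real device the relevant limit may instead be whatever CL_DEVICE_MAX_MEM_ALLOC_SIZE reports.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: total bytes of a tightly packed 3-D image.
   Pixel size is channels * bytes_per_channel, e.g. 1 * 4 for CL_R / CL_FLOAT. */
static uint64_t image_bytes(uint64_t w, uint64_t h, uint64_t d,
                            uint64_t channels, uint64_t bytes_per_channel)
{
    assert(channels > 0 && bytes_per_channel > 0);
    return w * h * d * channels * bytes_per_channel;
}

/* Assumption under test: the runtime stores the image size in 32 bits.
   Returns 1 if the byte count does not fit in an unsigned 32-bit integer. */
static int overflows_u32(uint64_t bytes)
{
    return bytes > UINT32_MAX;
}
```

For example, `image_bytes(2048, 2048, 160, 1, 4)` gives 2,684,354,560 bytes (2.5 GiB), while the two-channel version gives 5,368,709,120 bytes (5 GiB). Note also that 1000 x 1000 x 1000 x 4 = 4,000,000,000 is just under 2^32, whereas 1024 x 1024 x 1024 x 4 is exactly 2^32, which is consistent with the 32-bit hypothesis but does not prove it.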


Hi,

Thanks for replying. Good point, I will try this in a few days and let you know how it goes. I remember clEnqueueReadImage not returning when the 3D size was 1024 or greater even for CL_RA or CL_RG. So it seems like a problem with the 3-D sizes. I'll let you know for sure.

Thanks!

roboto
Adept I

I tried what you suggested and found that the limit is on the size of a single memory allocation (4GB in my case). I can allocate more memory in total by splitting the allocation into multiple chunks.
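The post does not say how the chunks were laid out. One possible scheme, shown here purely as a sketch, is to cut the volume into slabs along the depth axis so that each slab stays within the per-allocation limit. All three helpers below are hypothetical names, and `max_alloc` is an assumed inclusive byte limit that you would supply yourself (for instance from CL_DEVICE_MAX_MEM_ALLOC_SIZE).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many whole z-slices fit in one allocation of
   at most max_alloc bytes. Returns 0 if even a single slice is too large. */
static uint64_t slices_per_chunk(uint64_t w, uint64_t h,
                                 uint64_t bytes_per_pixel, uint64_t max_alloc)
{
    uint64_t slice_bytes = w * h * bytes_per_pixel;
    assert(slice_bytes > 0);
    return max_alloc / slice_bytes;
}

/* Hypothetical helper: number of slabs needed to cover a depth of d slices. */
static uint64_t chunk_count(uint64_t d, uint64_t per_chunk)
{
    assert(per_chunk > 0);
    return (d + per_chunk - 1) / per_chunk; /* ceiling division */
}

/* Hypothetical helper: depth of slab i (0-based); the last slab may be thinner. */
static uint64_t chunk_depth(uint64_t d, uint64_t per_chunk, uint64_t i)
{
    uint64_t start = i * per_chunk;
    assert(start < d);
    uint64_t rest = d - start;
    return rest < per_chunk ? rest : per_chunk;
}
```

With a 1024 x 1024 x 1024 CL_R / CL_FLOAT volume and an assumed limit of 2 GiB per allocation, this yields two slabs of 512 slices each. Each slab would then be created as its own 3-D image of that depth.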

Thanks!
