njh1983
Journeyman III

PCIe-memory size limit? (using CAL APIs)

Hi,

 

I am trying to allocate a large amount of data using the CAL API calResAllocRemote2D(). According to the CAL programming guide, this data is placed in PCIe memory, which is part of host memory. So I would expect the total PCIe memory space for the GPUs to be greater than or equal to the GPUs’ total local memory (in my case roughly 2GB per GPU, 4GB in total).

Anyway, the total CPU uncached space (PCIe memory) can be obtained by calling calDeviceGetAttribs(), and my result is 1786MB. But when I allocate 1300MB or so, the CAL API returns an error (at about 1300MB for 1 device, or 2048MB across 2 devices).

For example,

calResAllocRemote2D(&resRemote[0], &device[0], 1, 8192, 8192, CAL_FORMAT_INT_4, 0);

calResAllocRemote2D(&resRemote[1], &device[1], 1, 8192, 8192, CAL_FORMAT_INT_4, 0); // Error!
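For reference, here is a rough size calculation for the two calls above. It is only a sketch: it assumes CAL_FORMAT_INT_4 means four 32-bit integers (16 bytes) per element and ignores any padding or alignment the driver may add. The helper names are hypothetical and not part of CAL.

```c
#include <stdint.h>

/* Bytes needed for a 2D resource of width x height elements.
   bytes_per_element is supplied by the caller; 16 is an assumption
   for CAL_FORMAT_INT_4 (four 32-bit integers per element). */
uint64_t resource_bytes(uint64_t width, uint64_t height,
                        uint64_t bytes_per_element)
{
    return width * height * bytes_per_element;
}

/* Convert bytes to whole MiB (1 MiB = 1024 * 1024 bytes), rounding down. */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / (1024u * 1024u);
}
```

Under that assumption, bytes_to_mib(resource_bytes(8192, 8192, 16)) gives 1024, so each call asks for 1024MB and the two calls together ask for 2048MB, which is more than the 1786MB reported by calDeviceGetAttribs().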



 

In short, my question is:

  • Does “Uncached remote GPU memory space; uncachedRemoteRAM” mean the maximum space for 1 GPU, or for all installed GPUs combined?
  • Why does it return an error when I allocate 1300MB for 1 device (or 2048MB for 2 devices)?

 

Thanks in advance.

 

These are the details of my system:

  • Host: Super Micro GPU computing server (Intel Xeon 5600),  32GB DRAM
  • GPU: Radeon HD 6970 x 2
  • OS: Red Hat Enterprise Linux 5.5
  • Driver: Catalyst 11.2
  • SDK: v2.3-lnx64
5 Replies
njh1983
Journeyman III

Am I doing anything wrong? Or is this just a device driver problem? (Catalyst 11.2)

I'd appreciate any suggestions of how to fix this problem.


When I was doing CAL development with an HD5870 1GB card, I found that the largest single 2D image buffer I could create was 768MB. But when I tried to store data in it, I had problems if I used more than 672MB (I think that was the number).

Overall it's a mess. There are so many hidden gotchas that I'm afraid to say you're basically on your own.

For example, I discovered that you can't use arbitrary dimensions for a buffer. 2048, 2560, 3072, 3584 and 4096 are valid sizes for a float4 formatted buffer's width or height. But 2050 or 3840 are not valid sizes: they might work or they might not, depending on what the other dimension is.
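One pattern in those numbers: the sizes listed as valid are all multiples of 512, and the two listed as unreliable are not. The sketch below checks a dimension against that pattern and pads it up to the next multiple. The 512 granularity is only an inference from the seven sizes above, not a documented CAL rule, and the function names are hypothetical.

```c
#include <stdbool.h>

/* ASSUMPTION: granularity inferred only from the sizes quoted above. */
#define ASSUMED_GRANULARITY 512u

/* True if dim is a multiple of the assumed granularity. */
bool fits_observed_pattern(unsigned dim)
{
    return dim % ASSUMED_GRANULARITY == 0;
}

/* Round dim up to the next multiple of the assumed granularity. */
unsigned pad_to_pattern(unsigned dim)
{
    unsigned rem = dim % ASSUMED_GRANULARITY;
    return rem == 0 ? dim : dim + (ASSUMED_GRANULARITY - rem);
}
```

With this, pad_to_pattern(2050) gives 2560 and pad_to_pattern(3840) gives 4096. Padding wastes memory and the kernel has to ignore the extra elements, so treat it as something to try rather than a known fix.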

I wasted a month on this nonsense.

When I started work in OpenCL I discovered exactly the same problem.


jawed,

Can you give more detail about the problem in OpenCL?

AFAIK there is only one limitation: the global NDRange must be evenly divisible by the local NDRange.
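To illustrate that rule, a common way to satisfy it is to round the global size up to the next multiple of the local size before enqueueing. This is a minimal sketch in plain C; round_up_global is a hypothetical helper, not an OpenCL function.

```c
#include <stddef.h>

/* Round global up to the nearest multiple of local (local must be > 0). */
size_t round_up_global(size_t global, size_t local)
{
    size_t rem = global % local;
    return rem == 0 ? global : global + (local - rem);
}
```

For example, round_up_global(1000, 64) gives 1024. The kernel then needs a bounds check so the extra work items past the real size do nothing.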

Thanks


I developed a work-around for this problem a year ago, for CAL.

When I tested a "pass through" kernel under OpenCL I found the problem recurred, so I re-used my work-around. The kernel consists simply of copying the data from an input 2D image to an output 2D image, with one work item processing each element in the buffer.

I last tested OpenCL for this misbehaviour in October.

I just tested my CAL app with the work-around removed. Catalyst 10.12 doesn't have the issue, so I presume it's fixed in OpenCL too. Since there are no release notes for Catalyst, I've no idea which version actually fixed the problem.

And, yes, I'm aware of the NDRange sizing restrictions in OpenCL.

---

Going back to my old notes, it seems that CAL was reporting 830MB as the maximum buffer size. But I could only create a buffer of 768MB.

672MB was a restriction on a particular data layout I was using, so that's irrelevant to this discussion.

 


Jawed,

Thanks for sharing your experience.

Anyway, I hope it can be fixed soon.


