Adept III

Re: More pinned host-memory than device-memory capacity

I changed the code to print the buffer that is being processed, and these are the findings:

  • on the HD6970 I can get up to 64 buffers (1GB), which is the total amount of global memory reported by clinfo for the card. I also notice that when going beyond 32 buffers things start to slow down (especially when using read/write instead of map/unmap);
  • on the Tesla C2070 (6GB of memory, but under OpenCL it only works in 32-bit mode, so 4GB is the maximum we can use) I get to 250 buffers (4016MB), so it would seem the processing does happen, but it's much faster (no perceptible slowdowns for any buffer);
  • strange results are coming from our HD7970: it should have 3GB of RAM according to the box, but OpenCL only shows 2GB, and the out-of-resources error only happens at 270 buffers (4336MB), even though the host only has 4GB of RAM. Does this mean that the driver (13.4 again, OpenCL 1.2 AMD-APP 1124.2) does evict buffers on this card, and is therefore limited by host memory instead? (In this case, btw, out of resources is the correct error, I think, since it's running out of host resources to manage the buffers, not of device memory.)
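
The buffer counts and sizes in the list above can be cross-checked with a little arithmetic. This is only a sketch: it assumes every buffer is the same size, 16MB, which is inferred from the HD6970 figure (64 buffers = 1GB) and is not stated explicitly in the post. The names `BUFFER_MB` and `total_mb` are made up for illustration.

```python
# Assumption: all buffers are the same size, inferred from the HD6970
# data point (64 buffers filling 1GB = 1024MB).
BUFFER_MB = 1024 // 64  # 16MB per buffer


def total_mb(buffer_count):
    """Total size in MB of `buffer_count` equally sized buffers."""
    return buffer_count * BUFFER_MB


# (card, buffer count, MB figure quoted in the post)
reported = [
    ("HD6970", 64, 1024),
    ("Tesla C2070", 250, 4016),
    ("HD7970", 270, 4336),
]

for card, count, quoted_mb in reported:
    computed = total_mb(count)
    print(f"{card}: {count} buffers -> {computed}MB "
          f"(quoted {quoted_mb}MB, difference {quoted_mb - computed}MB)")
```

Under that assumption, the figures quoted for the Tesla C2070 and the HD7970 come out exactly one buffer (16MB) higher than count × 16MB. One possible explanation is that the printed count is a zero-based index of the last buffer, but that is a guess.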
Journeyman III

Re: More pinned host-memory than device-memory capacity

With an HD 7750 (1GB), on Linux with 5GB of RAM and Catalyst 12.6: the clWrite and clMap versions both fail at 120 buffers (with errors -5 and -12), which is about 2GB of memory, but there is no obvious slowdown.
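
For anyone reading the raw numbers: -5 and -12 are standard OpenCL status codes. A small lookup table like the one below makes them readable. The values are the ones defined in the Khronos `CL/cl.h` header; the helper name `cl_error_name` is hypothetical, and only a handful of memory-related codes are listed.

```python
# Memory-related status codes, with values as defined in CL/cl.h.
# This is a partial table, not the full list of OpenCL errors.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -12: "CL_MAP_FAILURE",
}


def cl_error_name(code):
    """Return the symbolic name for an OpenCL status code, if listed."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL error {code}")


for code in (-5, -12):
    print(code, cl_error_name(code))
```

So the two failures reported here are `CL_OUT_OF_RESOURCES` (-5) and `CL_MAP_FAILURE` (-12).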


Re: More pinned host-memory than device-memory capacity

Nice to hear you have added some performance metrics to that test. Can you please share it in your GitHub repo? (The link is somewhere above.)

It is interesting to know what happens on the HD7750 + 5GB RAM machine. But Catalyst 12.6 is way too old. Can you check with 13.6beta there? It would also be useful if you could test on Windows. Performance is expected to be better on Windows.