2 Replies Latest reply on Jun 6, 2014 5:09 AM by gopal

    GPU Memory Allocation Failure on Trinity APU

    farooqun

      I am hitting CL_MEM_OBJECT_ALLOCATION_FAILURE when I try to run relatively large graphs on the GPU device of my Trinity APU machine. I set GPU_MAX_ALLOC_PERCENT=100 and GPU_MAX_HEAP_SIZE=100, which raises the GPU device's maximum memory allocation to about 1.8 GB. However, I need to run larger graphs on the GPU. Is there any way to raise this limit further? On an APU, where I can use CL_MEM_ALLOC_HOST_PTR to get zero-copy access to CPU-side memory, I would not have expected such a limit, since the data is supposed to reside in CPU memory and simply be mapped (not copied) to the GPU. Should I not be able to use more than about 2 GB of CPU memory to run kernels on the GPU? Is there a way around this?

        • Re: GPU Memory Allocation Failure on Trinity APU
          dipak

          Hi,

           

          I'll discuss this point with the concerned team and come back to you shortly.

           

          Thanks,

          • Re: GPU Memory Allocation Failure on Trinity APU
            gopal

            Yes, you should be able to go beyond the GPU device's max memory limit by using either the CL_MEM_ALLOC_HOST_PTR or the CL_MEM_USE_HOST_PTR flag, so that the buffer uses CPU-side memory.

             

            Here are my observations on memory allocation on an AMD A6-3410MX APU (it is about 3 years old, so the figures below may not match those in your post, but conceptually we should be able to reach the CPU device's max memory allocation limit):

             

            1. Before setting GPU_MAX_ALLOC_PERCENT=100, clinfo displayed

            device(GPU) memory limit:

                maximum allocation size = 191 MB

                global memory size = 765 MB

            host(CPU) memory limit:

                maximum allocation size = 2 GB

                global memory size = 3.5 GB

             

            2. After setting GPU_MAX_ALLOC_PERCENT=100, clinfo displayed

            device(GPU) memory limit: (little improvement)

                maximum allocation size = 254 MB

                global memory size = 1 GB

            host(CPU) memory limit: (same)

                maximum allocation size = 2 GB

                global memory size = 3.5 GB
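A quick arithmetic check on the GPU figures listed above: in both runs the maximum allocation size works out to roughly a quarter of the global memory size. That happens to match the minimum that the OpenCL 1.x specification requires for CL_DEVICE_MAX_MEM_ALLOC_SIZE (one quarter of CL_DEVICE_GLOBAL_MEM_SIZE, or 128 MB if that is larger). The sketch below only restates the numbers from this post; the helper names `alloc_percent` and `print_gpu_ratios` are made up for illustration, and 1 GB is assumed to mean 1024 MB.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: share of global memory that a single allocation
 * may use, rounded to the nearest whole percent. Sizes are in MB. */
unsigned alloc_percent(unsigned max_alloc_mb, unsigned global_mb)
{
    assert(global_mb != 0u);
    return (max_alloc_mb * 100u + global_mb / 2u) / global_mb;
}

/* Prints the ratio for the two GPU readings quoted in this post
 * (assumption: "1 GB" is taken as 1024 MB). */
void print_gpu_ratios(void)
{
    printf("before GPU_MAX_ALLOC_PERCENT=100: %u%%\n", alloc_percent(191u, 765u));
    printf("after  GPU_MAX_ALLOC_PERCENT=100: %u%%\n", alloc_percent(254u, 1024u));
}
```

Calling `print_gpu_ratios()` from a `main` prints both ratios. If they stay near a quarter, as they do here, the environment variable appears to have grown the visible global memory rather than the per-allocation share, at least on this particular device and driver.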

             

            As per the above, I was not able to allocate a buffer larger than 254MB on the GPU. Once the requested size exceeded this limit (252MB), clCreateBuffer() returned the CL_INVALID_BUFFER_SIZE error.

             

            Using the CL_MEM_ALLOC_HOST_PTR flag, I am able to allocate a buffer of up to 1.4GB on the host side. Beyond this size (1.4GB), clCreateBuffer() returns the CL_MEM_OBJECT_ALLOCATION_FAILURE error. This means that we can allocate a larger buffer in CPU memory. Note, however, that there is still a limit on CPU memory: we cannot allocate a buffer larger than the CPU device's max memory allocation (in this case, 2GB).

            I am not sure why I cannot reach 2GB (i.e. the CPU device's max memory allocation); it could be either an OS limit or an issue in the vendor's OpenCL implementation.
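One way to find a practical ceiling like the 1.4GB figure above is to probe for it rather than guess. The following is a minimal sketch, not the method used in this post: `probe_max_alloc`, `fake_alloc`, `try_host_ptr_alloc` and the `USE_OPENCL` macro are all hypothetical names. It binary-searches for the largest size a callback accepts, assuming that if one size succeeds then every smaller size succeeds too. The OpenCL-backed callback is compiled only when `USE_OPENCL` is defined, so the search itself can be tried without a device.

```c
#include <assert.h>
#include <stddef.h>

/* Callback: returns nonzero if a buffer of `size` bytes can be allocated. */
typedef int (*try_alloc_fn)(size_t size, void *user);

/* Largest size in (0, hi] accepted by try_alloc, to within `step` bytes.
 * Returns 0 if nothing in the searched range succeeded. */
size_t probe_max_alloc(try_alloc_fn try_alloc, void *user, size_t hi, size_t step)
{
    size_t lo = 0;                      /* largest size known to succeed */
    assert(hi > 0 && step > 0);
    if (try_alloc(hi, user))
        return hi;                      /* the upper bound itself fits */
    while (hi - lo > step) {            /* invariant: hi is known to fail */
        size_t mid = lo + (hi - lo) / 2;
        if (try_alloc(mid, user))
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}

/* Stand-in callback for trying the search without a device:
 * succeeds while size <= *(size_t *)user. */
int fake_alloc(size_t size, void *user)
{
    return size <= *(const size_t *)user;
}

#ifdef USE_OPENCL                       /* hypothetical build switch */
#include <CL/cl.h>

/* Real callback: `user` is an existing cl_context. The buffer is created
 * in host memory with CL_MEM_ALLOC_HOST_PTR and released immediately. */
int try_host_ptr_alloc(size_t size, void *user)
{
    cl_int err = CL_SUCCESS;
    cl_mem buf = clCreateBuffer((cl_context)user,
                                CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                size, NULL, &err);
    if (err != CL_SUCCESS)
        return 0;
    clReleaseMemObject(buf);
    return 1;
}
#endif
```

With a real context, a call such as `probe_max_alloc(try_host_ptr_alloc, ctx, max_alloc, 1 << 20)` would search in 1 MB steps, where `max_alloc` is the value reported for CL_DEVICE_MAX_MEM_ALLOC_SIZE. One caveat: some runtimes may defer the actual allocation until the buffer is first used, so a successful clCreateBuffer() is not a guarantee that a kernel can run with a buffer of that size.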

             

            Further, we would be able to allocate even larger buffers using the SVM (shared virtual memory) feature of OpenCL 2.0.

             

            Thanks,