6 Replies Latest reply on Jul 18, 2014 1:09 AM by dukeleto

    problems using large amounts of global memory


      I've finally decided to try to understand and fix a problem I've been having for quite a while now:

      not being able to use most of the global memory (RAM) on the GPU for a computation.


      Here's a description of the problem:

      - my code is an in-house 2D wave propagation solver; it declares a number of cl_mem arrays

      (23 to be precise) and performs operations on them

      - the code "works": results quantitatively match reference data obtained from legacy fortran code

      - the largest array size that works is about 3,500,000 doubles, which gives, for the full code,

        3.5e6*23*(8 bytes) = (roughly) 640 Mbytes

      - above this size, the computer hangs without leaving any useful messages in the logs (that I could find)

      - latest beta Linux driver, running on Ubuntu 13.10; same symptoms with slightly older drivers

      - I've tried setting GPU_MAX_ALLOC_PERCENT to various values, to no avail (found in this thread)

      - I've tried setting GPU_FORCE_64BIT_PTR to 1, also with no luck
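For reference, these environment variables were set before launching the application; a sketch of what I tried (the solver binary name is commented out and purely illustrative):

```shell
# Raise the per-allocation cap (expressed as a percentage of total GPU memory)
export GPU_MAX_ALLOC_PERCENT=100
# Force 64-bit GPU pointers so the runtime can address buffers beyond 32-bit limits
export GPU_FORCE_64BIT_PTR=1
# ./wave2d_solver   # hypothetical name for the in-house 2D wave propagation solver
echo "GPU_MAX_ALLOC_PERCENT=$GPU_MAX_ALLOC_PERCENT GPU_FORCE_64BIT_PTR=$GPU_FORCE_64BIT_PTR"
```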


      Can anyone guess what might be going on? Am I missing something obvious to allow

      a decent fraction of the total memory to be used?


        • Re: problems using large amounts of global memory

          I have pared down my code to just declarations and a single ultra-simple kernel.

          With this super-simple code, on a workstation with a 6 GB 7970 running the latest beta driver:


          - everything works fine if my arrays total less than about 640 Mbytes

          - above that size, copying from one GPU array to another (with clEnqueueCopyBuffer) works,

            but launching the test kernel segfaults on the clEnqueueNDRangeKernel call.
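For concreteness, the "test kernel" is of this flavour (my actual kernel is equally trivial; this OpenCL C fragment is a sketch, not the exact code):

```c
/* OpenCL C device code: touch every element of one buffer. */
__kernel void touch(__global double *a)
{
    size_t i = get_global_id(0);
    a[i] = a[i] + 1.0;
}
```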

          At this point, might I ask that someone from AMD at least state whether I should be able to

          use more memory? I can send the simple code base if that would help with checking.





          • Re: problems using large amounts of global memory

            Hi Sudarshan,

            thanks for the response. I am indeed able to allocate a single large buffer and pass it to a kernel.

            I will therefore try to build up from there, to see where my problem was/is coming from.