5 Replies Latest reply on Mar 5, 2010 10:59 PM by MicahVillmow

    global vs. local memory

    blelump
      some clarification would be appreciated

      Hello,

      I'm wondering what the main difference between these two types of memory is.

      There are some OpenCL examples that could clarify it a bit [but they don't, for me anyway]: for instance, matrix transpose and matrix multiply use local memory, while the Sobel filter and matrix convolution work without any local memory.

      I know that fetching data from local memory is probably faster than fetching it from global memory, but copying data from global to local also costs some time, doesn't it?

      So far I assume that local memory is more efficient when the kernel must deal with a lot of data, whereas global memory is more effective when the kernel operates on each element individually and also needs access to that element's neighbourhood. Am I right?

      Thanks in advance for any response.

        • global vs. local memory
          thatguymike

          You are correct that you need to move data from global to local before you can use it from local.  If you are just computing on a data element once, it makes little sense to move it from global to local instead of just computing on the global value.

          The general reason to use local is that you have significant data reuse, and generally read-modify-write reuse.  Matrix multiply is an example where you will load data into local because you will reuse the same data elements multiple times.  For something like a small window convolution, like 3x3, it may not make sense to move the data to local first because there may not be enough reuse.  (If you are very careful about how you do your convolution, using local *may* actually be a win.)
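
          To put rough numbers on the reuse argument, here is a minimal sketch in plain Python (not OpenCL) that only counts global-memory reads. The function names, the square n x n sizes, and the requirement that the tile size divides n are assumptions made for illustration. The model ignores hardware caches, barrier cost, and image borders, so treat it as a back-of-envelope estimate rather than a performance prediction.

```python
def matmul_global_reads(n, tile=None):
    """Global reads for an n x n matrix multiply C = A * B (hypothetical model).

    tile=None: each work-item reads its row of A and column of B from global.
    tile=t:    each t x t work-group stages t x t tiles of A and B in local
               memory, so a tile is fetched from global once per work-group.
    """
    if tile is None:
        # n*n work-items, each reading n elements of A and n elements of B
        return n * n * 2 * n
    assert n % tile == 0, "sketch assumes the tile size divides n"
    groups = (n // tile) ** 2   # work-groups covering C
    steps = n // tile           # tile pairs each work-group walks through
    return groups * steps * 2 * tile * tile


def conv_global_reads(n, k, tile=None):
    """Global reads for a k x k convolution over an n x n image (borders ignored).

    tile=None: each work-item reads its whole k x k window from global.
    tile=t:    each t x t work-group loads its tile plus a (k - 1)-wide halo
               into local memory once.
    """
    if tile is None:
        return n * n * k * k
    assert n % tile == 0, "sketch assumes the tile size divides n"
    groups = (n // tile) ** 2
    return groups * (tile + k - 1) ** 2


if __name__ == "__main__":
    n, t = 64, 16
    print("matmul:", matmul_global_reads(n), "->", matmul_global_reads(n, t))
    print("3x3 conv:", conv_global_reads(n, 3), "->", conv_global_reads(n, 3, t))
```

          In this model the matrix multiply saving grows with the tile size (a factor of 16 for 16x16 tiles), while a 3x3 window can never save more than a factor of 9, and the halo reduces that further. Since the count leaves out the cost of the copy and the barrier, and any caching the hardware does for global reads, the smaller nominal saving for the convolution may not pay off in practice.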

            • global vs. local memory
              drstrip

              When I list the device properties for the RV770, the local memory type is listed as GLOBAL. This would suggest that for this device (and similar ones) there is no benefit to using local memory, as the access would have the same "cost", with the possible exception of using the copy to organize the data for better access patterns. Is this right, or am I missing something?

            • global vs. local memory
              MicahVillmow
              drstrip,
              This is correct. Local memory on the 7XX series of cards is emulated in global memory because of hardware constraints on local memory.
              • global vs. local memory
                MicahVillmow
                drstrip,
                The 5XXX series of cards have true local memory, the 4XXX series do not. This is a hardware issue as the 4XXX series were designed before OpenCL existed.