5 Replies Latest reply on Nov 18, 2010 1:50 PM by MicahVillmow

    kernel input parameters: const or non-const?

    CaptainN
      __constant int * gives a worse estimate than __global const int *

       

      In the SDK sample code and in posted messages, I see that input parameters are usually declared as:

      __global TYPE * or __global const TYPE *, where TYPE can be any valid type, for example int.

       

      However, there is a recommendation (or at least a way) to declare an input buffer with the __constant address space qualifier to take advantage of the constant buffers/caches in Radeon hardware.

      The __constant qualifier can refer to global memory as well.

       

      Even with a very simple kernel, SKA shows worse performance when the input array is declared as

      __constant int * than when it is declared as __global const int *. Are the constant caches so small and ineffective that they are only good for non-memory-object parameters?

       

      Is there any performance difference between declaring the input as

      __global int * or __global const int *, or is const just language-level protection against writing into the input array?
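For reference, the three declarations being compared could be sketched like this. The kernel names are hypothetical, and the `#ifndef __OPENCL_VERSION__` section is only an assumption to let the same source compile as plain C for a quick host-side check; an OpenCL compiler skips it. The sketch shows the syntax only and says nothing about which variant is faster.

```c
/* Host-only stubs (an assumption for checking outside OpenCL).
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this section. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t host_gid;  /* stand-in for the work-item index */
static size_t get_global_id(unsigned dim) { (void)dim; return host_gid; }
#endif

/* Plain global pointer: the kernel is allowed to write through `in`. */
__kernel void scale_global(__global int *in, __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = 2 * in[i];
}

/* Read-only global pointer: same address space, writes through `in` are rejected. */
__kernel void scale_global_const(__global const int *in, __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = 2 * in[i];
}

/* Constant address space: a different address space, read-only by definition. */
__kernel void scale_constant(__constant int *in, __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = 2 * in[i];
}
```

All three kernels compute the same result; only the qualifier on `in` differs.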

        • gat3way

          On 4xxx, it apparently does not matter. I have done some testing on a 4670 and both show almost the same performance. Since I was working on Linux and there was no SKA available, I concluded that it could be because constant memory on those boards is emulated in global memory. But I may be wrong on that.

            • nou

              Using __constant is the better choice performance-wise, but AMD GPUs have limited constant space in hardware, only 16 kB IIRC, while the OpenCL spec requires 64 kB, so AMD must emulate it in global memory. You can still use real constant buffers if you specify the maximum size. You can find more in the OpenCL Programming Guide from AMD.
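A minimal sketch of specifying a maximum size, assuming the `max_constant_size(N)` pointer attribute described in AMD's programming guide is what is meant here. The kernel name, the `MAX_CONSTANT_SIZE` wrapper macro, and the 1024 bound (taken to be in bytes) are hypothetical. The host-only section is an assumption so the source also compiles as plain C, where the attribute is simply dropped.

```c
/* Host-only stubs (an assumption for checking outside OpenCL). */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
#define MAX_CONSTANT_SIZE(n)  /* attribute dropped on the host */
static size_t host_gid;  /* stand-in for the work-item index */
static size_t get_global_id(unsigned dim) { (void)dim; return host_gid; }
#else
/* AMD-specific attribute; other vendors may ignore or reject it. */
#define MAX_CONSTANT_SIZE(n) __attribute__((max_constant_size(n)))
#endif

#define TABLE_BYTES 1024  /* hypothetical upper bound for the table */

/* The attribute tells the compiler the constant pointer never refers to
   more than TABLE_BYTES, so it can be placed in a hardware constant buffer. */
__kernel void lookup(__constant int *table MAX_CONSTANT_SIZE(TABLE_BYTES),
                     __global const int *idx,
                     __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = table[idx[i]];
}
```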

            • MicahVillmow
              Constant buffer sizes on our hardware are 64 KB. There was a bug in SDK 2.1 or 2.0.1 that limited them to 16 KB, but that was removed in 2.2. If your size is too large, then use a const global int* and you will get caching when it is enabled in the future.
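The "if your size is too large" rule could be sketched on the host side as below. The enum and function names are hypothetical. The limit is passed in as a parameter, since in a real program it would be the value the device reports (for example through `clGetDeviceInfo` with `CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE`) rather than a hard-coded 64 KB.

```c
/* Which declaration to use for a read-only input buffer. */
typedef enum {
    INPUT_CONSTANT,      /* __constant TYPE *      */
    INPUT_GLOBAL_CONST   /* __global const TYPE *  */
} input_space;

/* Pick the address space for an input of `buffer_bytes`, given the
   device's constant buffer limit `max_constant_bytes`. */
static input_space choose_input_space(unsigned long long buffer_bytes,
                                      unsigned long long max_constant_bytes)
{
    return buffer_bytes <= max_constant_bytes ? INPUT_CONSTANT
                                              : INPUT_GLOBAL_CONST;
}
```

With a 64 KB limit, a 64 KB input would still go to `__constant`, and anything larger falls back to `__global const`.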
                • CaptainN

                  Thank you, Micah!

                  Just to dig into it a little further:

                  Does it mean that __constant int * and const global int * can be used interchangeably in the future? (That is, today __constant int * uses the cache unconditionally, while const global int * does not use the constant cache today but will someday be optimized to use it properly?)

                • MicahVillmow
                  CaptainN,
                  No, this is not correct at all. constant points to the constant address space, which maps to the hardware constant buffers. const global int* points to the global address space, which uses device memory but will utilize the cache infrastructure. const global int* peaks at around 1 TB/s on the high-end chips, but constant int* peaks at 5/12th of register speed with static indexes (should be 3-4 TB/s). The constant cache and the global cache are two separate caches in the hardware. The global cache has an L1/L2 hierarchy; the constant cache is 8-48 KB of space depending on the device, and no L2 exists.
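To illustrate what "static indexes" means for a constant pointer, here is a small sketch with hypothetical kernel names. The first kernel reads constant elements at indexes known at compile time; the second picks the element from run-time data. The peak rate quoted above applies to the static case; the sketch does not measure anything. The host-only section is an assumption so the source also compiles as plain C.

```c
/* Host-only stubs (an assumption for checking outside OpenCL). */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t host_gid;  /* stand-in for the work-item index */
static size_t get_global_id(unsigned dim) { (void)dim; return host_gid; }
#endif

/* Static indexes: w[0] and w[1] are fixed at compile time and are the
   same for every work-item. */
__kernel void affine_static(__constant float *w,
                            __global const float *x,
                            __global float *y)
{
    size_t i = get_global_id(0);
    y[i] = w[0] * x[i] + w[1];
}

/* Dynamic index: which element of w is read depends on sel[i],
   so it is only known at run time. */
__kernel void scale_dynamic(__constant float *w,
                            __global const int *sel,
                            __global const float *x,
                            __global float *y)
{
    size_t i = get_global_id(0);
    y[i] = w[sel[i]] * x[i];
}
```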