8 Replies Latest reply on Jan 20, 2010 5:13 PM by MicahVillmow

    Where to keep data?


      I have a few quite large arrays of precomputed values (8*256*8 B + 10*8 B = a little over 16 kB) that I keep in the __constant address space. All values are of type ulong and are used by all work-items (because I actually have no idea how to split their task across more than one work-item :/). Is __constant the best place to put those values?
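The size arithmetic in the question can be checked with a few lines of plain C. This is only a sketch: the names `NUM_TABLES`, `TABLE_ENTRIES`, `EXTRA_VALUES`, `precomputed_bytes` and `fits_in_constant_budget` are made up for illustration, and `uint64_t` stands in for OpenCL's `ulong`. The limit passed to the second function is an assumption supplied by the caller; on a real device it would come from querying `CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE`, for which the OpenCL 1.0 full profile requires at least 64 KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout described in the post: 8 tables of 256 entries plus 10 extra
   values, all 64-bit unsigned (ulong in OpenCL C). Names are hypothetical. */
enum { NUM_TABLES = 8, TABLE_ENTRIES = 256, EXTRA_VALUES = 10 };

/* Total footprint of the precomputed data in bytes. */
size_t precomputed_bytes(void)
{
    return (size_t)NUM_TABLES * TABLE_ENTRIES * sizeof(uint64_t)
         + (size_t)EXTRA_VALUES * sizeof(uint64_t);
}

/* Returns 1 if the data fits under a constant-buffer limit given in bytes.
   The limit is an input here; this function does not query any device. */
int fits_in_constant_budget(size_t bytes, size_t device_limit)
{
    assert(device_limit > 0);
    return bytes <= device_limit;
}
```

With these numbers `precomputed_bytes()` gives 16464 bytes, so the data fits under a 64 KB limit. Fitting within the limit says nothing about whether a given driver actually serves the data from a hardware cache.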


      Next question: AFAIR, on the HD4850 the __local address space is emulated in __global memory, so it won't be any faster?

        • Where to keep data?

          Right now I don't see any performance improvement from using the __constant qualifier instead of __global, so currently it doesn't seem to use any cache.

          Yes, local memory is emulated using global memory on the 4800 series, so using local memory will actually degrade your performance.

            • Where to keep data?


              Originally posted by: n0thing Right now I don't see any performance improvement from using the __constant qualifier instead of __global, so currently it doesn't seem to use any cache.

              Are you sure n0thing?

              Let's suppose I have this kernel argument:

              "const int value"

              This value won't be cached in constant memory at the moment??

                • Where to keep data?

                  Unless the OpenCL compiler is really bad I don't see why it wouldn't use the constant buffer/cache!?

                  Have you tried both? Did you see any performance improvement (you would only see one if the kernel is bound by those reads)?

                    • Where to keep data?

                      I am not sure, but from what I observed I didn't see any performance improvement from using the __constant qualifier.


                      • Where to keep data?

                        What I see is two reads from global memory and only one use of the constant cache. But I think that as ATI improves the OpenCL compiler it will begin to use the constant cache, and using __constant will then improve performance. So if you are looking forward, you should use __constant.

                        __kernel void cons(__global float *a, __global float *b, __constant float *c, const float d)
                        {
                            size_t gid = get_global_id(0);
                            b[gid] = a[gid]+c[gid]+d;
                        }

                        ; -------- Disassembly --------------------
                        00 ALU: ADDR(32) CNT(11) KCACHE0(CB0:0-15) KCACHE1(CB1:0-15)
                              0  t: MULLO_INT   ____, R1.x, KC0[1].x
                              1  z: ADD_INT     ____, R0.x, PS0
                              2  y: ADD_INT     ____, PV1.z, KC0[6].x
                              3  x: LSHL        T0.x, PV2.y, (0x00000002, 2.802596929e-45f).x
                              4  y: ADD_INT     ____, KC1[0].x, PV3.x
                                 w: ADD_INT     ____, KC1[2].x, PV3.x
                              5  y: ADD_INT     R0.y, KC1[1].x, T0.x
                                 z: LSHR        R0.z, PV4.w, (0x00000002, 2.802596929e-45f).x
                                 w: LSHR        R0.w, PV4.y, (0x00000002, 2.802596929e-45f).x
                        01 TEX: ADDR(48) CNT(2)
                              6  VFETCH R0.x___, R0.z, fc156 MEGA(4) FETCH_TYPE(NO_INDEX_OFFSET)
                              7  VFETCH R1.x___, R0.w, fc156 MEGA(4) FETCH_TYPE(NO_INDEX_OFFSET)
                        02 ALU: ADDR(43) CNT(4) KCACHE0(CB1:0-15)
                              8  x: LSHR        R1.x, R0.y, (0x00000002, 2.802596929e-45f).x
                                 z: ADD         ____, R0.x, R1.x
                              9  x: ADD         R0.x, KC0[3].x, PV8.z
                        03 MEM_RAT_CACHELESS_STORE_RAW: RAT(1)[R1].x___, R0, MARK VPM
                        END_OF_PROGRAM
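When timing a __constant variant of this kernel against a __global one, it can help to confirm that both still produce the same results. A minimal host-side reference in plain C might look like the sketch below; `cons_reference` is a hypothetical helper, not part of any SDK, and the loop index plays the role of `get_global_id(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the `cons` kernel above: each loop iteration does the
   work of one work-item. The additions are in the same order as in the
   kernel source: (a + c) first, then + d. */
void cons_reference(const float *a, float *b, const float *c, float d, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; ++gid) {
        b[gid] = a[gid] + c[gid] + d;
    }
}
```

The output buffer read back from the device can then be compared element by element against `b`, with a small tolerance if the inputs are not exactly representable as floats.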

                  • Where to keep data?
                    Constant address space pointers do not use the constant cache, since the size of the data behind the pointer is not known at compile time and the hardware constant cache sizes do not conform to the OpenCL spec. We are working on a way to allow programmers to put the data in the constant caches, but currently the data resides in global memory.
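The "size not known at compile time" point has a loose analogy in plain C, sketched below. This is not a description of what the ATI compiler does internally; it only shows that an array declared with its extent exposes its full size to the compiler, while a pointer parameter exposes nothing about the buffer behind it. The table `lookup` and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table with a compile-time-known extent (8 x 256 x 8 bytes). */
static const uint64_t lookup[8][256] = {{0}};

/* The compiler sees the whole object, so sizeof yields its full byte size. */
size_t bytes_visible_for_array(void)
{
    return sizeof lookup;   /* 16384 */
}

/* Through a pointer argument only the pointer itself has a known size;
   the length of the buffer it points to is invisible at compile time. */
size_t bytes_visible_for_pointer(const uint64_t *p)
{
    return sizeof p;        /* size of a pointer, not of the buffer */
}
```

A kernel argument declared as `__constant float *c` is in the second situation: the buffer size is only fixed when the host creates the buffer at run time.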