2 Replies Latest reply on Oct 21, 2010 1:40 PM by douglas125

    Indexes in Kernel's


      I want to process an array that has 40000 items, but I'm thinking of it as 2-dimensional (200x200). So I set globalThreads[0] = 200 and localThreads[0] = 200. Then, using simple logic, I calculate the index in the kernel like this:

      uint tid = get_global_id(0);
      uint lid = get_local_id(0);

      const uint range=200;

      index= tid*range+lid;

      But this seems to be the wrong idea. What's wrong with my logic?

        • Indexes in Kernel's

          I don't know exactly what you want to do here, but if you just want each work-item to access one element, without using local memory in any particular way, then what you want is a 2D NDRange globally. What you seem to have done is create a 1D range with 200 work-items in total, in a single work-group of size 200. With only one work-group, get_global_id(0) and get_local_id(0) return the same value, so your index touches just 200 of the 40000 elements. What you should do instead is create a 200x200 2D NDRange globally. The local size is then chosen for performance. Since 200 is not a power of two, it's probably going to be sub-optimal, but you could use a local size of, say, 10x10, which divides 200 evenly in both dimensions.

          So you'd have something like:

          uint gidx = get_global_id(0);      // column, 0..199

          uint gidy = get_global_id(1);      // row, 0..199

          uint range = get_global_size(0);   // row width, 200 here

          uint index = gidy * range + gidx;  // row-major linear index
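          To see the difference between the two launches, here is a minimal host-side sketch in plain C. It is not OpenCL code: it only replays the index arithmetic for every work-item. The function names are made up for illustration, and it assumes the 1D launch from the question has a single work-group, so the local ID equals the global ID.

```c
#include <string.h>

#define RANGE 200u  /* row width used in the thread */

/* Index formula from the question (tid = global id, lid = local id). */
unsigned index_1d_attempt(unsigned tid, unsigned lid) {
    return tid * RANGE + lid;
}

/* Row-major index for a 2D NDRange, as in the reply above. */
unsigned index_2d(unsigned gidx, unsigned gidy, unsigned width) {
    return gidy * width + gidx;
}

/* Distinct elements touched by the 1D launch: 200 work-items, one group. */
unsigned count_touched_1d(void) {
    static unsigned char seen[RANGE * RANGE];
    unsigned touched = 0;
    memset(seen, 0, sizeof seen);
    for (unsigned id = 0; id < RANGE; ++id) {
        /* Single work-group, so the local id is the same as the global id. */
        unsigned idx = index_1d_attempt(id, id);
        if (!seen[idx]) { seen[idx] = 1; ++touched; }
    }
    return touched;
}

/* Distinct elements touched by the 200x200 2D launch. */
unsigned count_touched_2d(void) {
    static unsigned char seen[RANGE * RANGE];
    unsigned touched = 0;
    memset(seen, 0, sizeof seen);
    for (unsigned gidy = 0; gidy < RANGE; ++gidy) {
        for (unsigned gidx = 0; gidx < RANGE; ++gidx) {
            unsigned idx = index_2d(gidx, gidy, RANGE);
            if (!seen[idx]) { seen[idx] = 1; ++touched; }
        }
    }
    return touched;
}
```

          Under these assumptions, count_touched_1d() reports 200 elements (the diagonal of the 200x200 grid) while count_touched_2d() reports all 40000.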

            • Indexes in Kernel's

              I think what you mean is to use a work dimension of two:

              Call EnqueueNDRangeKernel with GlobalWorkSize = {200, 200}

              Then inside the kernel use

              int tid = get_global_id(0);

              int lid = get_global_id(1);

              ind = tid*200+lid;


              Roughly speaking, a local work-group is a subdivision of the global work size. Refer to the OpenCL architecture chapter in the Khronos specification.
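              The subdivision can be sketched in plain C for one dimension. This is only an illustration of the arithmetic, not OpenCL code: the type and function names are hypothetical, and it assumes a zero global offset and a local size that divides the global size evenly (for example 10 into 200).

```c
/* Hypothetical helper type: position of a work-item in one dimension. */
typedef struct {
    unsigned group;  /* what get_group_id(d) would return */
    unsigned local;  /* what get_local_id(d) would return */
} WorkItemPos;

/* Split a global id into its work-group id and local id. */
WorkItemPos split_global_id(unsigned global_id, unsigned local_size) {
    WorkItemPos p;
    p.group = global_id / local_size;
    p.local = global_id % local_size;
    return p;
}

/* Rebuild the global id: group id * local size + local id. */
unsigned join_global_id(WorkItemPos p, unsigned local_size) {
    return p.group * local_size + p.local;
}

/* Number of work-groups in one dimension (assumes even division). */
unsigned num_groups(unsigned global_size, unsigned local_size) {
    return global_size / local_size;
}
```

              With a global size of 200 and a local size of 10, num_groups gives 20 work-groups per dimension, and a global ID of 137 splits into group 13 with local ID 7.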