2 Replies Latest reply on Feb 11, 2010 2:24 AM by genaganna

    get_num_groups() function not working

    jmyc
      In 2 or more dimensions, get_num_groups() sometimes gives wrong answers.

      I run the attached code with work_dim=2, item_num(0)=8192, item_num(1)=2, group_size(0)=256, group_size(1)=1. The code is just an example. The resulting prob array should contain the IDs of the groups, i.e. the numbers from 0 to 63 (8192/256 = 32 groups in dimension 0, times 2 groups in dimension 1). On the CPU I get the correct answer, but on the GPU the result is:

      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

      It seems that in some cases get_num_groups() does not return the number of groups implied by the sizes passed to clEnqueueNDRangeKernel. For work_dim=1 the program works correctly. I know that on my card the other dimensions are emulated, but OpenCL is all about portability.

      My system: Linux, kernel 2.6.28-16, gcc 4.3.3, CPU AMD64, quad core, GPU: ATI HD4770, fglrx-8.682.2

      __kernel void qsimKernel(__global float * prob)
      {
          float4 s = 0;
          float4 y[25];
          for (unsigned int j = 0; j < 25; j++)
              s += y[j];
          exp(s);

          size_t bid0 = get_group_id(0);
          size_t bid1 = get_group_id(1);
          size_t bid2 = get_group_id(2);
          size_t num_groups0 = get_num_groups(0);
          size_t num_groups1 = get_num_groups(1);
          size_t num_groups2 = get_num_groups(2);
          size_t bid = bid0 + bid1 * num_groups0 + bid2 * num_groups0 * num_groups1;
          prob[bid] = bid;
      }
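
For reference, the expected contents of prob can be reproduced on the host. The following is a minimal sketch in plain C (not OpenCL) that walks the work-groups of a 2D launch and applies the same linear index as the kernel. The helper names num_groups and fill_expected_prob are made up for this illustration, and the sketch assumes each global size is an exact multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side emulation only; nothing here calls the OpenCL runtime. */

/* Number of work-groups along one dimension (hypothetical helper). */
static size_t num_groups(size_t global_size, size_t local_size)
{
    /* Assumption: the global size is an exact multiple of the local size. */
    assert(local_size != 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Fill prob[] with the linear group IDs a correct 2D run should produce.
 * Returns the total number of groups written (hypothetical helper). */
static size_t fill_expected_prob(const size_t global[2], const size_t local[2],
                                 float *prob, size_t capacity)
{
    size_t n0 = num_groups(global[0], local[0]);
    size_t n1 = num_groups(global[1], local[1]);
    assert(n0 * n1 <= capacity);

    for (size_t bid1 = 0; bid1 < n1; bid1++) {
        for (size_t bid0 = 0; bid0 < n0; bid0++) {
            /* Same formula as the kernel, with the third dimension at 0. */
            size_t bid = bid0 + bid1 * n0;
            prob[bid] = (float)bid;
        }
    }
    return n0 * n1;
}
```

Called with global sizes {8192, 2} and local sizes {256, 1}, this gives 32 groups in dimension 0 and 2 in dimension 1, so every slot of prob from 0 to 63 holds its own index.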

        • get_num_groups() function not working
          omkaranathan

          Could you post the complete source code?

          • get_num_groups() function not working
            genaganna

            Originally posted by: jmyc I run the attached code with work_dim=2, item_num(0)=8192, item_num(1)=2, group_size(0)=256, group_size(1)=1. The code is just an example. The resulting prob array should contain the IDs of the groups, i.e. the numbers from 0 to 63. On the CPU I get the correct answer, but on the GPU the result is:

            0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

            It seems that in some cases get_num_groups() does not return the number of groups implied by the sizes passed to clEnqueueNDRangeKernel. For work_dim=1 the program works correctly. I know that on my card the other dimensions are emulated, but OpenCL is all about portability.

            My system: Linux, kernel 2.6.28-16, gcc 4.3.3, CPU AMD64, quad core, GPU: ATI HD4770, fglrx-8.682.2

            Jmyc,

                   Thanks for reporting this. get_num_groups(1) is giving 1 instead of 2.
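
A small host-side check in plain C can make this kind of discrepancy explicit. The sketch below compares group counts read back from a device against the value implied by the launch sizes (global size divided by local size). The function name first_mismatched_dim is hypothetical, and the sketch assumes each global size is an exact multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Compare the group counts a device reported against global/local.
 * Returns the first dimension that disagrees, or -1 if all match.
 * Hypothetical helper for illustration; not part of the OpenCL API. */
static int first_mismatched_dim(const size_t *global, const size_t *local,
                                const size_t *reported, int work_dim)
{
    for (int d = 0; d < work_dim; d++) {
        /* Assumption: global size is an exact multiple of local size. */
        assert(local[d] != 0 && global[d] % local[d] == 0);
        if (reported[d] != global[d] / local[d])
            return d;
    }
    return -1;
}
```

With the launch sizes from this thread and reported counts of {32, 1} (the 32 for dimension 0 is an assumption; only the 1 for dimension 1 is stated above), the check flags dimension 1. Inside a kernel the same quotient can be written as get_global_size(d) / get_local_size(d); whether that sidesteps the problem on the affected driver has not been established here.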