
jmyc
Journeyman III

get_num_groups() function not working

In 2 or more dimensions, get_num_groups() sometimes gives wrong answers

I run the attached code with work_dim=2, item_num(0)=8192, item_num(1)=2, group_size(0)=256, group_size(1)=1. The code is just an example. The resulting prob array should contain the IDs of the groups, i.e. the numbers from 0 to 63 (8192/256 = 32 groups in dimension 0, times 2 groups in dimension 1). On the CPU I get the correct answer, but on the GPU the result is:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
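To locate where an output like this goes wrong, a small host-side check can help. This is only a sketch: the helper name first_mismatch is hypothetical, and it assumes the prob buffer has already been read back into a host float array. Since every group should write its own linear ID, the first element that differs from its index marks the first bad group; for the output above that is index 40.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first element whose
   value differs from its own index, or n if every element matches. */
size_t first_mismatch(const float *prob, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (prob[i] != (float)i)
            return i;
    }
    return n;
}
```

Calling first_mismatch(prob, 64) on a correct result returns 64.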

It seems that in some cases get_num_groups() does not return the number of groups implied by the global and local work sizes passed to clEnqueueNDRangeKernel. For work_dim=1 the program works correctly. I know that on my card the other dimensions are emulated, but OpenCL is supposed to be all about portability.
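For reference, the values the kernel should see can be worked out on the host. The following is a minimal sketch with hypothetical helper names (expected_num_groups, linear_group_id), not code from the original post. It assumes each global size is a multiple of the local size, and that for work_dim=2 the unused third dimension reports get_group_id(2) = 0 and get_num_groups(2) = 1.

```c
#include <stddef.h>

/* Number of work-groups along one dimension: global size / local size.
   Assumes the global size is an exact multiple of the local size. */
size_t expected_num_groups(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}

/* Linear group index, using the same formula as the kernel:
   bid0 + bid1 * num_groups0 + bid2 * num_groups0 * num_groups1 */
size_t linear_group_id(const size_t gid[3], const size_t ngroups[3])
{
    return gid[0] + gid[1] * ngroups[0] + gid[2] * ngroups[0] * ngroups[1];
}
```

With the launch parameters above, expected_num_groups(8192, 256) is 32 and expected_num_groups(2, 1) is 2, so the last group (31, 1, 0) maps to linear index 63.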

My system: Linux, kernel 2.6.28-16, gcc 4.3.3, CPU AMD64, quad core, GPU: ATI HD4770, fglrx-8.682.2

 

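__kernel void qsimKernel(__global float * prob)
{
    /* Dummy work; y is left uninitialized, as in the original example. */
    float4 s = 0;
    float4 y[25];
    for (unsigned int j = 0; j < 25; j++)
        s += y[j];   /* posted text read "s += y"; the index was presumably lost */
    exp(s);

    /* Group index in each dimension. */
    size_t bid0 = get_group_id(0);
    size_t bid1 = get_group_id(1);
    size_t bid2 = get_group_id(2);

    /* Number of groups in each dimension. */
    size_t num_groups0 = get_num_groups(0);
    size_t num_groups1 = get_num_groups(1);
    size_t num_groups2 = get_num_groups(2);

    /* Linear group index; each group writes its own index. */
    size_t bid = bid0 + bid1 * num_groups0 + bid2 * num_groups0 * num_groups1;
    prob[bid] = bid;
}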

2 Replies
omkaranathan
Adept I

Could you post the complete source code?

genaganna
Journeyman III

Originally posted by: jmyc

I run the attached code with work_dim=2, item_num(0)=8192, item_num(1)=2, group_size(0)=256, group_size(1)=1. The code is just an example. The resulting prob array should contain the IDs of the groups, i.e. the numbers from 0 to 63. On the CPU I get the correct answer, but on the GPU the result is:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

It seems that in some cases get_num_groups() does not return the number of groups implied by the global and local work sizes passed to clEnqueueNDRangeKernel. For work_dim=1 the program works correctly. I know that on my card the other dimensions are emulated, but OpenCL is supposed to be all about portability.

My system: Linux, kernel 2.6.28-16, gcc 4.3.3, CPU AMD64, quad core, GPU: ATI HD4770, fglrx-8.682.2

Jmyc,

Thanks for reporting this. get_num_groups(1) is returning 1 instead of 2.
