Problem with the output when using more than one work-group

Discussion created by tameem on Sep 18, 2011
Latest reply on Sep 19, 2011 by tameem

I have a problem with my kernel's output. The kernel works correctly when I use one work-group, but with more than one I cannot make sense of the result. For example, here is what happens when I use 2 work-groups and do the following:

 

This is the kernel code:

    #define BLOCK_SIZE 16
    #define BLOCK_COL 4

    __kernel void exmple1(const __global float *C1,
                          __global float *O,
                          const int col,
                          const int hard)
    {
        int ar = get_global_id(0);
        __local float C[BLOCK_SIZE][BLOCK_COL];

        if (ar < col * hard)    // col = 4, hard = 3
        {
            C[ar / col][ar % col] = C1[ar];    // col = 4
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        O[ar] = C[0][3];    // I update this and C[0][3] = 1
    }

The result differs between the 2 work-groups. For example, with 8 work-items (4 work-items in each work-group), the result is as follows:

    0= 1
    1= 1
    2= 1
    3= 1
    4= 0
    5= 0
    6= 0
    7= 0

The first 4 results are correct, but the other 4 are wrong, since they should be '1'.
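A possible explanation, offered as a sketch rather than a confirmed diagnosis: __local memory is private to each work-group, and get_global_id(0) returns 4..7 in the second work-group, so that group only writes C[1][0..3] in its own copy of C and never writes C[0][3]. The plain C program below emulates this on the host. The names run_kernel and cooperative_load are hypothetical, every element of C1 is assumed to be 1, and each group's copy of C is zero-filled only to make the output deterministic (real __local memory is uninitialized, so the 0 values in the post are not guaranteed).

```c
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 16
#define BLOCK_COL 4

/* Host-side emulation of one kernel launch (hypothetical helper).
 * cooperative_load == 0: load pattern from the post, indexed by global id.
 * cooperative_load == 1: every work-group loads all col * hard elements
 *                        into its own copy of C, striding by local id. */
static void run_kernel(const float *C1, float *O, int col, int hard,
                       int global_size, int local_size, int cooperative_load)
{
    int num_groups = global_size / local_size;

    for (int group = 0; group < num_groups; ++group) {
        /* One private copy of C per work-group, standing in for __local.
         * Zero-filled here for determinism only. */
        float C[BLOCK_SIZE][BLOCK_COL];
        memset(C, 0, sizeof C);

        /* Phase 1: the code each work-item runs before the barrier. */
        for (int lid = 0; lid < local_size; ++lid) {
            int ar = group * local_size + lid;   /* get_global_id(0) */
            if (cooperative_load) {
                for (int i = lid; i < col * hard; i += local_size)
                    C[i / col][i % col] = C1[i];
            } else if (ar < col * hard) {
                C[ar / col][ar % col] = C1[ar];
            }
        }

        /* The barrier sits here: all work-items of this group are done
         * with phase 1 before any of them starts phase 2. */

        /* Phase 2: the code each work-item runs after the barrier. */
        for (int lid = 0; lid < local_size; ++lid) {
            int ar = group * local_size + lid;
            O[ar] = C[0][3];
        }
    }
}

/* Prints the output in the same "index= value" form used in the post. */
static void print_output(const char *label, const float *O, int n)
{
    printf("%s\n", label);
    for (int i = 0; i < n; ++i)
        printf("%d= %g\n", i, O[i]);
}
```

Calling run_kernel(C1, O, 4, 3, 8, 4, 0) with twelve 1.0f values in C1 gives 1 for work-items 0..3 and 0 for work-items 4..7 in this emulation, matching the output above; passing 1 as the last argument gives 1 for all eight. In the kernel itself, the corresponding change would be to index the load loop with get_local_id(0) and step by get_local_size(0), so that each work-group fills its own C before the barrier.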
