
atomic operation

Question asked by Anon5710 on Apr 30, 2012
Latest reply on May 16, 2012 by MicahVillmow

Hello,

 

I have written a kernel that compares two groups of integers against each other; whenever there is a match, a counter is incremented.

To avoid running into the 128 MB buffer limit, I decided to use atomic operations to give me an index, which I use as the position to write to when the counter is high enough.
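
To make the intent concrete, here is a minimal serial sketch in plain C11 (not the kernel itself) of the slot reservation I am trying to get. The function record_match and its parameter names are made up for this illustration; atomic_fetch_add from <stdatomic.h> stands in for OpenCL's atomic_add, which likewise returns the value held before the addition.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical serial reference, NOT the kernel: names are illustrative only.
 * out_index[x] holds the next free row for column x. Each match reserves two
 * rows (one for y, one for the counter), mirroring atomic_add(&idx, 2). */
static void record_match(int *output, atomic_int *out_index,
                         int count, int x, int y, int counter)
{
    assert(x >= 0 && x < count);

    /* atomic_fetch_add returns the value held BEFORE the addition,
     * so every caller obtains its own pair of rows. */
    int index = atomic_fetch_add(&out_index[x], 2);

    output[x + index * count]       = y;       /* first reserved row  */
    output[x + (index + 1) * count] = counter; /* second reserved row */
}
```

With count = 3 and out_index zeroed, calling record_match twice for x = 1 should leave out_index[1] at 4 and fill rows 0 to 3 of column 1 with the two (y, counter) pairs, which is the layout I expect from the kernel.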

 

__kernel void compare( __global int* input, __global int* output, const unsigned int count, __local int * shared , __local int * suspect, __local int * indexes, __global int* outIndex )
{
    int x = get_global_id(0); //get global idx
    int y = get_global_id(1); //get global idy

    int local_idx = get_local_id(0);

    int k,i,j;
    int counter=0;
    int index;
        
    if( x < count && y < count && x != y)
    {
        //store each subgroup (of hashes) in local memory 
        //offset each variable by workgroup size bytes
        for(k=0;k<REP*2;k++)
        {
            shared[local_idx+k*64] = input[x+k*count]; 
            suspect[k] = input[y+k*count];            //does the one exception to bank conflicts also work when writing ?
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        //store outIndex global values in local index values
        //has a coalesced access pattern
        indexes[local_idx] = outIndex[x];
                
        for(i=0;i<REP*2;i++)
        {
            for(j=0;j<REP*2;j++)
            {
                if(suspect[i] == shared[local_idx+j*64])
                {
                    //printf("Succes : %d = %d -> (%d,%d) \n",suspect[i],shared[local_idx+j*64],x,y);
                    counter++;
                }
            }
        }

        //output[x+y*count] = counter;
        if(counter >  REP/2)
        {
            index = atomic_add( &indexes[local_idx], 2 );
            output[ x +index*count ] = y;
            output[ x +index*count+count ] = counter;
            //if( x == 65 )
                //printf("(%d,%d,%d)\n",y,counter,index);
        }
        barrier(CLK_LOCAL_MEM_FENCE);

        //restore outindex in global memory;
        outIndex[x] = indexes[local_idx];
        barrier(CLK_LOCAL_MEM_FENCE);

    }
}

 

When this code is run with lines 45 and 46 of the kernel (the if( x == 65 ) and printf lines right after the writes to output) uncommented, it works perfectly and I get 20 or so positive results.

When lines 45 and 46 are commented out, I get only 1 result back, the last one. So I'm guessing there is a problem with the atomic operation, but honestly I have no clue how to fix this.

 

Does anyone know what I'm doing wrong?
