Fr4nz

Cannot correctly modify __local uint4 vector variables: serious compiler bug?

Discussion created by Fr4nz on Jan 15, 2010
Latest reply on Jan 15, 2010 by Fr4nz
If yes, it is very serious...

Hello,

after posting yesterday about my problems with __local uint4 vector variables, I tried to understand the problem better: it seems that I can read these variables without problems, but I can't modify them correctly in the "usual" way (see below for what I mean).

Consider this simple kernel (ignore the parameters that aren't used inside the kernel...):

 

/**
 * Kernel: uint4 vector test;
 */
__kernel
void
intersect(__global uint4 *inputVD,
          __global uint *inputPrefixCk,
          __global uint *outputCounts,
          __local uint4 *tmpBuffer,
          const int k,
          const int dimRow) {

    // Simple mask
    const uint4 MASK = (uint4) (1);

    // Recover global ID of the work item;
    const int gid = get_global_id(0);

    // Test: modify __local vector variable with vstore4;
    vstore4(4,0,tmpBuffer+gid); // OK here....

    // Test: modify __local vector variable with an operator (in this case "+=");
    tmpBuffer[gid] += MASK; // NOT ok here: tmpBuffer is not updated correctly! Only the "x" and "w" components are updated...
    
    // Final output of the work-item;
    outputCounts[gid] = tmpBuffer[gid].x + tmpBuffer[gid].y + tmpBuffer[gid].z + tmpBuffer[gid].w;

}



As you can see from the source comments, when I modify a __local uint4 location with "vstore4" everything works, but if I try a trivial operation (in the example, adding another uint4 vector variable) only the "x" and "w" components (the first and the last) are updated.

Is this normal? What is the explanation for this? A compiler bug, maybe? The only workaround I've found to update a __local vector variable correctly is to use vstore4. In fact, if I replace the non-working line with:

 

    // Test: store the result of the sum with vstore4 instead of using a simple sum...
    vstore4(tmpBuffer[gid] + MASK,0,tmpBuffer+gid); // Again ok here....


Then everything works fine. So it really seems there is a problem when writing a vector "implicitly" (that is, without calling vstore4 explicitly): the "y" and "z" components either aren't updated or take strange values...
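For reference, the value each work-item should write to outputCounts can be worked out on the CPU. The plain C sketch below is only an illustration under stated assumptions: `uint4_ref`, `add4`, `sum4`, `expected_count` and `count_mismatches` are hypothetical helper names that do not appear in the kernel, and it assumes that vstore4(4, ...) leaves the value 4 in all four components, as the "OK here" comment suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPU stand-in for OpenCL's uint4 (hypothetical helper type, not from the post). */
typedef struct { uint32_t x, y, z, w; } uint4_ref;

/* Component-wise addition: what "+=" on a uint4 is supposed to do. */
uint4_ref add4(uint4_ref a, uint4_ref b) {
    uint4_ref r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

/* Sum of the four components, mirroring the kernel's final output line. */
uint32_t sum4(uint4_ref v) {
    return v.x + v.y + v.z + v.w;
}

/* Value each work-item should write if all four components are updated. */
uint32_t expected_count(void) {
    const uint4_ref stored = { 4, 4, 4, 4 }; /* assumed result of vstore4(4, ...) */
    const uint4_ref mask   = { 1, 1, 1, 1 }; /* (uint4)(1) replicates the scalar */
    return sum4(add4(stored, mask));         /* (4 + 1) * 4 = 20 */
}

/* Number of entries in a read-back outputCounts array that differ
 * from the expected value. */
size_t count_mismatches(const uint32_t *counts, size_t n) {
    assert(counts != NULL || n == 0);
    const uint32_t want = expected_count();
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        if (counts[i] != want) {
            ++bad;
        }
    }
    return bad;
}
```

With a correct compiler every entry of outputCounts should therefore be 20. If only "x" and "w" were incremented and "y" and "z" kept the value 4 (an assumption, since they may also hold other values), the kernel would write 5 + 4 + 4 + 5 = 18 instead.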
