
Fr4nz
Journeyman III

Cannot correctly modify __local uint4 vector variables: serious compiler bug?

If yes, it is very serious...

Hello,

After posting yesterday about my problems with __local uint4 vector variables, I tried to understand the problem better. It seems that I can read these variables without problems, but I can't modify them correctly in the "usual" way (see below for what I mean).

Consider this simple kernel (ignore the parameters that aren't used inside the kernel...):

/**
 * Kernel: uint4 vector test.
 */
__kernel
void
intersect(__global uint4 *inputVD,
          __global uint *inputPrefixCk,
          __global uint *outputCounts,
          __local uint4 *tmpBuffer,
          const int k,
          const int dimRow) {

    // Simple mask
    const uint4 MASK = (uint4) (1);

    // Recover global ID of the work-item
    const int gid = get_global_id(0);

    // Test: modify __local vector variable with vstore4
    vstore4(4,0,tmpBuffer+gid); // OK here

    // Test: modify __local vector variable with an operator (in this case "+")
    tmpBuffer[gid] += MASK; // NOT OK here: tmpBuffer is not updated correctly; only the "x" and "w" components are updated

    // Final output of the work-item
    outputCounts[gid] = tmpBuffer[gid].x + tmpBuffer[gid].y + tmpBuffer[gid].z + tmpBuffer[gid].w;

}



As the source comments show, when I modify a __local uint4 vector location with "vstore4" everything works, but if I try a trivial operation (in the example, adding another uint4 vector variable) only the "x" and "w" components (the first and the last) are updated.
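To make the expected result concrete, here is a minimal host-side sketch in plain C of what each work-item should write to outputCounts. It is not OpenCL API code. The names host_uint4, splat, add4, expected_count, count_if_only_xw_updated and count_mismatches are hypothetical, and it assumes that the vstore4 call leaves 4 in all four components and that MASK is (1, 1, 1, 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-in for OpenCL's uint4 (hypothetical name). */
typedef struct { uint32_t x, y, z, w; } host_uint4;

/* Broadcast one scalar to all four components, like (uint4)(v). */
static host_uint4 splat(uint32_t v) {
    host_uint4 r = { v, v, v, v };
    return r;
}

/* Component-wise addition, which is what uint4 + uint4 means in OpenCL C. */
static host_uint4 add4(host_uint4 a, host_uint4 b) {
    host_uint4 r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

/* Value every work-item should write if "+=" updates all components. */
static uint32_t expected_count(uint32_t stored, uint32_t mask) {
    host_uint4 v = add4(splat(stored), splat(mask));
    return v.x + v.y + v.z + v.w;
}

/* Value written if only x and w receive the addition and y and z keep
 * their stored value (one of the outcomes described in the post). */
static uint32_t count_if_only_xw_updated(uint32_t stored, uint32_t mask) {
    host_uint4 v = splat(stored);
    v.x += mask;
    v.w += mask;
    return v.x + v.y + v.z + v.w;
}

/* Count the entries of a read-back output buffer that differ from the
 * expected per-work-item value. */
static size_t count_mismatches(const uint32_t *counts, size_t n,
                               uint32_t expected) {
    size_t bad = 0;
    assert(counts != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (counts[i] != expected) {
            ++bad;
        }
    }
    return bad;
}
```

Under these assumptions, expected_count(4, 1) is 20, while count_if_only_xw_updated(4, 1) is 18. If "y" and "z" hold arbitrary values instead of staying at 4, the observed sum will differ from both, so count_mismatches against 20 is the more general check on the buffer read back from the device.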

Is this normal? What is the explanation? Is it a compiler bug, maybe? The only workaround I've found to update a __local vector variable correctly is to use vstore4. In fact, if I replace the non-working line with:

    // Test: store the result of the sum with vstore4 instead of using a simple sum
    vstore4(tmpBuffer[gid] + MASK,0,tmpBuffer+gid); // Again OK here


then everything works fine. So it really seems there is a problem when writing a vector "implicitly" (that is, without using vstore4 explicitly), because the "y" and "z" components either aren't updated or take strange values...

25 Replies