lexknuther
Journeyman III

Catalyst 13.4 Generates Incorrect Kernel Results

After upgrading to Catalyst 13.4, my kernel is now producing incorrect results.  With Catalyst 12.10 and 13.1, everything worked.  It's failing on a fairly simple function:

uint F(uint x, uint y, uint z) {
    uint result = x & y | ~x & z;
    printf("F(0x%08X, 0x%08X, 0x%08X) = 0x%08X\n", x, y, z, result);
    return result;
}

The function works for the first three invocations; on the 4th, it starts producing incorrect results.

Correct output from printf in 12.10/13.1:

F(0xEFCDAB89, 0x98BADCFE, 0x10325476) = 0x98BADCFE

F(0xEEE046D0, 0xEFCDAB89, 0x98BADCFE) = 0xFEDA9AAE

F(0x2F92ADCF, 0xEEE046D0, 0xEFCDAB89) = 0xEECD06C0

F(0x1D867DCA, 0x2F92ADCF, 0xEEE046D0) = 0xEFE22FDA (difference)

Incorrect output from printf in 13.4:

F(0xEFCDAB89, 0x98BADCFE, 0x10325476) = 0x98BADCFE

F(0xEEE046D0, 0xEFCDAB89, 0x98BADCFE) = 0xFEDA9AAE

F(0x2F92ADCF, 0xEEE046D0, 0xEFCDAB89) = 0xEECD06C0

F(0x1D867DCA, 0x2F92ADCF, 0xEEE046D0) = 0xFFE67FDA (difference)

I've also noticed that the IL output from KernelAnalyzer for 13.4 is significantly shorter than it was for 12.10. The 13.1 output is different too, though not nearly as different as the 12.10 output.

P.S.  I'm testing on the Tahiti platform with a Radeon 7970.  I've also attached a simple kernel that demonstrates the problem with the following input parameters:

key = {0x8EA5B689, 0x4E29529A, 0x0A639456, 0xE4E95734}.

vmiura
Adept II

We came across a bug with exactly this expression, x & y | ~x & z.  The compiler optimized it into incorrect code.

A workaround is to use bitselect() directly.
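For readers who want to see what the workaround looks like, here is a minimal sketch. It is written as plain C so it can be checked on the host; `bitselect_u32`, `F_reference`, `F_bitselect` and `check_thread_values` are hypothetical names introduced for illustration, and `bitselect_u32` only emulates the OpenCL C builtin for 32-bit values. Per the OpenCL C specification, `bitselect(a, b, c)` takes each result bit from `b` where the matching bit of `c` is 1 and from `a` otherwise, so the expression in the original post should correspond to `bitselect(z, y, x)` in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side emulation of OpenCL C's bitselect(a, b, c) for 32-bit values:
   each result bit comes from b where the matching bit of c is 1, else from a. */
uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* The expression from the original post, with precedence made explicit. */
uint32_t F_reference(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) | (~x & z);
}

/* Workaround form. In the kernel this would read: bitselect(z, y, x). */
uint32_t F_bitselect(uint32_t x, uint32_t y, uint32_t z) {
    return bitselect_u32(z, y, x);
}

/* Checks both forms against values printed by Catalyst 12.10/13.1 above. */
void check_thread_values(void) {
    assert(F_reference(0xEFCDAB89u, 0x98BADCFEu, 0x10325476u) == 0x98BADCFEu);
    assert(F_bitselect(0xEFCDAB89u, 0x98BADCFEu, 0x10325476u) == 0x98BADCFEu);
    assert(F_reference(0x1D867DCAu, 0x2F92ADCFu, 0xEEE046D0u) == 0xEFE22FDAu);
    assert(F_bitselect(0x1D867DCAu, 0x2F92ADCFu, 0xEEE046D0u) == 0xEFE22FDAu);
}
```

Calling `check_thread_values()` on the host confirms that the rewritten form agrees with the 12.10/13.1 output for the 4th invocation (0xEFE22FDA), not the 13.4 output.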

Thanks for the workaround.  I'm surprised that the optimizing compiler doesn't just reduce x & y | ~x & z to bitselect() rather than generating bad code.

Yeah.  Actually, it did replace x & y | ~x & z with bitselect(), but with the arguments mixed up.
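To illustrate why the argument order matters, the sketch below tries all six orderings of (x, y, z) as arguments to a host-side stand-in for `bitselect(a, b, c)` and counts how many reproduce x & y | ~x & z. The names `bitselect_u32`, `count_matching_orders` and `check_only_one_order_matches` are hypothetical, and the thread does not say which ordering the 13.4 compiler actually emitted, so this is only a demonstration that a mix-up changes the result. The test words 0xF0, 0xCC and 0xAA are chosen so that their low eight bits cover every combination of one bit from each of x, y and z.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for OpenCL C's bitselect(a, b, c):
   result bit = bit of b where c has a 1, bit of a where c has a 0. */
uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* Counts how many of the six argument orders reproduce x & y | ~x & z. */
int count_matching_orders(void) {
    /* Low 8 bits enumerate all 8 combinations of (x bit, y bit, z bit). */
    const uint32_t v[3] = { 0xF0u, 0xCCu, 0xAAu };   /* x, y, z */
    const uint32_t expected = (v[0] & v[1]) | (~v[0] & v[2]);
    static const int perm[6][3] = {
        {0, 1, 2}, {0, 2, 1}, {1, 0, 2},
        {1, 2, 0}, {2, 0, 1}, {2, 1, 0}
    };
    int matches = 0;
    for (int i = 0; i < 6; ++i) {
        uint32_t r = bitselect_u32(v[perm[i][0]], v[perm[i][1]], v[perm[i][2]]);
        if (r == expected) {
            ++matches;
        }
    }
    return matches;
}

/* Only the order (z, y, x) should match, so exactly one ordering passes. */
void check_only_one_order_matches(void) {
    assert(count_matching_orders() == 1);
    assert(bitselect_u32(0xAAu, 0xCCu, 0xF0u) == ((0xF0u & 0xCCu) | (~0xF0u & 0xAAu)));
}
```

Because the expression is purely bitwise, agreement on these three words implies agreement on all inputs, so any of the other five orderings is a miscompile of the original expression.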