After upgrading to Catalyst 13.4, my kernel is now producing incorrect results. With Catalyst 12.10 and 13.1, everything worked. It's failing on a fairly simple function:
uint F(uint x, uint y, uint z) {
    uint result = x & y | ~x & z;
    printf("F(0x%08X, 0x%08X, 0x%08X) = 0x%08X\n", x, y, z, result);
    return result;
}
The function works for the first three invocations; starting with the 4th, however, it produces incorrect results.
Correct output from printf in 12.10/13.1:
F(0xEFCDAB89, 0x98BADCFE, 0x10325476) = 0x98BADCFE
F(0xEEE046D0, 0xEFCDAB89, 0x98BADCFE) = 0xFEDA9AAE
F(0x2F92ADCF, 0xEEE046D0, 0xEFCDAB89) = 0xEECD06C0
F(0x1D867DCA, 0x2F92ADCF, 0xEEE046D0) = 0xEFE22FDA (difference)
Incorrect output from printf in 13.4:
F(0xEFCDAB89, 0x98BADCFE, 0x10325476) = 0x98BADCFE
F(0xEEE046D0, 0xEFCDAB89, 0x98BADCFE) = 0xFEDA9AAE
F(0x2F92ADCF, 0xEEE046D0, 0xEFCDAB89) = 0xEECD06C0
F(0x1D867DCA, 0x2F92ADCF, 0xEEE046D0) = 0xFFE67FDA (difference)
I've also noticed that the IL output from KernelAnalyzer for 13.4 is significantly shorter than it was for 12.10. The IL for 13.1 also differs from 12.10, though not nearly as much as the 13.4 output does.
P.S. I'm testing on the Tahiti platform with a Radeon 7970. I've also attached a simple kernel that demonstrates the problem with the following input parameters:
key = {0x8EA5B689, 0x4E29529A, 0x0A639456, 0xE4E95734}.
We came across a bug with exactly x & y | ~x & z. It got optimized to the wrong code.
A workaround is to use bitselect() directly.
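For anyone who wants to see the workaround spelled out, here is a minimal host-side sketch in plain C. It assumes the OpenCL definition of bitselect(a, b, c), which takes each result bit from b where the corresponding bit of c is 1 and from a otherwise, so the kernel-side replacement for x & y | ~x & z would be bitselect(z, y, x). The names bitselect_u32, F_ref, F_sel and self_check are made up for illustration and are not part of any AMD or OpenCL API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's bitselect(a, b, c):
   result bit comes from b where c has a 1, otherwise from a. */
uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* The expression from the original post, with explicit parentheses. */
uint32_t F_ref(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) | (~x & z);
}

/* The workaround form. In the kernel this would read:
       uint result = bitselect(z, y, x);                  */
uint32_t F_sel(uint32_t x, uint32_t y, uint32_t z) {
    return bitselect_u32(z, y, x);
}

/* Checks both forms against the four inputs and the 12.10/13.1
   results quoted in the post. */
void self_check(void) {
    static const uint32_t cases[4][4] = {
        /* x           y           z           expected */
        {0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0x98BADCFEu},
        {0xEEE046D0u, 0xEFCDAB89u, 0x98BADCFEu, 0xFEDA9AAEu},
        {0x2F92ADCFu, 0xEEE046D0u, 0xEFCDAB89u, 0xEECD06C0u},
        {0x1D867DCAu, 0x2F92ADCFu, 0xEEE046D0u, 0xEFE22FDAu},
    };
    for (int i = 0; i < 4; ++i) {
        uint32_t x = cases[i][0], y = cases[i][1], z = cases[i][2];
        assert(F_ref(x, y, z) == cases[i][3]);
        assert(F_sel(x, y, z) == cases[i][3]);
    }
}
```

Calling self_check() from a small test program is enough to confirm that the two forms agree on the inputs from this thread, including the 4th one where 13.4 goes wrong. Note the argument order: the mask x goes last, and the value selected when a mask bit is 0 (z) goes first.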
Thanks for the workaround. I'm surprised that the optimizing compiler doesn't just reduce x & y | ~x & z to bitselect rather than generating bad code.
Yeah. Actually it replaced x & y | ~x & z with bitselect, but with the arguments mixed up.