I've been working on isolating a malfunction in one of my OpenCL kernels that shows up on a Hawaii card (R9 290) in 64-bit mode. I hope to provide a much more isolated testcase than what I have now. The bug appears to be that multiplies/divides on 32-bit floats come out off by a factor of 3 and/or 1/3, which is what I saw in the kernel debugger yesterday. I think the 3 may have come from the result of a previous calculation. Otherwise the kernel follows the logic OK even though the math has become wrong. The only other AMD card I have is an old Turks card, and it has no issue.
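One way to narrow this down without the kernel debugger would be to compare the values read back from the GPU against a host-side reference and flag any element that is off by a factor of 3 or 1/3. This is only a rough sketch, assuming the kernel output has already been copied into a plain list of floats; `classify_ratio` and `find_scaled_errors` are hypothetical helper names, and the tolerance is an arbitrary choice:

```python
import math

def classify_ratio(expected, actual, factor=3.0, rel_tol=1e-4):
    """Label one result: 'ok', 'times' (actual ~= expected * factor),
    'over' (actual ~= expected / factor), or 'other'."""
    if math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=1e-12):
        return "ok"
    if math.isclose(actual, expected * factor, rel_tol=rel_tol):
        return "times"
    if math.isclose(actual, expected / factor, rel_tol=rel_tol):
        return "over"
    return "other"

def find_scaled_errors(expected, actual, factor=3.0, rel_tol=1e-4):
    """Return (index, label) for every element that does not match the reference."""
    mismatches = []
    for i, (e, a) in enumerate(zip(expected, actual)):
        label = classify_ratio(e, a, factor, rel_tol)
        if label != "ok":
            mismatches.append((i, label))
    return mismatches
```

If most mismatches come back as "times" or "over" rather than "other", that would support the factor-of-3 theory and point at which work-items to look at first.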
I figure the bug is in the compiler toolchain; I'm guessing maybe one of amdocl_as64.exe or amdocl_ld64.exe. The bug is specific to 64-bit, but I verified yesterday that if I load the GPU binary generated by the 64-bit build into the 32-bit version of my program, the same issue occurs. (I didn't exactly prove the converse, since I switched drivers before I thought to save my 32-bit generated binary.)
At this moment I suspect there may be some installer-related issue as well. On the two systems (Win 8.1 x64 & Win 7 x64) where this has been encountered, the system was "clean" before 14.12 Omega was installed. On Win 7 x64, the only previous card was an old FirePro based on Redwood, I think; then the R9 was dropped in and 14.12 installed. On 8.1, I'd never previously encountered it and had been using Cat 14.4. For unrelated reasons on that system, I had forcibly uninstalled all GPU drivers (Nvidia as well as AMD) and cleaned out the Registry. Then I put in 14.12 and presto, repro'd the issue. After uninstalling that, I retested 14.4, 14.9, and then finally 14.12, none of which reproduced it. (As of 14.12, clinfo says "OpenCL C 2.0".) The way I installed these was simply running the installer, rebooting, and testing, in that order.
I've made this lower priority for myself since I have a workaround available simply by switching drivers, but I still need to dig into it further. I've noted that at present on my Win 8.1 system, where it's not reproducing after having installed 14.12, amdocl_as64.exe and amdocl_ld64.exe in C:\Windows\System32 are dated 7/21/2014 10:04 PM, but the FileRepository has versions dated 11/20/2014 7:33 PM. So I think the installer should have updated those tools and apparently did not. When I re-examine it, I'll restage and try 14.12 again on a clean configuration, and I might get a chance to examine the generated code to see if I can spot the difference. Or I can just swap in the versions of those tools from the repository and see if it will repro again.
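Since timestamps alone can be misleading, it may be worth checking whether the two copies of each tool actually differ byte for byte. A small sketch of how that could be done; `file_fingerprint` and `same_contents` are hypothetical helper names, and the paths passed in are whatever the two copies turn out to be on a given system:

```python
import hashlib
import os
from datetime import datetime

def file_fingerprint(path):
    """Return size, modification time, and SHA-256 digest for one file."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded all at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    st = os.stat(path)
    return {
        "size": st.st_size,
        "mtime": datetime.fromtimestamp(st.st_mtime),
        "sha256": sha.hexdigest(),
    }

def same_contents(path_a, path_b):
    """True if both files hold identical bytes (timestamps are ignored)."""
    a, b = file_fingerprint(path_a), file_fingerprint(path_b)
    return a["size"] == b["size"] and a["sha256"] == b["sha256"]
```

This would need to run under a 64-bit Python, because a 32-bit process on 64-bit Windows gets C:\Windows\System32 redirected to SysWOW64 and would end up inspecting different files.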
I'm curious whether anyone else has seen anything similar, e.g., the incorrect arithmetic, or files that didn't get updated by the installer?