I have come across what I am convinced has to be a bug in the OpenCL compiler. I guess this needs to be posted on the OpenCL forum. The bug only occurs in specific circumstances, namely:
64-bit system
Relatively complex kernel, using sub-functions, loops, #pragma unroll directives, etc.
Device: Hawaii (290X) (OK on Tahiti – I do not have any other boards readily available to try it on)
Using the ‘-cl-opt-disable’ option also makes the kernel operate correctly
I am using the latest AMD Driver: 15.200.1062.1004 (03/08/2015)
I have simplified the kernel as much as I can while still reproducing the issue, and I have created a simple self-contained console program that demonstrates the problem. A Visual Studio 2013 solution is in the attached zip file, along with further details (see the Read.Me).
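For anyone who wants to compare the optimised and non-optimised builds themselves, here is a minimal sketch of how the build options string could be toggled on the host side. This is not the code from the attached solution: the helper name make_build_options and its parameters are hypothetical and only for illustration. The only real pieces are the standard '-cl-opt-disable' build option and the clBuildProgram call shown in the comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the attached project): writes the
 * OpenCL build options into buf, appending "-cl-opt-disable" when
 * disable_opt is non-zero. base_options may be NULL or empty.
 * Returns 0 on success, -1 if buf is too small to hold the result.
 */
int make_build_options(char *buf, size_t buf_size,
                       const char *base_options, int disable_opt)
{
    assert(buf != NULL && buf_size > 0);

    const char *base = (base_options != NULL) ? base_options : "";
    const char *flag = disable_opt ? "-cl-opt-disable" : "";
    /* Only insert a separating space when both parts are non-empty. */
    const char *sep = (base[0] != '\0' && flag[0] != '\0') ? " " : "";

    int n = snprintf(buf, buf_size, "%s%s%s", base, sep, flag);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}

/*
 * Assumed usage with an already created cl_program and cl_device_id:
 *
 *   char opts[128];
 *   make_build_options(opts, sizeof opts, NULL, 1);
 *   cl_int err = clBuildProgram(program, 1, &device, opts, NULL, NULL);
 *
 * Building the same program once with disable_opt = 0 and once with
 * disable_opt = 1, then comparing the kernel output, should show whether
 * the wrong results depend on compiler optimisation.
 */
```

Running the same kernel and input data through both builds and comparing the output buffers element by element is the comparison I have in mind here.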