I have come across what I am convinced has to be a bug in the OpenCL compiler. I guess this needs to be posted on the OpenCL forum. The bug only occurs in specific circumstances, namely:
64 bit system
Relatively complex kernel, using sub-functions, loops, #pragma unroll directives, etc.
Device: Hawaii (290X) (OK on Tahiti – I do not have any other boards readily available to try it on)
Using the ‘-cl-opt-disable’ option also makes the kernel operate correctly
I am using the latest AMD Driver: 15.200.1062.1004 (03/08/2015)
I have simplified the kernel as much as I can while still exhibiting the issue, and I have created a simple self-contained console program that demonstrates the problem. A Visual Studio solution for VS2013 is in the attached zip file, along with further details (see the Read.Me).
Hi Mark,
Welcome!
I have white-listed you, so you should be able to directly post in the appropriate developer forum. As this question seems related to the OpenCL forum, I am moving it there -- the experts there should help.
--Prasad
Hi Mark,
Thanks for reporting the issue and providing the reproducible test-case.
Yes, it does indeed seem to be a compiler issue. I checked it on a Hawaii XT card and observed the same behaviour. I'll file a bug report for this issue.
BTW, the result seems okay without passing the option ‘-cl-opt-disable’ for both win32 and x64 if the kernel is built for OpenCL 2.0 i.e. with build flag "-cl-std=CL2.0". Could you please check and confirm?
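For anyone wanting to switch between the two workarounds mentioned in this thread, here is a minimal sketch in C. The enum and the helper make_build_options are hypothetical names used only for illustration and are not taken from the attached test case. Only the two option strings come from this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical selector for the workarounds discussed in this thread. */
enum workaround {
    WORKAROUND_NONE,        /* default build, which shows the bug on Hawaii */
    WORKAROUND_OPT_DISABLE, /* disable compiler optimisations */
    WORKAROUND_CL20         /* build the kernel as OpenCL 2.0 */
};

/* Hypothetical helper: copies the build-options string for the chosen
   workaround into buf. Returns 0 on success, -1 if buf is too small. */
static int make_build_options(enum workaround w, char *buf, size_t len)
{
    const char *opts = "";

    assert(buf != NULL);
    switch (w) {
    case WORKAROUND_OPT_DISABLE: opts = "-cl-opt-disable"; break;
    case WORKAROUND_CL20:        opts = "-cl-std=CL2.0";   break;
    case WORKAROUND_NONE:        break;
    }
    if (strlen(opts) + 1 > len)
        return -1;          /* not enough room for the string and its NUL */
    strcpy(buf, opts);
    return 0;
}
```

The resulting string would then be passed as the options argument of clBuildProgram, for example clBuildProgram(program, 1, &device, buf, NULL, NULL), assuming program and device have already been created by the host code.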
Regards,
Hi Dipak,
Thanks for checking this and filing a bug report.
Yes, I can confirm that with the compiler build flag "-cl-std=CL2.0" the kernel runs correctly. I have also confirmed that my original kernel (from which this test case was derived) runs correctly with this option.
(Interestingly, though, my original kernel takes 78% longer with this option set. I have not had time yet to investigate whether that time can be recovered.)
Regards,
Mark.
Thanks for the confirmation.
Regards,