I've been working on adding OpenCL support to our code generator (GeNN, genn-team/genn on GitHub: a GPU-enhanced Neuronal Network simulation environment based on code generation). The generated code now works on NVIDIA, Intel and ARM devices, but we've had ongoing issues getting it to work on AMD GPUs. With new enough Adrenalin drivers and a relatively modern GCN GPU, all now seems good on Windows. However, in order to do some more rigorous testing in-house, we have bought a Radeon RX 5700 XT for one of our Linux machines and, using the AMDGPU-PRO 20.30 drivers, we're seeing similar broken behavior there.
I have reduced the problem to the attached minimal reproduction which, on NVIDIA and Intel hardware, prints out:
However, on the 5700 XT, it prints out:
As with our previous issue, adding a printf to any kernel makes it work correctly. Interspersing commandQueue.flush() calls between kernel launches has the same effect; however, from my understanding of the OpenCL spec, this should not be necessary.
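For reference, the flush workaround can be sketched roughly as below. This is only an illustration, not the code from the attached reproduction: enqueueAll is a hypothetical helper name, and Queue stands for any type with a flush() member (cl::CommandQueue from the OpenCL C++ bindings being the intended one). It is written as a template so it can be toggled on and off to compare behavior between drivers.

```cpp
#include <functional>
#include <vector>

// Hypothetical helper (not part of GeNN or the OpenCL C++ bindings).
// Runs each enqueue step in order and, if flushBetween is true, calls
// queue.flush() after every step. Queue is assumed to be a type with a
// flush() member function, such as cl::CommandQueue.
template<typename Queue>
void enqueueAll(Queue &queue,
                const std::vector<std::function<void(Queue&)>> &steps,
                bool flushBetween)
{
    for(const auto &step : steps) {
        // Each step is expected to enqueue one kernel on the queue
        step(queue);

        // Workaround under test: submit the queued command immediately
        if(flushBetween) {
            queue.flush();
        }
    }
}

// Possible usage with the OpenCL C++ bindings (kernel and range names
// here are placeholders):
//   enqueueAll<cl::CommandQueue>(commandQueue, {
//       [&](cl::CommandQueue &q) { q.enqueueNDRangeKernel(kernelA, cl::NullRange, globalSize, localSize); },
//       [&](cl::CommandQueue &q) { q.enqueueNDRangeKernel(kernelB, cl::NullRange, globalSize, localSize); }
//   }, true);
```

With flushBetween set to false this should be equivalent to plain back-to-back launches on an in-order queue, which is the case that misbehaves for us on the 5700 XT.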