I've been working on adding OpenCL support to our code generator, GeNN (genn-team/genn on GitHub). The generated code now works on NVIDIA and Intel devices, but on AMD devices it fails rather mysteriously.
I have reduced the attached on-GPU initialization code to a minimal reproduction case. If the printf in the kernel is commented out, initializeKernel correctly initializes d_xPost and its contents are displayed correctly. If the printf is left in, NaNs are printed instead.
We have reproduced this issue on an iMac with the following clinfo:
Platform Name Apple
Platform Vendor Apple
Platform Version OpenCL 1.2 (Apr 18 2019 20:03:31)
Device Name AMD Radeon Pro 570 Compute Engine
Device Vendor AMD
Device Vendor ID 0x1021c00
Device Version OpenCL 1.2
Driver Version 1.2 (Jan 23 2020 07:52:24)
We have also reproduced it on a Windows machine with an RX 580 and the 3004.8 drivers.
Am I doing something stupid, or is this a compiler bug? If it is a compiler bug, any workarounds would be much appreciated.
Thanks in advance for your help.