Well, the title already describes it.
I have code that uses 64k of LDS on a Radeon VII and an RX 5700. The work-group size is 1024.
It works fine on Ubuntu 16.04 and 18.04 using amdgpu-pro 18.50, 19.30 and ROCm 2.10 (all on the VII), and on another system using 19.30 with the RX 5700.
Unfortunately it does not work on the first test system (VII) when booted into Windows 10 with Adrenalin 19.10.1 WHQL. The code compiles fine, but once the kernel is queued it fails with a CL_OUT_OF_RESOURCES error. I doubt it is the compiler, since my impression is that Linux 19.30 and Adrenalin 19.10.1 produce more or less binary-compatible kernels.
A variant that uses only 32k of shared memory, with the remaining shared-memory operations moved to global memory, does work on all systems, but it is super slow. Unfortunately some of my clients run Windows, so I wonder how to get this to work with the Adrenalin runtime, especially since the ISA documentation for Vega states that the full 64k are available.
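To narrow down where the limit comes from, one option is to compare what each runtime reports for the device and for the compiled kernel. The following is only a minimal sketch in C of that comparison, not a confirmed diagnosis. The struct and function names (`launch_limits`, `check_launch`, `launch_check_str`) are hypothetical and made up for illustration. The three input values are assumed to come from the standard OpenCL queries named in the comments, and the sketch itself does not call OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for values a host program would query:
 *   device_local_mem <- clGetDeviceInfo(dev, CL_DEVICE_LOCAL_MEM_SIZE, ...)
 *   kernel_local_mem <- clGetKernelWorkGroupInfo(k, dev, CL_KERNEL_LOCAL_MEM_SIZE, ...)
 *   kernel_max_wg    <- clGetKernelWorkGroupInfo(k, dev, CL_KERNEL_WORK_GROUP_SIZE, ...)
 */
typedef struct {
    unsigned long long device_local_mem; /* bytes of local memory the runtime exposes */
    unsigned long long kernel_local_mem; /* bytes of local memory the kernel needs    */
    size_t kernel_max_wg;                /* largest work-group size for this kernel   */
} launch_limits;

typedef enum {
    LAUNCH_OK = 0,
    LAUNCH_LDS_TOO_LARGE,
    LAUNCH_WG_TOO_LARGE
} launch_check;

/* Compare the kernel's needs with what the runtime reports. */
launch_check check_launch(const launch_limits *lim, size_t requested_wg)
{
    assert(lim != NULL);
    if (lim->kernel_local_mem > lim->device_local_mem)
        return LAUNCH_LDS_TOO_LARGE;
    if (requested_wg > lim->kernel_max_wg)
        return LAUNCH_WG_TOO_LARGE;
    return LAUNCH_OK;
}

/* Human-readable form of the result, for logging. */
const char *launch_check_str(launch_check c)
{
    switch (c) {
    case LAUNCH_OK:            return "launch parameters fit the reported limits";
    case LAUNCH_LDS_TOO_LARGE: return "kernel needs more local memory than the device reports";
    case LAUNCH_WG_TOO_LARGE:  return "requested work-group size exceeds the kernel limit";
    }
    return "unknown";
}
```

Running the same check with the values reported on Linux and on Windows would show whether the two runtimes disagree on the local memory size or on the per-kernel work-group limit. If both checks pass on Windows and the enqueue still fails, the cause lies elsewhere.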
Additionally, I wonder whether there is any documentation on the runtime environment variables the AMD drivers understand. Concretely, I am looking for options to switch between WAVE32 / WAVE64 and WGP / CU mode on Navi.
Thanks in advance