I am planning to ship software that uses binary kernels with inline asm. I therefore decided to go with LLVM-based offline compilation, since the built-in PAL compiler cannot handle this. It is important for me to support as many platforms as possible. I would love to target only ROCm, because I believe it is the more mature platform and the way of the future, but unfortunately I have a lot of Windows clients, or clients that connect their cards over PCIE-2, so for them I need to provide PAL builds.
My problem is: LLVM seems to build the kernels for ROCm correctly when using
/tmp/kernel-d4a333.s: Assembler messages:
/tmp/kernel-d4a333.s:2: Error: unknown pseudo-op: `.hsa_code_object_isa'
/tmp/kernel-d4a333.s:7: Error: Keine solche Anweisung: »s_getpc_b64 s[36:37]«
/tmp/kernel-d4a333.s:8: Error: Keine solche Anweisung: »s_mov_b32 s36,s0«
/tmp/kernel-d4a333.s:9: Error: Keine solche Anweisung: »s_load_dwordx4 s[36:39],s[36:37],0x0«
...
Here "Keine solche Anweisung" translates roughly to "No such instruction". If I instead ask the compiler to output LLVM IR, that seems to work, but I am not sure whether the result can be consumed as a binary kernel (I am running on ROCm myself). Note that this also happens for a kernel that would normally compile in amdpal, i.e. one without special instructions or inline asm. Also, the kernel starts with the line #include "opencl-c.h"; otherwise it would not compile for ROCm.
So I wonder what I am doing wrong, or whether there is a better way to achieve my goal. Any instructions on how to do this would be appreciated. Thanks.
Edit / Remark: The LLVM version is the one from the Ubuntu 18.04 repos, in case this matters.