The offline compiler converts bitselect to an AND/XOR sequence, while the just-in-time compiler (correctly) uses BFI instructions.
Is there any flag that can control this behavior?
AFAIK, CodeXL uses the same OpenCL compiler that comes with the driver itself. If the kernel build arguments are the same, it is expected to generate the same ISA code as the online build for a particular target device.
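For reference, here is a minimal sketch in plain host C (not OpenCL C, and not taken from either compiler's output) of why the two code sequences compute the same result. It assumes the OpenCL definition of bitselect(a, b, c), where each result bit comes from b if the corresponding bit of c is 1 and from a otherwise, and it assumes the AND/XOR expansion has the common form a ^ ((a ^ b) & c). The function names are hypothetical and only for illustration.

```c
#include <stdint.h>

/* Reference semantics of bitselect(a, b, c): take bits of b where the
   mask c is 1, bits of a where it is 0. A single bitfield-insert (BFI)
   style instruction can compute this in one operation. */
static uint32_t bitselect_ref(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Assumed AND/XOR expansion: same result, but built from two XORs and
   one AND instead of a single instruction. */
static uint32_t bitselect_and_xor(uint32_t a, uint32_t b, uint32_t c)
{
    return a ^ ((a ^ b) & c);
}

/* Small xorshift generator, used only to produce test inputs. */
static uint32_t next_input(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns how many of n pseudo-random input triples give different
   results for the two forms (expected: 0). */
static int count_mismatches(int n)
{
    uint32_t state = 0x12345678u;
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        uint32_t a = next_input(&state);
        uint32_t b = next_input(&state);
        uint32_t c = next_input(&state);
        if (bitselect_ref(a, b, c) != bitselect_and_xor(a, b, c)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

So the difference should only affect instruction count, not the results the kernel produces.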
I'll investigate this further. I compiled online, offline with CodeXL, and with Radeon GPU Analyzer and got 3 different results.
I like this Radeon GPU Analyzer. Is it possible to force 32-bit compilation?
1. Radeon GPU Analyzer's GUI uses AMD's LLVM-based Lightning Compiler which is different than the compiler that AMD's OpenCL runtime uses on Windows at the moment.
2. When working with the command line, please note that there are 2 different compiler toolchains. For -s cl, the runtime compiler would be used, while for -s rocm-cl the offline Lightning Compiler would be used.
3. To force 32-bit compilation, please try the x86 build of RGA. It should load and use the 32-bit runtime compiler. Here is a download link:
Thanks for the answer,
I was really interested in this LLVM-based Lightning Compiler. I tried to create a binary file, but I did not manage to load it. Whatever I do, I always get the same error: "The binary is incorrect or incomplete. Finalization to ISA could not be performed."
I use RGA on Windows, and I found several posts on the net where users say that these binaries are in a format that can only be used on Linux. To be honest, I do not understand why the GUI is available on Windows if the result cannot be used on the same operating system.
Is there any workaround or something?
A similar question has been answered here, on the RGA GitHub issues page.
Since RGA is cross-platform, you can develop with it either on Linux or on Windows, although, as mentioned in the thread above, the generated binaries can only run on ROCm-capable Linux machines. As a reminder, you are asking about the ROCm OpenCL mode (-s rocm-cl), which is different than the legacy OpenCL mode (-s cl) that uses the live driver.
For further questions about RGA, kindly post them on the RGA GitHub issues page.