Our application compiles OpenCL kernels at runtime, and we're finding that customers with Navi-based GPUs are seeing extremely slow kernel compilation times (in clBuildProgram).
The following are the first 5 kernels that we compile, taken from a log file. First NVIDIA, then AMD.
Compilation times with NVIDIA RTX 2080 SUPER installed (latest drivers, 461.40):
[OpenCL] 4149553730. Begin kernel compilation
[OpenCL] 4149553730. Kernel compiled in 170.4304ms
[OpenCL] 3026476454. Begin kernel compilation
[OpenCL] 3026476454. Kernel compiled in 176.2081ms
[OpenCL] 38168626. Begin kernel compilation
[OpenCL] 38168626. Kernel compiled in 175.3464ms
[OpenCL] 3157765234. Begin kernel compilation
[OpenCL] 3157765234. Kernel compiled in 154.0611ms
[OpenCL] 1764890686. Begin kernel compilation
[OpenCL] 1764890686. Kernel compiled in 176.8117ms
Compilation times with AMD Radeon RX 5700 XT (latest drivers, 21.2.2):
[OpenCL] 4149553730. Begin kernel compilation
[OpenCL] 4149553730. Kernel compiled in 1395.8775ms
[OpenCL] 3026476454. Begin kernel compilation
[OpenCL] 3026476454. Kernel compiled in 1322.6756ms
[OpenCL] 38168626. Begin kernel compilation
[OpenCL] 38168626. Kernel compiled in 1292.6209ms
[OpenCL] 3157765234. Begin kernel compilation
[OpenCL] 3157765234. Kernel compiled in 1306.0073ms
[OpenCL] 1764890686. Begin kernel compilation
[OpenCL] 1764890686. Kernel compiled in 1336.2341ms
As you can see, the AMD driver is compiling roughly 8 times slower than NVIDIA's (about 1.3 s per kernel versus about 0.17 s). I should also point out that GPUs with non-Navi architectures don't seem to be affected.
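For anyone who wants to check the slowdown against their own logs, here is a minimal Python sketch. It assumes the log format shown above ("[OpenCL] <id>. Kernel compiled in <time>ms"); the helper names compile_times and mean_slowdown are made up for illustration and are not part of our application.

```python
import re
from statistics import mean

# Matches lines such as "[OpenCL] 4149553730. Kernel compiled in 170.4304ms".
# The format is assumed from the log excerpts in this post.
COMPILED = re.compile(r"\[OpenCL\] (\d+)\. Kernel compiled in ([\d.]+)ms")

def compile_times(log_text):
    """Return {kernel_id: milliseconds} for every 'Kernel compiled' line."""
    return {m.group(1): float(m.group(2)) for m in COMPILED.finditer(log_text)}

def mean_slowdown(baseline_log, other_log):
    """Average per-kernel ratio other/baseline over kernels present in both logs."""
    base = compile_times(baseline_log)
    other = compile_times(other_log)
    shared = base.keys() & other.keys()
    return mean(other[k] / base[k] for k in shared)

# "Kernel compiled" lines copied from the two logs above.
NVIDIA_LOG = """
[OpenCL] 4149553730. Kernel compiled in 170.4304ms
[OpenCL] 3026476454. Kernel compiled in 176.2081ms
[OpenCL] 38168626. Kernel compiled in 175.3464ms
[OpenCL] 3157765234. Kernel compiled in 154.0611ms
[OpenCL] 1764890686. Kernel compiled in 176.8117ms
"""

AMD_LOG = """
[OpenCL] 4149553730. Kernel compiled in 1395.8775ms
[OpenCL] 3026476454. Kernel compiled in 1322.6756ms
[OpenCL] 38168626. Kernel compiled in 1292.6209ms
[OpenCL] 3157765234. Kernel compiled in 1306.0073ms
[OpenCL] 1764890686. Kernel compiled in 1336.2341ms
"""

if __name__ == "__main__":
    print(f"Mean slowdown: {mean_slowdown(NVIDIA_LOG, AMD_LOG):.1f}x")
```

Kernels are matched by the ID in each line, so the comparison still works if the two logs list kernels in a different order.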
Hi @MarkIngramUK,
Thank you for reporting this. I've moved the post to the OpenCL forum. I've also whitelisted you for the "AMD Devgurus" community.
Please provide a minimal reproducible test case and attach the clinfo output.
Thanks.
A minimal example is available here: https://github.com/MarkIngramUK/ocl-compile-benchmark
Let me know if you need anything further.
Thank you for providing the repro. We'll look into this and get back to you.