Hello,
I am currently working with OpenMP offloading to an MI250 GPU. The compiler is the most recent AMD version of clang, amdclang++ (clang-15), as bundled with ROCm 5.4.3. The code is written in C++ and compiles and runs fine.
When I looked into profiling options, I found rocprof and uProf. rocprof is somewhat hard to use, requires many application runs to collect all the metrics I want, and does not provide a UI, so I looked into uProf. It should support the MI250, but when I invoke it with
AMDuProfCLI collect --output-dir uProf --config tbp --omp --trace gpu ./matmul
I get a warning saying
Tool lib "/opt/AMDuProf_4.0-341/bin/ProfileAgents/x64/libAMDGpuAgent.so" failed to load.
and the result contains no real GPU trace or profiling information. The program itself runs fine; uProf only fails to collect profiling data for the GPU. What could be the cause of this?
As a side question: which profiling tools could be used for my use case? Are rocprof and uProf the only options?