09-05-2018 01:22 PM
How can I tune the performance of the ROCm (LLVM) compiler?
I modified LLVM (roc-1.6.x) slightly so that it generates code that can run on the AMDGPU-PRO driver. The code runs, but for the same OpenCL source it is more than 10% slower than the code produced by AMDGPU's online compiler. Are there any flags I can set to tune LLVM's output? Some examples would be a great help.
Labels:
- OCL Performance and Benchmark
0 Replies
