Hi youwei,
The current driver releases have an issue with collecting performance counters on Fury GPUs. The issue has been fixed internally, but I'm not sure when a driver containing the fix will be publicly available. Hopefully it will be in the next driver release.
Also, with regard to multi-pass profiling: this happens under the hood, and it typically won't be visible to you unless it is not working properly. For most OpenCL kernels, the profiler can capture a kernel's input and output buffers, then save and restore their contents each time it replays the kernel. If a kernel uses SVM or takes a pipe as an argument, the profiler cannot replay the kernel, so it can only collect the counters that can be queried in a single pass.
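To make the save/restore idea a bit more concrete, here is a rough sketch in Python. This is only an illustration of the general technique, not the profiler's actual code: the names profile_kernel, counter_passes, and replayable are all hypothetical, plain lists stand in for OpenCL buffers, and the counter names are placeholders.

```python
import copy


def profile_kernel(kernel, buffers, counter_passes, replayable=True):
    """Run `kernel` once per counter group, restoring buffers before each replay.

    Hypothetical illustration only. Returns a dict mapping each counter
    name to the index of the pass in which it was collected.
    """
    if not replayable:
        # Assumption for this sketch: a kernel using SVM or a pipe argument
        # cannot be replayed, so only the first counter group is collected.
        counter_passes = counter_passes[:1]

    # Save the buffer contents before the first run.
    snapshot = copy.deepcopy(buffers)

    collected = {}
    for pass_index, counters in enumerate(counter_passes):
        if pass_index > 0:
            # Restore the original contents in place so that every replay
            # sees the same inputs as the first run.
            for name, data in snapshot.items():
                buffers[name][:] = data
        kernel(buffers)
        for counter in counters:
            collected[counter] = pass_index
    return collected


def increment(buffers):
    """Stand-in for a kernel that modifies its buffer in place."""
    buffers["data"][:] = [x + 1 for x in buffers["data"]]
```

With three counter groups, the stand-in kernel runs three times, but because the buffer is restored before each replay, the final contents look as if it ran once. That is why you normally don't notice the extra passes.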
Thanks,
Chris