I am trying to write a C++ application that benchmarks a set of functions and collects power/energy information during their execution using the AMDPowerProfile API (part of uProf).
To avoid polluting the benchmark results with the PowerProfile driver constantly executing, I set the driver sampling period to a large value and only process the captured PowerProfile samples once the function being benchmarked exits. My first call to AMDTPwrReadAllEnabledCounters works as expected: it returns 12 samples with RecordIDs 1 through 12, and the elapsed time between consecutive samples matches the sampling period.
However, my second call to AMDTPwrReadAllEnabledCounters also returns 12 samples, but with RecordIDs 14 through 25. RecordID 13 appears to be missing, and the elapsed time between the samples with RecordIDs 12 and 14 is double what I would expect. The pattern continues: each subsequent call to AMDTPwrReadAllEnabledCounters skips one RecordID.
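A minimal sketch of a check that makes the gap explicit across reads. The names `SampleInfo`, `Gap`, and `findRecordGaps` are hypothetical and not part of the AMD API; the sketch assumes the record ID and elapsed time of each sample have already been copied out of whatever AMDTPwrReadAllEnabledCounters returned.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-sample fields of interest.
// Assumption: these values are copied from the samples returned by the API.
struct SampleInfo {
    std::uint64_t recordId;
    std::uint64_t elapsedMs;
};

// Describes a run of record IDs that never showed up.
struct Gap {
    std::uint64_t firstMissingId;
    std::uint64_t missingCount;
};

// Reports every hole in the record ID sequence of one batch.
// lastSeenId is the record ID of the final sample from the previous
// read (0 if nothing has been read yet), so holes that fall between
// two reads are caught as well.
std::vector<Gap> findRecordGaps(std::uint64_t lastSeenId,
                                const std::vector<SampleInfo>& batch) {
    std::vector<Gap> gaps;
    std::uint64_t prev = lastSeenId;
    for (const SampleInfo& s : batch) {
        if (s.recordId > prev + 1) {
            gaps.push_back({prev + 1, s.recordId - prev - 1});
        }
        prev = s.recordId;
    }
    return gaps;
}
```

Calling `findRecordGaps(12, secondBatch)` on a batch holding RecordIDs 14 through 25 reports a single gap starting at 13 with a count of 1, which is the pattern described above.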
Does anyone know of a workaround for this? I've already tried indexing one entry past the returned count in the returned array, in case the count had an off-by-one error.
I've put together a modified version of the CollectAllCounters example that demonstrates this behavior, and I've included the output I see when running it. The development platform is a Ryzen 2990WX on Ubuntu 18.04 LTS with AMD uProf 2.0.493. I recently tried upgrading to uProf 3.1.35, but the behavior persists.