Hi,
I encountered some unexpected results when using the Core::ExRetCops performance counter (PMCx0C1, counting "Retired Uops" according to the Processor Programming Reference (PPR) for AMD Family 17h Models 00h-0Fh Processors [1]) on my Zen+ Ryzen 5 2600X CPU.
For simple instructions with a memory operand, such as "add rbx, qword ptr [rcx + 42]", this performance counter (read using nanoBench [2]) reports one micro op.
Table 1 of the Software Optimization Guide for AMD Family 17h Processors [3] however states that such instructions are implemented with two micro ops, one for the addition and one for the load.
So either I have misunderstood the documentation, or one of the two documents is wrong. One hint as to which: the Zen 3 PPR [4] documents this same performance counter as counting macro-ops rather than micro ops.
Is this a known error in the Zen(+) PPR, and if so, is there another mechanism to count the number of issued/executed/retired micro ops?
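In case it helps anyone reproduce this without nanoBench, here is a minimal sketch of how the event select PMCx0C1 would map onto a raw PERF_CTL value. The bit positions (event select in bits 7:0 and 35:32, unit mask in bits 15:8, user/OS mode in bits 16/17, enable in bit 22) are my assumption of the usual core PERF_CTL layout and should be checked against the PPR. The function name perf_ctl_value is made up for illustration and is not part of nanoBench or any other tool.

```python
# Sketch only: builds a raw PERF_CTL value for an AMD core PMC event.
# The bit layout below is an assumption to be verified against the PPR [1].

def perf_ctl_value(event, unit_mask=0, user=True, kernel=True, enable=True):
    """Return the PERF_CTL encoding for a 12-bit event select (hypothetical helper)."""
    if not 0 <= event <= 0xFFF:
        raise ValueError("event select must fit in 12 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")

    value = event & 0xFF                   # EventSelect[7:0]  -> bits 7:0
    value |= unit_mask << 8                # UnitMask          -> bits 15:8
    value |= int(user) << 16               # count in user mode
    value |= int(kernel) << 17             # count in OS (kernel) mode
    value |= int(enable) << 22             # counter enable
    value |= ((event >> 8) & 0xF) << 32    # EventSelect[11:8] -> bits 35:32
    return value


if __name__ == "__main__":
    # PMCx0C1 with unit mask 0, counting in both user and kernel mode
    print(hex(perf_ctl_value(0x0C1)))
```

This only shows which event is being programmed. It says nothing about whether the hardware counts micro ops or macro-ops for that event, which is the actual question.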
Thank you for your help!
---
[1] https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip
[2] https://github.com/andreas-abel/nanoBench
[3] https://amd.wpenginepowered.com/wordpress/media/2013/12/55723_3_00.ZIP
[4] https://www.amd.com/system/files/TechDocs/56214-B0-PUB.zip