As the subject states, I want to know if it is possible to use AMDuProf to profile cache miss % for L1, L2, and L3 (Local vs Remote if possible). I saw that AMDuProfCLI has some predefined events, but is there any way to get these specific metrics through AMDuProfCLI, or is it better to just use AMDuProfPcm? I am working strictly through a CLI and am very new to this area. Any advice would be much appreciated.
Thank you!
Hello @hlanka3
Thank you for writing to Server gurus.
We are currently investigating this and will reach out to you asap.
Additionally, I was wondering whether AMDuProfPcm can give function-related information along with its counters. Furthermore, when generating a report with AMDuProfCLI, is there any way to get the total count of a certain event? For example, is there any way to get the total number of L1_DC_MISSES, instead of the PTI value with 4 decimal places that appears in the report CSV?
Hi @hlanka3
To get the total number of any event instead of the decimal representation, you can modify the expression in the config files as below.
Go to the AMDuProf/bin/Data/Config folder.
To find the config file name,
run lscpu and note the CPU family and Model number.
Example :
CPU family: 25
Model: 17
Convert these values to hex :
CPU Family : 0x19
Model : 0x11
So the config file name to modify would be : 0x19_0x11
There are other suffixes available too; please choose according to your system.
If the model number has 2 digits after hex conversion, ignore the last digit.
Ex : for 0x19_0x11, look for a config file with name 0x19-0x1
If the model number is a single digit and no matching config file is present, PCM is not supported on that model.
Open the appropriate config file.
Find the metric you want to modify.
Ex. For the metric All DC Fills (pti), to get the total count instead of the per-thousand-instructions value, change
<metric name="All DC Fills (pti)" expression="$AllDCFills * 1000 / $IRPerfDC"> </metric>
to:
<metric name="All DC Fills (pti)" expression="$AllDCFills"> </metric>
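The file-name lookup in the steps above (decimal values from lscpu, converted to hex, with the last model digit dropped when there are two) can be sketched in Python. This is only an illustration: the function name config_name_parts is hypothetical, and the sketch stops at the hex parts rather than building a full file name, since the exact separator and suffix depend on the files present in your Config folder.

```python
def config_name_parts(family, model):
    """Return (family_hex, model_hex, model_prefix) for decimal lscpu values.

    Hypothetical helper: it only applies the hex conversion and the
    "drop the last digit of a 2-digit model" rule described in the post.
    """
    family_hex = hex(family)   # e.g. 25 -> "0x19"
    model_hex = hex(model)     # e.g. 17 -> "0x11"
    digits = model_hex[2:]
    # Two hex digits: ignore the last one; a single digit is kept as-is.
    model_prefix = "0x" + digits[:-1] if len(digits) == 2 else model_hex
    return family_hex, model_hex, model_prefix


# Values from the example in the post (CPU family 25, Model 17).
print(config_name_parts(25, 17))
```

For the example values this gives 0x19, 0x11 and the shortened model 0x1, which are the parts to look for in the config file names.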
And this appears directly when profiling with AMDuProfCLI, correct?
Hi @hlanka3
The method described above for formatting data applies to AMDuProfPcm, not to AMDuProfCLI. Formatting is not available for AMDuProfCLI.
For AMDuProfCLI, event counts can be shown using "--show-event-count" in the report command (Section 6.5.1 of the AMDuProf User Guide).
However, this option shows an approximate event count, because AMDuProfCLI is based mainly on sampling-mode collection.
For exact counts, we suggest using AMDuProfPcm.
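To illustrate why a sampling-based count is approximate: a sampling profiler typically records one sample every N occurrences of an event, so a total can only be estimated as samples times the sampling interval. The sketch below shows that general arithmetic only. It is not taken from the AMDuProf documentation, and the function name and the numbers are hypothetical.

```python
def estimate_event_count(samples, sampling_interval):
    """Estimate a total event count from a sample count.

    Assumption (general sampling arithmetic, not AMDuProf-specific):
    one sample is taken every `sampling_interval` occurrences of the event,
    so occurrences after the last sample are not seen at all.
    """
    return samples * sampling_interval


# Hypothetical numbers: 1200 samples at an interval of 250000 events.
print(estimate_event_count(1200, 250000))
```

The estimate can be off by up to one interval per sampled thread, which is one reason counting mode is the better choice for exact totals.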
Hi @hlanka3
There are too many events to report, so for now we report only a selected few.
To customize the events reported by PCM, follow the steps below:
Go to the AMDuProf/bin/Data/Config folder.
To find the config file name,
run lscpu and note the CPU family and Model number.
Example :
CPU family: 25
Model: 17
Convert these values to hex :
CPU Family : 0x19
Model : 0x11
So the config file name to modify would be : 0x19_0x11
If the model number has 2 digits after hex conversion, ignore the last digit.
Ex : for 0x19_0x11, look for a config file with name 0x19-0x1
If the model number is a single digit and no matching config file is present, PCM is not supported on that model.
Open the appropriate config file.
Although events can be customized, the number of events in one block cannot be greater than 6.
If required, create a new section : <core> --- </core>
Find the parameter you need and modify the metric section.
Ex : L2 data is reported as below :
<metric name="L2 Access (pti)" expression="(($L2AccessWithoutPF + $L2PFHitinL2 + $L2PFMissL2HitinL3 + $L2PFMissL2L3) * 1000) / $IRPerfL2"> </metric>
<metric name="L2 Miss (pti)" expression="(($L2Miss * 1000 / $IRPerfL22) + ($L2PFMissinL2 * 1000 / $IRPerfL2))"> </metric>
If you need the cache miss %, you can add the intermediate metrics L2Accesspti and L2Misspti as below :
*** Note the use of "$" before the metric names.
*** Ensure metric names have no special characters.
<metric name="$L2Accesspti" expression="(($L2AccessWithoutPF + $L2PFHitinL2 + $L2PFMissL2HitinL3 + $L2PFMissL2L3) * 1000) / $IRPerfL2"> </metric>
<metric name="$L2Misspti" expression="(($L2Miss * 1000 / $IRPerfL22) + ($L2PFMissinL2 * 1000 / $IRPerfL2))"> </metric>
<metric name="L2 Miss %" expression="($L2Misspti * 100 ) / $L2Accesspti"> </metric>
Similarly you can check for L1 and customize the data for whichever cache you are looking for.
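As a sanity check on the expressions above, the same arithmetic can be reproduced outside the tool. The sketch below mirrors the L2 Access (pti), L2 Miss (pti) and L2 Miss % expressions in Python. The counter names are copied from the config lines in this post, but the function name and all counter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def l2_miss_percent(c):
    """Mirror the L2 metric expressions using raw counter values in dict c.

    Returns (access_pti, miss_pti, miss_percent), where "pti" means
    per thousand instructions.
    """
    access_pti = ((c["L2AccessWithoutPF"] + c["L2PFHitinL2"]
                   + c["L2PFMissL2HitinL3"] + c["L2PFMissL2L3"]) * 1000) / c["IRPerfL2"]
    miss_pti = ((c["L2Miss"] * 1000 / c["IRPerfL22"])
                + (c["L2PFMissinL2"] * 1000 / c["IRPerfL2"]))
    miss_percent = miss_pti * 100 / access_pti
    return access_pti, miss_pti, miss_percent


# Hypothetical raw counts, not measured data.
counters = {
    "L2AccessWithoutPF": 650,
    "L2PFHitinL2": 100,
    "L2PFMissL2HitinL3": 150,
    "L2PFMissL2L3": 100,
    "L2Miss": 250,
    "L2PFMissinL2": 250,
    "IRPerfL2": 1_000_000,
    "IRPerfL22": 1_000_000,
}
print(l2_miss_percent(counters))
```

With these made-up values there is 1.0 access and 0.5 misses per thousand instructions, so the miss rate is 50%. When both instruction counters are equal, as here, the per-thousand scaling cancels out and the percentage is simply misses divided by accesses.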
OK, thank you so much for the reply. Additionally, is there any way to get L3 cache related information from AMDuProfCLI? Furthermore, is there any way to get function-level information with AMDuProfPcm?
Hi @hlanka3
You can run cache analysis with AMDuProfCLI to analyze false sharing.
Please refer to section 7.7.3, "Cache analysis using CLI", in the AMDuProf User Guide.
Function-level information is not supported in AMDuProfPcm.
Is there any way to generate a report that only has the top n hottest functions, processes, threads, etc. when using the memory config in AMDuProfCLI?
I ask because I have been generating a report with the memory config for well over 24 hours now. The report generation is definitely progressing, but at a very slow rate. It is currently at around 78 MB. If there is a way to skip the Shared Data Cacheline part of the report, that would help.
Hi @hlanka3
Slow report generation with the memory config is a known issue.
We will take this up as a bug report.
Hi @hlanka3
Can you please share the AMDuProfCLI collect and report commands you used?
AMDuProfCLI collect --config memory -o ./output-dir python3 script.py
AMDuProfCLI report --cutoff 20 -i ./output-dir/<SESSION-DIR>
Hi @hlanka3
The issue was fixed in the latest version of uProf. Please download it from the link below and let us know whether your issue is resolved.
https://www.amd.com/en/developer/uprof/uprof-eula/uprof-5-0-eula.html?filename=AMDuProf_Linux_x64_5....
Thanks & Regards
Ajay Ratnam