Hi,
I am new to AMD hardware.
Currently, I am trying to measure the number of reads and writes served by DRAM. On Intel processors, I was able to count reads and writes from the uncore counters (memory controller) using PAPI.
I am using a Zen 2 processor: AMD EPYC 7402 24-Core Processor.
Any pointers on counter names or tools would be great.
Thanks.
Monil.
University of Oregon.
Hello @monil01 ,
Have you tried using our uProf tool for this? It can be found here: https://developer.amd.com/amd-uprof/
Alternatively, the Processor Programming Reference (PPR) documents the relevant registers: https://developer.amd.com/wp-content/resources/55803_B0_PUB_0_91.pdf
Hi @Anonymous
Thanks a lot for suggesting these.
I have looked at both documents.
uProf is nice and provides function-wise counter data, which I need.
However, I am struggling to find the right events and to bypass uProf's sampling mode.
Basically, I have three questions about using this tool to measure traffic between the LLC and DRAM.
1. Can uProf work without sampling, counting every occurrence of the specified events rather than sampling them? For my work, I need the counts to be as accurate as possible.
2. Table 21, "Guidance for Common Performance Statistics with Complex Event Selects", in the Processor Programming Reference (PPR) lists some events that I would like to measure using uProf. So far, I have found that only some predefined PMC and IBS events can be measured with uProf. How can I measure the events described in the PPR?
3. Which event(s) should I measure to find the cache (LLC) and DRAM reads and writes? Basically, I need to count the load and store requests that actually reached DRAM because the cache could not serve them. (Note: on Intel processors, I was able to use the memory controller's uncore counters for this. Is there a similar strategy?)
Thanks in advance.
Monil.
PhD Student,
University of Oregon.
Hi @monil01 ,
1. uProf works in sampling mode. For counting mode, there is another tool, 'AMDuProfPcm', that comes with uProf and can be found in the bin/ directory. Details about AMDuProfPcm are available in the User Guide.
2. From the CLI, you can pass PMC events to the collect command with the '--event' option. From the GUI, you can select the PMC events under 'Custom Profile' in the 'Select Profile Type' tab.
3. To measure uncore events (LLC, DRAM), you can use the AMDuProfPcm tool.
Let us know if you need more details for any of the above.
Hi,
1. "To measure uncore events (LLC, DRAM), you can use AMDuProfPcm tool."
>> So far, I have found that I can measure some predefined PMC events (AMDuProfPcm -l). How can I measure the uncore counters/events, and how can I get the list of uncore events/counters?
2. "For counting mode, there is another tool 'AMDuProfPcm' comes with uProf"
>> Does AMDuProfPcm provide a function-wise profile for an application, like the CLI tool does?
For example, if I use the PCM tool on a vector multiplication application, I get the following:
$ sudo ../../bin/AMDuProfPcm -m l3 -s ./serial_vecmul
Summary Report:
CORE METRICS,(core 0)
Utilization (%),0.390085
Eff Freq,2830.761836
IPC (Sys + User),0.385817
CPI (Sys + User),2.591902
DC Access (pti),561.690363
L2 Access (pti),89.675084
L2 Access from IC Miss (pti),26.858736
L2 Access from DC Miss (pti),41.868931
L2 Access from HWPF (pti),19.088817
L2 Miss (pti),65.998836
L2 Miss from IC Miss (pti),20.385999
L2 Miss from DC Miss (pti),32.828191
L2 Miss from HWPF (pti),12.784646
L2 Hit (pti),18.859754
L2 Hit from IC Miss (pti),6.718347
L2 Hit from DC Miss (pti),5.534967
L2 Hit from HWPF (pti),6.304171
L3 METRICS,(ccx 0)
L3 Access (pti),41.097357
L3 Miss (pti),7.652235
Ave L3 Miss Latency,628.862097
Is it possible to get these function-wise, like the CLI tool? (The CLI tool provides function-wise results, but it works in sampling mode.)
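For anyone post-processing the summary report above, here is a minimal sketch of parsing its "name,value" lines into per-section dictionaries. It assumes the report always has the comma-separated layout shown in this post; the function name `parse_pcm_summary` and the derived L3 miss ratio are illustrative choices of mine, not part of AMDuProfPcm. The ratio is computed from the reported per-thousand-instruction metrics and is not a DRAM read/write count.

```python
# Sketch: parse an AMDuProfPcm-style summary report (layout assumed from the
# output pasted in this thread) into {section header: {metric name: value}}.

SAMPLE = """\
Summary Report:
CORE METRICS,(core 0)
Utilization (%),0.390085
IPC (Sys + User),0.385817
L2 Miss (pti),65.998836
L3 METRICS,(ccx 0)
L3 Access (pti),41.097357
L3 Miss (pti),7.652235
Ave L3 Miss Latency,628.862097
"""


def parse_pcm_summary(text):
    """Return {section: {metric: float}} for lines of the form 'name,value'."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if "," not in line:
            continue  # e.g. the "Summary Report:" title line or blank lines
        name, _, value = line.rpartition(",")
        try:
            number = float(value)
        except ValueError:
            # Non-numeric second field: treat as a section header,
            # e.g. "CORE METRICS,(core 0)".
            current = line
            sections[current] = {}
            continue
        if current is None:
            # Metric seen before any header (not expected in the sample).
            current = "UNSECTIONED"
            sections[current] = {}
        sections[current][name] = number
    return sections


report = parse_pcm_summary(SAMPLE)
l3 = report["L3 METRICS,(ccx 0)"]
# Hypothetical derived value: fraction of L3 accesses that missed.
l3_miss_ratio = l3["L3 Miss (pti)"] / l3["L3 Access (pti)"]
print(f"L3 miss ratio: {l3_miss_ratio:.4f}")
```

The same function can be applied to the full report text captured from the tool; section headers are kept verbatim as keys so that per-core and per-CCX blocks stay separate.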
Thanks.
Monil.
Hi @swarup @Anonymous,
Please have a look at the questions in my previous post.
I forgot to tag you.
Thanks in advance.
Monil.
Hi @monil01 ,
1. The AMDuProfPcm tool does not support passing an event directly. This will be enhanced in the upcoming release. Until then, you can modify the predefined conf files. For example, you can modify the Data/Config/SamplePcm_l3.conf file, adding or updating entries with events from the PPR, and then run L3 profiling.
2. Right now, function details are not collected in counting mode. We will take this as a feature request and plan to add this support in upcoming releases.
Hello,
I was wondering whether function details are now (in 2023) collected in counting mode for AMDuProfPcm. If not, are there any updates on whether this feature will be added?
Hi @hlanka3
AMDuProfPcm is a system analysis tool; hence, application/function-level data is not supported.