Server Processors

jsomarri
Adept I

Does the new server CPU profiler uProf support event L3PMCx06 [L3 Miss]?

Team, I'm trying to profile an application on my EPYC, and it would be really useful to have L3 miss data for it. When I try to capture this counter I get an error, and the documentation does not explain how to collect it.

I'm using the following command to collect the data:

./AMDCpuProfiler collect -a -c 0-31 -e event=0x06,interval=5000 -d 20 -o /tmp/out

Error:

The configuration erroneously contains an event (0x6) which is not
available on this system.  Please choose a different
configuration.

Any idea how to collect this counter?

Thanks a lot.

1 Solution
swarup
Staff

Please use uProf v1.1 and run AMDuProfCLI instead of AMDCpuProfiler, which is deprecated in v1.1.

Use event=0xb006 for 'L3 Miss' event.

The list of supported events can be displayed with the command: ./AMDuProfCLI collect --list cpu-events

You need to combine the L3 event with TBP (time-based profiling). Here is a sample command to collect "L3 miss" events.

To collect L3 Miss samples in SWP mode:

./AMDuProfCLI collect -e event=timer,interval=1 -e event=0xb006,umask=0x01,slicemask=0xF,threadmask=0xFF -a -d 10 -o /tmp/out

I'm trying to do something similar: profile my application for L3 caching issues on my Threadripper 1950X CPU on Windows 10.

Using AMDuProfCLI, the sample command you gave seems to work, but I can't seem to inspect or analyze the collected samples in any useful way.

The only way I can tell anything was collected is to inspect the generated .CSV file which has a section that looks like:

L3/DF PROFILE REPORT
TimeStamp,L3 miss(CCX0),L3 miss(CCX1),L3 miss(CCX2),L3 miss(CCX3)
0:0:0:407.550,,,,1904
0:0:10:410.141,414,,,
0:0:10:410.164,472,,,
0:0:10:410.182,467,,,
0:0:10:410.198,3061,,,
0:0:10:410.214,1227,,,

Is there any way to connect these events back to specific threads and instructions?
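Until per-thread attribution is available, one thing that can be done with a report like the one above is to total the counts per CCX. The following is only a minimal Python sketch, not part of uProf: the function name `l3_miss_totals` is made up for illustration, and it assumes the section has exactly the layout shown here (a header row starting with `TimeStamp,`, integer counts, and empty cells where a CCX reported nothing).

```python
import csv
import io


def l3_miss_totals(report_text):
    """Sum the per-CCX counts in an 'L3/DF PROFILE REPORT' section.

    Assumes the layout pasted in this thread; other uProf versions may differ.
    """
    # Drop blank lines so the text parses the same with or without them.
    lines = [ln.strip() for ln in report_text.splitlines() if ln.strip()]
    # The table starts at the header row.
    start = next(i for i, ln in enumerate(lines) if ln.startswith("TimeStamp,"))
    reader = csv.reader(io.StringIO("\n".join(lines[start:])))
    header = next(reader)
    totals = {name: 0 for name in header[1:]}
    for row in reader:
        if len(row) != len(header):
            break  # assumption: a row of a different shape ends the section
        for name, cell in zip(header[1:], row[1:]):
            if cell:  # empty cell means no count for that CCX at this timestamp
                totals[name] += int(cell)
    return totals


sample = """L3/DF PROFILE REPORT
TimeStamp,L3 miss(CCX0),L3 miss(CCX1),L3 miss(CCX2),L3 miss(CCX3)
0:0:0:407.550,,,,1904
0:0:10:410.141,414,,,
0:0:10:410.164,472,,,
0:0:10:410.182,467,,,
0:0:10:410.198,3061,,,
0:0:10:410.214,1227,,,
"""

for name, total in l3_miss_totals(sample).items():
    print(name, total)
```

This only aggregates what the report already contains; it does not recover which thread or instruction caused a miss.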

As of now, L3 events are reported only chronologically. On family 0x17 processors with SMT enabled, 8 threads share a single L3 cache within a CCX. The existing L3 events don't attribute counts to (software) threads or instructions, but you can restrict them to a specific (hardware) thread or core. Each bit in threadmask corresponds to a core within the CCX; when threadmask is set to 0xFF, L3 events are collected for all threads. You can set threadmask to a specific core and set your application's affinity to that core. This way you might get some useful information about L3 events.
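To make the threadmask suggestion concrete, here is a rough Python sketch that builds the `-e` argument for a chosen set of mask bits. The helper names `threadmask` and `l3_miss_event` are hypothetical, the other fields are copied from the sample command earlier in this thread, and which hardware thread or core each bit actually selects is not spelled out here, so treat the bit numbering as an assumption and check it against the OSRR document.

```python
def threadmask(bits):
    """Return an 8-bit mask with the given bit positions (0-7) set."""
    mask = 0
    for b in bits:
        if not 0 <= b <= 7:
            raise ValueError(f"bit {b} is outside the 8-bit threadmask")
        mask |= 1 << b
    return mask


def l3_miss_event(bits, slicemask=0xF):
    """Format the L3 miss event spec used in the sample command.

    Assumption: bit i of threadmask selects the i-th thread/core of the CCX.
    """
    return (
        "event=0xb006,umask=0x01,"
        f"slicemask=0x{slicemask:X},threadmask=0x{threadmask(bits):02X}"
    )


print(l3_miss_event(range(8)))  # all eight bits set, as in the sample command
print(l3_miss_event([2]))       # only bit 2 set
```

The resulting string would go after `-e` in place of the original event spec. Pinning the application is a separate, OS-specific step, for example `taskset -c` on Linux or `start /affinity` on Windows.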

Thanks Swarup, it does work! Do you know if there is any way I can collect memory bandwidth data while I run my application?

Memory bandwidth profiling is not yet supported by uProf.

Yes, this helped with my problem too.
Swarup, can you give me the events for L1 and L2 hit/miss?

Refer to the "Open-Source Register Reference for AMD Family 17h Processors" (OSRR) document at https://developer.amd.com/resources/developer-guides-manuals/

The following events may be what you are looking for:

PMCx040 [Data Cache Accesses]

PMCx043 [Data Cache Refills from System]

PMCx045 [L1 DTLB Miss]

PMCx060 [Requests to L2 Group1]

PMCx061 [Requests to L2 Group2]

Refer to the OSRR document to find other events that better suit your needs.

Apart from the above PMC events, you may consider IBS sampling events as well.