To determine the total number of cache misses from one core, select only a single core using UnitMask[7:4] and set UnitMask[2:0] to 111b.
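As a rough sketch of how such a unit mask is put together, the snippet below packs a core-select value into UnitMask[7:4] and sets UnitMask[2:0] to 111b. The helper name `build_unit_mask` and the core-select value 0x1 are made up for illustration. How UnitMask[7:4] maps to a physical core depends on the processor revision, so check the BKDG for your part before relying on any particular value.

```python
def build_unit_mask(core_select: int, request_bits: int = 0b111) -> int:
    """Pack UnitMask[7:4] (core select) and UnitMask[2:0] into one byte.

    Hypothetical helper: the meaning of core_select is revision-specific
    and is NOT defined here; the caller supplies the 4-bit value.
    """
    if not 0 <= core_select <= 0xF:
        raise ValueError("core_select must fit in 4 bits")
    if not 0 <= request_bits <= 0b111:
        raise ValueError("request_bits must fit in 3 bits")
    # Bit 3 is left clear; only the two fields named above are set.
    return (core_select << 4) | request_bits


# Illustrative core-select value only (assumption, not from the BKDG).
mask = build_unit_mask(0x1)
print(f"{mask:#04x} = {mask}")  # 0x17 = 23
```

The decimal form is the one to pass to a tool that takes the unit mask as an integer from 0 to 255.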
It sounds like you may be getting the event on all cores instead of just one. For the L2 misses, are you using event 07Eh?
How many cores do you have? What's the system topology, if you know it?
Magny-Cours is a 12-core multichip processor: two 6-core processors on a single package. For L2 I'm using 77Eh, as suggested here: http://developer.amd.com/Assets/intro_to_ca_v3_final.pdf
For L3, I'm using f74e1h, which counts L3 cache misses from any core and is suggested in the same document. How can I derive the total number of L3 cache misses from that event? It seems to give me some sort of aggregated result, but I'm still getting much higher numbers even when I normalize by 12. One other weird behavior: one set of 6 cores (either the first or the second) shows approximately 100 times more L3 cache misses than the other set.
What is the per-core breakdown of L3 cache misses that you are seeing?
I have a similar problem on a Magny-Cours system with 4 CPUs, 12 cores each (two groups of 6 cores).
I'm using PAPI, which allows me to set the counter mask (unit mask) as an integer from 0 to 255.
I'm trying to get the L3 misses on one specific core. I followed your advice and the documentation: I select the core using UnitMask[7:4] and set UnitMask[2:0] to 111b. Strangely, when I do that, I get a count of zero.
I tried every value for the UnitMask. In fact, I ran my little app with UnitMask = 0 up to 255, and the only times I get a count are when the unit mask is 0 or 1.
I use taskset -c to pin my run to a specific core. I tried pinning to cores 0, 1, 2, 3, 4, and 5, but the same thing happens.
Am I doing something wrong?
The command I'm using looks like the following (note: I'm using PAPI through the papiex command-line tool; everything else works, but nothing involving L3 seems to work):
taskset -c 21 papiex -eL3_CACHE_MISSES:c=7 -- ./a.out
where c specifies the unit mask as an integer from 0 to 255.
I'd appreciate any input.
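For what it's worth, here is a small sketch that splits a 0-255 unit mask integer back into the two fields discussed in this thread, UnitMask[7:4] and UnitMask[2:0]. It also lists the values that have the low three bits set to 111b. The function name `split_unit_mask` is made up for illustration, and leaving bit 3 clear is an assumption, since the thread does not say what that bit does.

```python
def split_unit_mask(value: int) -> tuple[int, int]:
    """Return (UnitMask[7:4], UnitMask[2:0]) for an integer in 0..255."""
    if not 0 <= value <= 0xFF:
        raise ValueError("unit mask must be in the range 0..255")
    return value >> 4, value & 0b111


# The value passed as c=7 in the command above.
core_select, low_bits = split_unit_mask(7)
print(f"c=7 -> UnitMask[7:4]={core_select:#x}, UnitMask[2:0]={low_bits:03b}b")

# All values with UnitMask[2:0] = 111b and bit 3 clear (assumption).
candidates = [(nibble << 4) | 0b111 for nibble in range(16)]
print(candidates)
```

With c=7 the low bits are 111b but UnitMask[7:4] is 0, so the only core-select information in that mask is whatever a value of 0 in that field means on this revision.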