For several reasons [1] we’re looking for a way to measure evictions from the L2 to the L3 victim cache. Preliminary results of our performance investigations indicate that the L3 is not a “true” victim cache, i.e., one that receives every cache line selected for replacement in the L2 cache above it; instead, some cache lines (e.g., unmodified and unshared ones) seem to exist in L2 and L3 in parallel and can thus be dropped silently from the L2 upon replacement. We’d like to verify this using hardware performance counters. However, in the documentation available to us [2,3] there is only an event that counts the number of cache-line transfers from the L3 to the L2 cache. So here are our questions:
- Is there an event to count cache-line evictions from L2 to L3?
- Can you share this event? If it's not publicly available, can you share it under NDA (which we have)?
- Related question: Will fabric counter events be published in the future?
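
To make the distinction we are asking about concrete, here is a toy model of the two behaviors. It is only a sketch under strong assumptions and not a description of the actual hardware: the L2 is modeled as fully associative with LRU replacement, sharing state is ignored, and the function name `count_l2_to_l3_evictions` and its parameters are hypothetical. It counts how many lines would be transferred from L2 to L3, which is the quantity we would like to read from a hardware event.

```python
from collections import OrderedDict


def count_l2_to_l3_evictions(accesses, l2_lines, drop_clean_silently):
    """Toy model: count lines transferred from L2 to L3 on replacement.

    accesses: iterable of (line_address, is_write) tuples.
    l2_lines: L2 capacity in cache lines (fully associative LRU, an assumption).
    drop_clean_silently: if True, unmodified victims are dropped without a
        transfer (the behavior we suspect); if False, every victim is
        transferred (a "true" victim cache).
    """
    l2 = OrderedDict()  # line address -> dirty flag, oldest entry first
    transfers = 0
    for addr, is_write in accesses:
        if addr in l2:
            # Hit: update the dirty flag and mark the line most recently used.
            l2[addr] = l2[addr] or is_write
            l2.move_to_end(addr)
            continue
        if len(l2) >= l2_lines:
            # Miss in a full L2: replace the least recently used line.
            _, dirty = l2.popitem(last=False)
            if dirty or not drop_clean_silently:
                transfers += 1
        l2[addr] = is_write
    return transfers


# Read-only sweep over 8 lines, done twice, with a 4-line L2.
stream = [(addr, False) for addr in range(8)] * 2
true_victim = count_l2_to_l3_evictions(stream, 4, drop_clean_silently=False)
silent_drop = count_l2_to_l3_evictions(stream, 4, drop_clean_silently=True)
print(true_victim, silent_drop)
```

For a read-only stream the two policies give very different transfer counts in this model, so an L2-to-L3 eviction event would let us tell them apart with a simple streaming benchmark.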
[1] For one thing, we’re developing the LIKWID tool, used by many HPC experts for low-level analysis, which would benefit from additional hardware events; for another, we’re developing the ECM performance model, which requires a good understanding of data transfers inside the cache/memory hierarchy to deliver good results for a particular micro-architecture.
[2] Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh
[3] Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors
Thanks a lot,
John