0 Replies Latest reply on Jun 5, 2015 10:02 AM by mah

    L3 cache counter not available in Oprofile

    mah

      Hello,

      Using OProfile 1.0.0 on my Opteron 6274 (family 15h), I see no event for the L3 cache. Technically, Magny-Cours products have an L3 cache.

       

      Please see the output of ophelp below:

       

      [root@tiger exe]# ophelp

      oprofile: available events for CPU type "AMD64 family15h"

       

      See BIOS and Kernel Developer's Guide for AMD Family 15h Processors

      For architectures using unit masks, you may be able to specify

      unit masks by name.  See 'operf' or 'ocount' man page for more details.

       

      DISPATCHED_FPU_OPS: (counter: 3)

              FPU Pipe Assignment (min count: 500)

              Unit masks (default 0xff)

              ----------

              0x01: Total number uops assigned to Pipe 0

              0x02: Total number uops assigned to Pipe 1

              0x04: Total number uops assigned to Pipe 2

              0x08: Total number uops assigned to Pipe 3

              0x10: Total number dual-pipe uops assigned to Pipe 0

              0x20: Total number dual-pipe uops assigned to Pipe 1

              0x40: Total number dual-pipe uops assigned to Pipe 2

              0x80: Total number dual-pipe uops assigned to Pipe 3

              0xff: All ops

      CYCLES_FPU_EMPTY: (counter: 3, 4, 5)

              FP Scheduler Empty (min count: 500)

      RETIRED_SSE_OPS: (counter: 3)

              Retired SSE/BNI Ops (min count: 500)

              Unit masks (default 0xff)

              ----------

              0x01: Single Precision add/subtract FLOPS

              0x02: Single precision multiply FLOPS

              0x04: Single precision divide/square root FLOPS

              0x08: Single precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS

              0x10: Double precision add/subtract FLOPS

              0x20: Double precision multiply FLOPS

              0x40: Double precision divide/square root FLOPS

              0x80: Double precision multiply-add FLOPS. Multiply-add counts as 2 FLOPS

      MOVE_SCALAR_OPTIMIZATION: (counter: 3)

              Number of Move Elimination and Scalar Op Optimization (min count: 500)

              Unit masks (default 0xc)

              ----------

              0x01: Number of SSE Move Ops

              0x02: Number of SSE Move Ops eliminated

              0x04: Number of Ops that are candidates for optimization

              0x08: Number of Scalar ops optimized

      RETIRED_SERIALIZING_OPS: (counter: 3, 4, 5)

              Retired Serializing Ops (min count: 500)

              Unit masks (default 0xf)

              ----------

              0x01: SSE bottom-executing uops retired

              0x02: SSE control word mispredict traps due to mispredictions

              0x04: x87 bottom-executing uops retired

              0x08: x87 control word mispredict traps due to mispredictions

      BOTTOM_EXECUTE_OP: (counter: 3, 4, 5)

              Number of Cycles that a Bottom-Execute uop is in the FP Scheduler (min count: 500)

      SEGMENT_REGISTER_LOADS: (counter: all)

              Segment Register Loads (min count: 500)

              Unit masks (default 0x7f)

              ----------

              0x01: ES register

              0x02: CS register

              0x04: SS register

              0x08: DS register

              0x10: FS register

              0x20: GS register

              0x40: HS register

      PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE: (counter: all)

              Pipeline Restart Due to Self-Modifying Code (min count: 500)

      PIPELINE_RESTART_DUE_TO_PROBE_HIT: (counter: all)

              Pipeline Restart Due to Probe Hit (min count: 500)

      LOAD_Q_STORE_Q_FULL: (counter: 0, 1, 2)

              Load Queue/Store Queue Full (min count: 500)

              Unit masks (default 0x3)

              ----------

              0x01: Cycles that the load buffer is full

              0x02: Cycles that the store buffer is full

      LOCKED_OPS: (counter: all)

              Locked Operations (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: Number of locked instructions executed

              0x04: Cycles spent non-speculative phase (including cache miss penalty)

              0x08: Cycles waiting for a cache hit (cache miss penalty)

      RETIRED_CLFLUSH_INSTRUCTIONS: (counter: all)

              Retired CLFLUSH Instructions (min count: 500)

      RETIRED_CPUID_INSTRUCTIONS: (counter: all)

              Retired CPUID Instructions (min count: 500)

      LS_DISPATCH: (counter: all)

              LS Dispatch (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: Loads

              0x02: Stores

              0x04: Load-op-Stores

      CANCELLED_STORE_TO_LOAD: (counter: all)

              Canceled Store to Load Forward Operations (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: Store is smaller than load or different starting byte but partial overlap

      SMIS_RECEIVED: (counter: all)

              SMIs Received (min count: 500)

      EXECUTED_CFLUSH_INST: (counter: all)

              Executed CLFLUSH Instructions (min count: 500)

      DATA_CACHE_ACCESSES: (counter: all)

              Data Cache Accesses (min count: 500)

      DATA_CACHE_MISSES: (counter: all)

              Data Cache Misses (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: First data cache miss or streaming store to a 64B cache line

              0x02: First streaming store to a 64B cache line

      DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE: (counter: all)

              Data Cache Refills from L2 or System (min count: 500)

              Unit masks (default 0xb)

              ----------

              0x01: Fill with good data. (Final valid status is valid)

              0x02: Early valid status turned out to be invalid

              0x08: Fill with read data error

      DATA_CACHE_REFILLS_FROM_NORTHBRIDGE: (counter: 0, 1, 2)

              Data Cache Refills from System (min count: 500)

      UNIFIED_TLB_HIT: (counter: 0, 1, 2)

              Unified TLB Hit (min count: 50000)

              Unit masks (default 0x77)

              ----------

              0x01: 4 KB unified TLB hit for data

              0x02: 2 MB unified TLB hit for data

              0x04: 1 GB unified TLB hit for data

              0x10: 4 KB unified TLB hit for instruction

              0x20: 2 MB unified TLB hit for instruction

              0x40: 1 GB unified TLB hit for instruction

              0x07: All DTLB hits

              0x70: All ITLB hits

              0x77: All DTLB and ITLB hits

      UNIFIED_TLB_MISS: (counter: 0, 1, 2)

              Unified TLB Miss (min count: 500)

              Unit masks (default 0x77)

              ----------

              0x01: 4 KB unified TLB miss for data

              0x02: 2 MB unified TLB miss for data

              0x04: 1 GB unified TLB miss for data

              0x10: 4 KB unified TLB miss for instruction

              0x20: 2 MB unified TLB miss for instruction

              0x40: 1 GB unified TLB miss for instruction

              0x07: All DTLB misses

              0x70: All ITLB misses

              0x77: All DTLB and ITLB misses

      MISALIGNED_ACCESSES: (counter: all)

              Misaligned Accesses (min count: 500)

      PREFETCH_INSTRUCTIONS_DISPATCHED: (counter: all)

              Prefetch Instructions Dispatched (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: Load (Prefetch, PrefetchT0/T1/T2)

              0x02: Store (PrefetchW)

              0x04: NTA (PrefetchNTA)

      INEFFECTIVE_SW_PREFETCHES: (counter: all)

              Ineffective Software Prefetches (min count: 500)

              Unit masks (default 0x9)

              ----------

              0x01: Software prefetch hit in L1 data cache

              0x08: Software prefetch hit in the L2

      MEMORY_REQUESTS: (counter: 0, 1, 2)

              Memory Requests by Type (min count: 500)

              Unit masks (default 0x83)

              ----------

              0x01: Requests to non-cacheable (UC) memory

              0x02: Requests to write-combining (WC) memory

              0x80: Streaming store (SS) requests

      DATA_PREFETCHER: (counter: 0, 1, 2)

              Data Prefetcher (min count: 500)

              Unit masks (default 0x2)

              ----------

              0x02: Prefetch attempts

      MAB_REQS: (counter: 0, 1, 2)

              MAB Requests (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: MAB ID bit 0

              0x02: MAB ID bit 1

              0x04: MAB ID bit 2

              0x08: MAB ID bit 3

              0x10: MAB ID bit 4

              0x20: MAB ID bit 5

              0x40: MAB ID bit 6

              0x80: MAB ID bit 7

      MAB_WAIT: (counter: 0, 1, 2)

              MAB Wait Cycles (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: MAB ID bit 0

              0x02: MAB ID bit 1

              0x04: MAB ID bit 2

              0x08: MAB ID bit 3

              0x10: MAB ID bit 4

              0x20: MAB ID bit 5

              0x40: MAB ID bit 6

              0x80: MAB ID bit 7

      SYSTEM_READ_RESPONSES: (counter: 0, 1, 2)

              Response From System on Cache Refills (min count: 500)

              Unit masks (default 0x3f)

              ----------

              0x01: Exclusive

              0x02: Modified

              0x04: Shared

              0x08: Owned

              0x10: Data Error

              0x20: Modified unwritten

      OCTWORD_WRITE_TRANSFERS: (counter: 0, 1, 2)

              Octwords Written to System (min count: 500)

              Unit masks (default 0x1)

              ----------

              0x01: Octword write transfer

      CPU_CLK_UNHALTED: (counter: 0, 1, 2)

              CPU Clocks not Halted (min count: 50000)

      REQUESTS_TO_L2: (counter: 0, 1, 2)

              Requests to L2 Cache (min count: 500)

              Unit masks (default 0x47)

              ----------

              0x01: IC fill

              0x02: DC fill

              0x04: TLB fill (page table walks)

              0x08: NB probe request

              0x10: Canceled request

              0x40: L2 cache prefetcher request

      L2_CACHE_MISS: (counter: 0, 1, 2)

              L2 Cache Misses (min count: 500)

              Unit masks (default 0x17)

              ----------

              0x01: IC fill

              0x02: DC fill (includes possible replays, whereas PMCx041 does not)

              0x04: TLB page table walks

              0x10: L2 cache prefetcher request

      L2_CACHE_FILL_WRITEBACK: (counter: 0, 1, 2)

              L2 Fill/Writeback (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: L2 fills from system

              0x02: L2 Writebacks to system (Clean and Dirty)

              0x04: L2 Clean Writebacks to system

      PAGE_SPLINTERING: (counter: 0, 1, 2)

              Page Splintering (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: Guest page size is larger than host page size when nested paging is enabled

              0x02: Splintering due to MTRRs, IORRs, APIC, TOMs or other special address region

              0x04: Host page size is larger than the guest page size

      L2_PREFETCHER_TRIGGER: (counter: 0, 1, 2)

              L2 Prefetcher Trigger Events (min count: 500)

              Unit masks (default 0x3)

              ----------

              0x01: Load L1 miss seen by prefetcher

              0x02: Store L1 miss seen by prefetcher

      INSTRUCTION_CACHE_FETCHES: (counter: 0, 1, 2)

              Instruction Cache Fetches (min count: 500)

      INSTRUCTION_CACHE_MISSES: (counter: 0, 1, 2)

              Instruction Cache Misses (min count: 500)

      INSTRUCTION_CACHE_REFILLS_FROM_L2: (counter: 0, 1, 2)

              Instruction Cache Refills from L2 (min count: 500)

      INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM: (counter: 0, 1, 2)

              Instruction Cache Refills from System (min count: 500)

      L1_ITLB_MISS_AND_L2_ITLB_HIT: (counter: 0, 1, 2)

              L1 ITLB Miss, L2 ITLB Hit (min count: 500)

      L1_ITLB_MISS_AND_L2_ITLB_MISS: (counter: 0, 1, 2)

              L1 ITLB Miss, L2 ITLB Miss (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: Instruction fetches to a 4K page

              0x02: Instruction fetches to a 2M page

              0x04: Instruction fetches to a 1G page

      PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE: (counter: 0, 1, 2)

              Pipeline Restart Due to Instruction Stream Probe (min count: 500)

      INSTRUCTION_FETCH_STALL: (counter: 0, 1, 2)

              Instruction Fetch Stall (min count: 500)

      RETURN_STACK_HITS: (counter: 0, 1, 2)

              Return Stack Hits (min count: 500)

      RETURN_STACK_OVERFLOWS: (counter: 0, 1, 2)

              Return Stack Overflows (min count: 500)

      INSTRUCTION_CACHE_VICTIMS: (counter: 0, 1, 2)

              Instruction Cache Victims (min count: 500)

      INSTRUCTION_CACHE_INVALIDATED: (counter: 0, 1, 2)

              Instruction Cache Lines Invalidated (min count: 500)

              Unit masks (default 0xf)

              ----------

              0x01: Non-SMC invalidating probe that missed on in-flight instructions

              0x02: Non-SMC invalidating probe that hit on in-flight instructions

              0x04: SMC invalidating probe that missed on in-flight instructions

              0x08: SMC invalidating probe that hit on in-flight instructions

      ITLB_RELOADS: (counter: 0, 1, 2)

              ITLB Reloads (min count: 500)

      ITLB_RELOADS_ABORTED: (counter: 0, 1, 2)

              ITLB Reloads Aborted (min count: 500)

      RETIRED_INSTRUCTIONS: (counter: all)

              Retired Instructions (min count: 50000)

      RETIRED_UOPS: (counter: all)

              Retired uops (min count: 50000)

      RETIRED_BRANCH_INSTRUCTIONS: (counter: all)

              Retired Branch Instructions (min count: 500)

      RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS: (counter: all)

              Retired Mispredicted Branch Instructions (min count: 500)

      RETIRED_TAKEN_BRANCH_INSTRUCTIONS: (counter: all)

              Retired Taken Branch Instructions (min count: 500)

      RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED: (counter: all)

              Retired Taken Branch Instructions Mispredicted (min count: 500)

      RETIRED_FAR_CONTROL_TRANSFERS: (counter: all)

              Retired Far Control Transfers (min count: 500)

      RETIRED_BRANCH_RESYNCS: (counter: all)

              Retired Branch Resyncs (min count: 500)

      RETIRED_NEAR_RETURNS: (counter: all)

              Retired Near Returns (min count: 500)

      RETIRED_NEAR_RETURNS_MISPREDICTED: (counter: all)

              Retired Near Returns Mispredicted (min count: 500)

      RETIRED_INDIRECT_BRANCHES_MISPREDICTED: (counter: all)

              Retired Indirect Branches Mispredicted (min count: 500)

      RETIRED_MMX_FP_INSTRUCTIONS: (counter: all)

              Retired MMX/FP Instructions (min count: 500)

              Unit masks (default 0x7)

              ----------

              0x01: x87 instructions

              0x02: MMX(tm) instructions

              0x04: SSE instructions (SSE,SSE2,SSE3,SSSE3,SSE4A,SSE4.1,SSE4.2,AVX,XOP,FMA4)

      INTERRUPTS_MASKED_CYCLES: (counter: all)

              Interrupts-Masked Cycles (min count: 500)

      INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING: (counter: all)

              Interrupts-Masked Cycles with Interrupt Pending (min count: 500)

      INTERRUPTS_TAKEN: (counter: all)

              Interrupts Taken (min count: 500)

      DECODER_EMPTY: (counter: 0, 1, 2)

              Decoder Empty (min count: 500)

      DISPATCH_STALLS: (counter: 0, 1, 2)

              Dispatch Stalls (min count: 500)

      DISPATCH_STALL_FOR_SERIALIZATION: (counter: 0, 1, 2)

              Microsequencer Stall due to Serialization (min count: 500)

      DISPATCH_STALL_FOR_RETIRE_QUEUE_FULL: (counter: 0, 1, 2)

              Dispatch Stall for Instruction Retire Q Full (min count: 500)

      DISPATCH_STALL_FOR_INT_SCHED_QUEUE_FULL: (counter: 0, 1, 2)

              Dispatch Stall for Integer Scheduler Queue Full (min count: 500)

      DISPATCH_STALL_FOR_FPU_FULL: (counter: 0, 1, 2)

              Dispatch Stall for FP Scheduler Queue Full (min count: 500)

      DISPATCH_STALL_FOR_LDQ_FULL: (counter: 0, 1, 2)

              Dispatch Stall for LDQ Full (min count: 500)

      MICROSEQ_STALL_WAITING_FOR_ALL_QUIET: (counter: 0, 1, 2)

              Microsequencer Stall Waiting for All Quiet (min count: 500)

      FPU_EXCEPTIONS: (counter: all)

              FPU Exceptions (min count: 500)

              Unit masks (default 0x1f)

              ----------

              0x01: Total microfaults

              0x02: Total microtraps

              0x04: Int2Ext faults

              0x08: Ext2Int faults

              0x10: Bypass faults

      DR0_BREAKPOINTS: (counter: all)

              DR0 Breakpoint Match (min count: 500)

      DR1_BREAKPOINTS: (counter: all)

              DR1 Breakpoint Match (min count: 500)

      DR2_BREAKPOINTS: (counter: all)

              DR2 Breakpoint Match (min count: 500)

      DR3_BREAKPOINTS: (counter: all)

              DR3 Breakpoint Match (min count: 500)

      IBS_OPS_TAGGED: (counter: all)

              Tagged IBS Ops (min count: 50000)

              Unit masks (default 0x1)

              ----------

              0x01: Number of ops tagged by IBS

              0x02: Number of ops tagged by IBS that retired

              0x04: Number of times op could not be tagged by IBS because of previous tagged op that has

                    not retired

      DISPATCH_STALL_FOR_STQ_FULL: (counter: all)

              Dispatch Stall for STQ Full (min count: 500)
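
      To check a listing this long without reading every entry, it can be scanned mechanically. The following is a minimal sketch, not part of OProfile: the helper names `list_events` and `events_mentioning` are hypothetical, the regular expression assumes event headers of the form `NAME: (counter: ...)` as in the output above, and `SAMPLE` is a short excerpt of that output, not a complete listing.

```python
import re

# Matches an ophelp event header such as "L2_CACHE_MISS: (counter: 0, 1, 2)".
# Unit-mask lines ("0x01: IC fill") and description lines do not match,
# because they lack the "(counter: ...)" part.
EVENT_HEADER = re.compile(r"^\s*([A-Z0-9_]+):\s*\(counter:\s*([^)]*)\)")


def list_events(ophelp_text):
    """Return {event_name: counter_spec} for every event header in the text."""
    events = {}
    for line in ophelp_text.splitlines():
        match = EVENT_HEADER.match(line)
        if match:
            events[match.group(1)] = match.group(2).strip()
    return events


def events_mentioning(ophelp_text, needle):
    """Return the sorted event names that contain the given substring."""
    return sorted(name for name in list_events(ophelp_text) if needle in name)


# Excerpt copied from the ophelp output in this post (illustration only).
SAMPLE = """\
      REQUESTS_TO_L2: (counter: 0, 1, 2)
              Requests to L2 Cache (min count: 500)
              Unit masks (default 0x47)
              ----------
              0x01: IC fill
      L2_CACHE_MISS: (counter: 0, 1, 2)
              L2 Cache Misses (min count: 500)
      RETIRED_INSTRUCTIONS: (counter: all)
              Retired Instructions (min count: 50000)
"""

if __name__ == "__main__":
    print(events_mentioning(SAMPLE, "L2"))
    print(events_mentioning(SAMPLE, "L3"))
```

      Run against the full text (for example, saved from `ophelp > events.txt`), an empty result for "L3" would only confirm that no event name contains that string. It would not show whether the hardware lacks such counters or whether this OProfile version simply does not list them.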