4 Replies Latest reply on Sep 7, 2012 3:49 AM by rocky67

    Problem with performance counters on Opteron 6172

    aleksr9

      Hi,

      I've been trying to analyze certain applications with performance counters on an Opteron 6172 running Red Hat Enterprise Linux Workstation release 6.2 (Santiago).

       

      I'm using PAPI v4.1.3.0, which uses the AMD native events CPU_CLK_UNHALTED to count total cycles and DATA_CACHE_ACCESSES to count L1 data cache accesses.

       

      http://support.amd.com/us/Processor_TechDocs/31116.pdf

      - CPU_CLK_UNHALTED

      The number of clocks that the CPU is not in a halted state (due to STPCLK or a HLT instruction). Note: this event allows system idle time to be automatically factored out from IPC (or CPI) measurements, providing the OS halts the CPU when going idle. If the OS goes into an idle loop rather than halting, such calculations are influenced by the IPC of the idle loop.

       

      - DATA_CACHE_ACCESSES

      The number of accesses to the data cache for load and store references. This may include certain microcode scratchpad accesses, although these are generally rare. Each increment represents an eight-byte access, although the instruction may only be accessing a portion of that. This event is a speculative event.

       

      The problem I've been experiencing is that, in some cases, the number of cache accesses is higher than the total number of cycles. To my understanding, a cache access does not halt the CPU, so the access count should fit within the total cycle count. Also, when I divide the total cycles by the clock frequency of the Opteron 6172, I get a fairly accurate estimate of the runtime, which makes me think the cycle count is correct and the problem lies in how the data cache accesses are counted.
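
To make the two checks described above concrete, here is a minimal Python sketch. The counter values are made up for illustration, the 2.1 GHz figure is an assumed nominal clock for the Opteron 6172, and the helper names are hypothetical. None of this is measured data.

```python
# Sanity checks on two raw counter readings.
# All numbers below are illustrative assumptions, not real measurements.

def estimated_runtime_seconds(total_cycles, clock_hz):
    """Estimate wall-clock runtime from an unhalted-cycle count."""
    return total_cycles / clock_hz


def accesses_per_cycle(dc_accesses, total_cycles):
    """Average number of counted data cache accesses per unhalted cycle."""
    return dc_accesses / total_cycles


CLOCK_HZ = 2.1e9              # assumed nominal clock of the Opteron 6172
total_cycles = 4_200_000_000  # hypothetical CPU_CLK_UNHALTED reading
dc_accesses = 5_040_000_000   # hypothetical DATA_CACHE_ACCESSES reading

runtime = estimated_runtime_seconds(total_cycles, CLOCK_HZ)
ratio = accesses_per_cycle(dc_accesses, total_cycles)

print(f"estimated runtime: {runtime:.2f} s")
print(f"accesses per cycle: {ratio:.2f}")
if ratio > 1.0:
    print("more counted accesses than cycles")
```

A ratio above 1.0 is the situation described in this post: the estimated runtime looks plausible, yet the access count exceeds the cycle count.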

       

      Any help or explanation of why this can occur is greatly appreciated. Thanks in advance.