How to relate execution time to the CPU_CLK_UNHALTED sampling?

Discussion created by yingbo on Aug 21, 2009
Latest reply on Oct 14, 2009 by pdrongowski

Hi @ll,

I ran OProfile on a 2.0 GHz AMD Opteron:

opcontrol --event=CPU_CLK_UNHALTED:3001 --image=gzip.exe

The sampling result is 902,645 samples. The execution time of gzip.exe is 9.281 s (measured with the Linux time command).

Estimated execution time = sampling result * sampling interval / CPU frequency:
902,645 * 3001 / 2,000,000,000 = 1.35 s. That is far from the measured 9.281 s.
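
For reference, the estimate above can be reproduced with a short Python sketch. The function name and layout are only illustrative; the numbers are the ones quoted in this post, and the formula is the one stated above, not anything taken from OProfile itself.

```python
def estimated_seconds(samples: int, interval: int, cpu_hz: float) -> float:
    """Estimate run time from a sample count.

    Each sample stands for `interval` occurrences of the event, so
    samples * interval approximates the number of unhalted cycles seen.
    """
    return samples * interval / cpu_hz


if __name__ == "__main__":
    # Figures from the post: 902,645 samples, interval 3001, 2.0 GHz clock.
    estimate = estimated_seconds(902_645, 3001, 2.0e9)
    measured = 9.281  # seconds, from the Linux time command
    print(f"estimated: {estimate:.2f} s, measured: {measured:.3f} s")
```

The estimate only covers cycles that were actually sampled for gzip.exe, so it can be compared directly with the wall-clock figure only if every unhalted cycle of the run was attributed to that image.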

The event CPU_CLK_UNHALTED means "CPU Clocks Not Halted". Does OProfile count the latency caused by cache misses or I/O? These events can cause the CPU to halt. If that time is not counted, the CPU_CLK_UNHALTED sampling result makes little sense and cannot represent program performance.

But the document "Basic Performance Measurements for AMD Athlon™ 64, AMD Opteron™ and AMD Phenom™ Processors" says "IPC = Ret_instructions / CPU_clocks", which would mean that CPU_CLK_UNHALTED does include cache-miss and I/O waiting time. Is that right?

BTW, when I changed the event count to 10001, the sampling result was 752,647. When I changed it to 100001, the result was 167,612. Why doesn't the sampling result scale with the event count?
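
To make the scaling question concrete: if the sample count scaled with the event count, the product of the two would be roughly the same in every run. The Python sketch below computes that product for the three runs reported here. It assumes the event counts written as "1,0001" and "10,0001" mean 10001 and 100001, and the names `RUNS` and `implied_cycles` are only illustrative.

```python
CPU_HZ = 2_000_000_000  # 2.0 GHz, as stated in the post

# (event count per sample, reported sample count) for the three runs.
# 10001 and 100001 are an assumed reading of "1,0001" and "10,0001".
RUNS = [(3001, 902_645), (10001, 752_647), (100001, 167_612)]


def implied_cycles(interval: int, samples: int) -> int:
    """Total event occurrences implied by a sample count."""
    return interval * samples


if __name__ == "__main__":
    for interval, samples in RUNS:
        cycles = implied_cycles(interval, samples)
        seconds = cycles / CPU_HZ
        print(f"interval {interval:>6}: {cycles:>14,} cycles, {seconds:.2f} s")
```

The three products differ by a wide margin instead of agreeing, which is exactly the inconsistency being asked about.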

I'm confused...

Any suggestions are welcome.