I ran OProfile on a 2.0 GHz AMD Opteron with this setup:
opcontrol --event=CPU_CLK_UNHALTED:3001 --image=gzip.exe
The result is 902,645 samples. The execution time of gzip.exe is 9.281s (measured with the Linux time command).
Estimated execution time = sample count * sampling interval / CPU frequency
902,645 * 3001 / 2,000,000,000 = 1.35s, which is far below the measured 9.281s.
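To double-check the arithmetic, here is a minimal Python sketch of the estimate above. The numbers are the ones from my run; the function and variable names are just illustrative, and it assumes the clock stays at 2.0 GHz for the whole run (no frequency scaling).

```python
# Values from the run described above. Assumption: the CPU runs at a constant
# 2.0 GHz, so cycles convert directly to seconds.
SAMPLES = 902_645
EVENT_COUNT = 3001            # events per sample, from CPU_CLK_UNHALTED:3001
CPU_HZ = 2_000_000_000
WALL_SECONDS = 9.281          # reported by the time command


def estimated_seconds(samples, event_count, cpu_hz):
    """samples * events-per-sample gives cycles; dividing by Hz gives seconds."""
    return samples * event_count / cpu_hz


estimate = estimated_seconds(SAMPLES, EVENT_COUNT, CPU_HZ)
print(f"estimated: {estimate:.3f} s, measured: {WALL_SECONDS:.3f} s")
print(f"share of wall time accounted for: {estimate / WALL_SECONDS:.1%}")
```

So the samples attributed to gzip.exe account for only a small share of the wall-clock time.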
The event CPU_CLK_UNHALTED means "CPU Clocks Not Halted". Does OProfile count the latency caused by cache misses or I/O? These events might cause the CPU to halt. If that time is not counted, the CPU_CLK_UNHALTED sampling result makes little sense and cannot represent program performance.
But the document "Basic Performance Measurements for AMD Athlon™ 64, AMD Opteron™ and AMD Phenom™ Processors" says "IPC = Ret_instructions / CPU_clocks", which seems to imply that CPU_CLK_UNHALTED does count cache-miss and I/O waiting time. Is that right?
BTW, when I changed the event count to 10,001 the sampling result was 752,647, and with 100,001 it was 167,612. Why doesn't the number of samples scale inversely with the event count?
Any suggestions are welcome.
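To make the scaling question concrete, here is a small Python sketch that multiplies each event count by its sample count. If the samples scaled properly, all three runs should imply roughly the same number of cycles. It assumes the two larger event counts are 10,001 and 100,001 and that the clock is a constant 2.0 GHz; the names are illustrative only.

```python
# (event count, samples) for the three runs in this post.
# Assumption: the larger event counts are 10,001 and 100,001.
CPU_HZ = 2_000_000_000  # assumed constant 2.0 GHz clock
runs = [(3001, 902_645), (10_001, 752_647), (100_001, 167_612)]


def implied_cycles(event_count, samples):
    """Each sample stands for event_count unhalted cycles."""
    return event_count * samples


results = [(count, samples, implied_cycles(count, samples))
           for count, samples in runs]

for count, samples, cycles in results:
    print(f"count={count:>7}  samples={samples:>7}  "
          f"cycles={cycles:>14,}  time={cycles / CPU_HZ:.3f} s")
```

The implied cycle totals differ by several times between the runs, which is the inconsistency I am asking about.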