My guess is that 3001 is too small for the CPU_CLK_UNHALTED event. Many samples will be lost during profiling. Please check /var/lib/oprofile/samples/oprofiled.log to see whether any samples were lost due to overflow, etc.
I agree with Lei: a sampling period of 3001 is _way_ too aggressive for a high-frequency event like CPU_CLK_UNHALTED. I've found that a sampling period of 100,000 is a practical lower limit for this event and RETIRED_INSTRUCTIONS. The data collection overhead increases rapidly below 100,000, and the interrupt handling (to collect samples) pollutes the caches, TLBs, and branch history tables. That pollution changes the workload behavior, so what you measure is no longer "representative."
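To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes a 2 GHz core clock (my number, purely for illustration) and that CPU_CLK_UNHALTED increments once per cycle while the core is busy; the function name is just something I made up for the example.

```python
def samples_per_second(clock_hz: float, sampling_period: int) -> float:
    """Approximate sample interrupts per second on one fully busy core.

    Assumes the event fires once per unhalted cycle, so one sample is
    taken every `sampling_period` cycles.
    """
    if sampling_period <= 0:
        raise ValueError("sampling_period must be positive")
    return clock_hz / sampling_period


CLOCK_HZ = 2.0e9  # assumed 2 GHz clock, for illustration only

for period in (3001, 100_000):
    rate = samples_per_second(CLOCK_HZ, period)
    print(f"period {period:>7}: about {rate:,.0f} samples/s per busy core")
```

Under those assumptions a period of 3001 works out to several hundred thousand interrupts per second per busy core, against 20,000 per second at a period of 100,000.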
Generally, the CPU clock is not halted for cache misses. The load-to-use latency for a cache miss is usually pretty short and a clock halt is not necessary, so cycles spent stalled on a miss still count as unhalted. I/O is another issue because the CPU idle period is much longer.
IPC is a local measure of performance and indicates instruction-level parallelism within a small local neighborhood (like a hot loop). Yep, IPC can be affected by a halted clock. However, if IPC is applied within a tight, compute-bound loop, it can still be an effective performance measurement.
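For what it's worth, here is a minimal sketch of how IPC can be estimated from sample counts for a hot region. It assumes you have per-region sample counts for RETIRED_INSTRUCTIONS and CPU_CLK_UNHALTED and know the sampling period used for each; the function name and the numbers are hypothetical.

```python
def estimate_ipc(instr_samples: int, instr_period: int,
                 cycle_samples: int, cycle_period: int) -> float:
    """Estimate instructions per cycle from sampled event counts.

    Each sample stands for roughly `period` occurrences of the event,
    so samples * period approximates the total event count.
    """
    if cycle_samples <= 0 or cycle_period <= 0:
        raise ValueError("need a positive number of cycle samples and period")
    instructions = instr_samples * instr_period
    cycles = cycle_samples * cycle_period
    return instructions / cycles


# Hypothetical sample counts for one hot loop, both events at period 100,000.
ipc = estimate_ipc(instr_samples=1500, instr_period=100_000,
                   cycle_samples=1000, cycle_period=100_000)
print(f"estimated IPC: {ipc:.2f}")
```

The estimate is only as good as the sample counts, so it is most trustworthy for a region with plenty of samples for both events.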
We also recommend disabling clock frequency throttling. If the clock speed is throttled, not all cycles will be the same length of time! You may need to disable any power management software that changes the clock frequency.
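A quick illustration of why this matters when you read cycle counts as time. The frequencies below are made-up examples (a 2 GHz nominal clock throttled to 1 GHz), and the helper name is hypothetical.

```python
def cycles_to_seconds(cycles: float, clock_hz: float) -> float:
    """Convert a cycle count to wall-clock time at a fixed clock frequency."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return cycles / clock_hz


CYCLES = 1.0e9  # the same measured cycle count in both cases

# Assumed frequencies, for illustration only.
for label, hz in (("nominal 2 GHz", 2.0e9), ("throttled 1 GHz", 1.0e9)):
    print(f"{label}: {cycles_to_seconds(CYCLES, hz):.2f} s")
```

The same cycle count covers twice as much wall-clock time at the throttled frequency, so cycle-based profiles from runs with different throttling are not directly comparable as time.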
Yep, I agree -- using CPU_CLK_UNHALTED can be tricky!