Memory bandwidth

Would a measure of total L1 cache accesses (hits and misses) be representative of the number of load and store instructions executed in an application?


Non-temporal stores could skew this number, but that isn't a problem here. Are there any other gotchas that could make it inaccurate?


I am looking for the total number of memory instructions executed. I do not care whether the data they access is found in L1 or in system memory.