I might be missing something, but I don't see a way to measure the amount of data written by kernels via GPUPerf. I would like this so that I can automatically generate bandwidth estimates for my kernels. So far I have been counting bytes by hand.
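To make the goal concrete, here is a minimal sketch of the estimate in question, assuming the byte counts and the kernel time are already known. The function name and the example numbers are hypothetical and nothing here comes from GPUPerf; the byte counts are exactly the inputs I would like the tool to supply.

```python
def effective_bandwidth_gb_per_s(bytes_read, bytes_written, elapsed_seconds):
    """Return effective bandwidth in GB/s, taking 1 GB as 1e9 bytes.

    Hypothetical helper: the byte counts are the hand-counted values,
    and elapsed_seconds is the measured kernel execution time.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_bytes = bytes_read + bytes_written
    return total_bytes / elapsed_seconds / 1e9


if __name__ == "__main__":
    # Made-up example: a copy kernel over 2**20 four-byte floats,
    # so it reads and writes the same number of bytes.
    n = 1 << 20
    bytes_read = n * 4
    bytes_written = n * 4
    elapsed = 0.5e-3  # assumed kernel time in seconds, not a measurement
    bw = effective_bandwidth_gb_per_s(bytes_read, bytes_written, elapsed)
    print(f"{bw:.2f} GB/s")
```

With a counter for bytes written (and one for bytes read), both byte arguments could be filled in per kernel instead of being worked out by hand.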