I am using a laptop with a Radeon HD 6290. The rated peak performance of the GPU is 44 GFLOPS (checked on Wikipedia). When I run the OpenCL sample examples (provided by AMD) on the GPU, I measure only about 5-6 G instructions/sec. Why is there such a large difference between the peak and the actual performance?
Please see the attached file for more details.
The second-to-last column is instructions/sec (calculated as total work items * ALU instructions * 1000 / time (ms)).
The last column is instructions/sec normalized to 100% ALU busy.
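
To make the two column calculations concrete, here is a minimal Python sketch. The function names and sample numbers are hypothetical (they are not taken from the attached file), and the normalization step assumes the last column was obtained by dividing the measured rate by the ALU busy fraction.

```python
def instructions_per_sec(work_items, alu_instructions, time_ms):
    """Second-to-last column: total work items * ALU instructions * 1000 / time (ms)."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    return work_items * alu_instructions * 1000 / time_ms


def normalized_to_full_alu_busy(instr_per_sec, alu_busy_percent):
    """Last column: scale the measured rate up to 100% ALU busy.

    Assumption: normalization means dividing by the ALU busy fraction.
    """
    if not 0 < alu_busy_percent <= 100:
        raise ValueError("alu_busy_percent must be in (0, 100]")
    return instr_per_sec * 100 / alu_busy_percent


if __name__ == "__main__":
    # Hypothetical kernel: 1,000,000 work items, 50 ALU instructions each,
    # 10 ms execution time, ALU busy 50% of the time.
    rate = instructions_per_sec(1_000_000, 50, 10.0)
    print(f"measured:   {rate / 1e9:.2f} G instructions/sec")
    print(f"normalized: {normalized_to_full_alu_busy(rate, 50.0) / 1e9:.2f} G instructions/sec")
```

Note that this gives instructions/sec, not FLOPS, so the result is not directly comparable to the 44 GFLOPS figure unless each ALU instruction corresponds to a known number of floating-point operations.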