I used a Radeon 5850 for OpenCL development and have now upgraded to a Radeon R9 280.
Using CodeXL, I profiled the same kernel on both cards and got 100% occupancy on the 5850 but only 40% on the R9 280. CodeXL shows 7 VGPRs used on the 5850, versus 52 VGPRs and 28 SGPRs used on the 280.
So my question is basically: what is going on here? First, is the notion of a VGPR different between Cypress and Tahiti? And if not, why does the code use so many more registers on Tahiti?
By the way, the kernel does run twice as fast on the 280 as on the 5850 (according to CodeXL's measurement), which is about what I expected, but I still find the occupancy and register numbers strange and I'm not sure how to deal with them.
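
For what it's worth, the 40% figure itself seems consistent with a simple register-limited estimate. The sketch below uses Python just for the arithmetic. The hardware limits in it are my assumptions about GCN (256 VGPRs per SIMD lane, at most 10 wavefronts per SIMD, VGPRs allocated in blocks of 4), not values reported by CodeXL, and the function name is made up for illustration.

```python
# Rough occupancy estimate for one GCN SIMD, limited by VGPR usage only.
# ASSUMPTIONS (not taken from CodeXL output):
#   - 256 VGPRs available per SIMD lane
#   - at most 10 wavefronts resident per SIMD
#   - VGPRs are allocated in blocks of 4
MAX_VGPRS_PER_SIMD = 256
MAX_WAVES_PER_SIMD = 10
VGPR_GRANULARITY = 4


def vgpr_limited_occupancy(vgprs_per_work_item: int) -> float:
    """Fraction of the wavefront slots that fit when only VGPRs are the limit."""
    if vgprs_per_work_item <= 0:
        raise ValueError("VGPR count must be positive")
    # Round the request up to the assumed allocation granularity.
    blocks = -(-vgprs_per_work_item // VGPR_GRANULARITY)
    allocated = blocks * VGPR_GRANULARITY
    # Wavefronts that fit in the register file, capped by the slot limit.
    waves = min(MAX_WAVES_PER_SIMD, MAX_VGPRS_PER_SIMD // allocated)
    return waves / MAX_WAVES_PER_SIMD


# 52 VGPRs is the value CodeXL shows for the R9 280.
print(vgpr_limited_occupancy(52))
```

Under those assumptions, 52 VGPRs would leave room for only 4 of 10 wavefronts, which lines up with the 40% CodeXL reports. It does not apply to the 5850, and it still leaves open why the compiler needs 52 VGPRs on Tahiti in the first place.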