I experienced the same problem when upgrading to 12.1: about a 15% performance drop. Increasing the workgroup size from 64 to 256 helped me reclaim most of the lost performance on one of my kernels.
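For anyone wanting to try the same workgroup-size change, here is a minimal sketch of the host-side arithmetic involved. It assumes an OpenCL 1.x style launch, where the global work size has to be a multiple of the local (workgroup) size, so the global size is padded and the kernel is expected to bounds-check the extra work-items. The function names and the item count of 1,000,000 are made up for illustration and are not taken from the app discussed here.

```python
def padded_global_size(n_items: int, local_size: int) -> int:
    """Round n_items up to the next multiple of local_size.

    Hypothetical helper: OpenCL 1.x requires the global work size to be
    evenly divisible by the local work size, so the launch is padded.
    """
    if n_items < 0 or local_size <= 0:
        raise ValueError("n_items must be >= 0 and local_size must be > 0")
    return ((n_items + local_size - 1) // local_size) * local_size


def launch_shape(n_items: int, local_size: int) -> tuple[int, int]:
    """Return (global_size, number_of_workgroups) for a 1-D launch."""
    global_size = padded_global_size(n_items, local_size)
    return global_size, global_size // local_size


if __name__ == "__main__":
    # Compare the two workgroup sizes mentioned above for an assumed
    # problem size of 1,000,000 work-items.
    for local_size in (64, 256):
        global_size, groups = launch_shape(1_000_000, local_size)
        padding = global_size - 1_000_000
        print(f"local={local_size}: global={global_size}, "
              f"groups={groups}, padded work-items={padding}")
```

Whether 256 is actually faster than 64 depends on the kernel, the device and the driver version, so it is worth timing both on your own hardware rather than assuming the result carries over.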
The worse news is that after testing the Catalyst 12.2 preview drivers, performance has dropped again by about 15% for the same app, but this time there's no workaround for me.
I realise some previous compiler optimisations may have been causing incorrect results in some corner cases and therefore had to be removed, but it's hard to swallow the fact that performance has been steadily decreasing ever since Catalyst 11.11 (quite significant performance drops, in fact).
Makes me want to upgrade to GCN earlier, just so I don't have to look back to when times were better... that, or move to the dark side.
Sorry, but mine is a different case.
I see a performance improvement with the 12.1 drivers, both in benchmarks and under real-world load... most of the time. But from time to time performance drops hugely: not by 15-20%, but by a factor of several. It can be restored by restarting the app.
So IMHO it's not just compiler optimizations or something like that. It's some problem in the runtime/driver that leads to such "hangs" from time to time (because even the desktop GUI becomes laggy at those times).
My experience of running both of Raistmer's apps is similar to his. GPU load is sometimes close to 0% with small signs of activity, sometimes 30% to 40%, sometimes 60%. Suspending the task, waiting a good few seconds, and resuming the task often gets the GPU load up to 90% or more.
I've seen this problem on about 20 different revisions of the app, first on Cat 11.5, then on Cat 11.9, 11.12 and 12.1. This is on an i7-2600K with a GTX460 and an HD5770 on Windows 7 x64. These two GPUs were previously in a C2D E8500 Win 7 x64 computer with Cat 11.5; that computer had no problem with ATI GPU load and ran one of the apps for well over 6 months without me noticing any problem. On the i7-2600K/GTX460/HD5770 machine, trying different affinity settings hasn't helped.
I have also been running an Nvidia variant of one of the apps on and off. In spite of their 100% CPU bug (where I have to keep a core idle), the Nvidia app doesn't show any slowdowns or low GPU load.
Edit: For example, the app here had a GPU load of ~60%; suspending & resuming increased GPU load to ~90%.
Edit 2: Here's another example, where GPU load was ~40% before suspending & resuming raised it to ~90%.