First of all, thanks a lot for the link to the correct driver package. Now I have Cat 12.8 on the C-60 netbook to try.
I will play with the new driver a little to estimate the changes in performance and will then redo the profiler runs. If the strange results are still there, I will try the SDK sample and send the binary for your investigation. Will report later.
Unfortunately, I can't say anything good about Catalyst 12.8 so far...
With free CPU cores, overall performance stayed the same as with Catalyst 11.12, but with busy CPU cores performance became even worse than before... It's a long-standing issue with recent Catalyst drivers: when the CPU cores are busy, even with idle-priority processes, GPU usage drops erratically and sharply. App execution times increase considerably, even if the app's CPU priority is raised to above normal.
But on the C-60 with Cat 11.12 I saw a big [b]decrease[/b] in consumed CPU time on a busy CPU. This compensated for the GPU performance drop (because the other app got more CPU time for useful work). With Catalyst 12.8, overall performance on a busy CPU still drops, [b]but[/b] there is no drop in CPU consumption: it remains very high, with almost a whole CPU core occupied (for a GPU app!!!).
To illustrate this, I am posting a performance picture for my app (the X axis is a parameter that increases the kernel domain size inside the app). Look how the Elapsed and CPU times increase after switching to Catalyst 12.8. The same app binary was used for all tests...
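For anyone who wants to reproduce the Elapsed vs. CPU time comparison, here is a minimal sketch in C. The names time_sample, sample_now and cpu_share are hypothetical helpers of mine, not part of any SDK or of the app in question. It assumes a C11 runtime where clock() reports processor time; on the Windows CRT clock() reports wall-clock time instead, so there the CPU part would have to come from something like GetProcessTimes.

```c
#include <time.h>

/* Hypothetical helper type: one snapshot of wall-clock and CPU time, in seconds. */
typedef struct {
    double wall_s; /* elapsed (wall-clock) time */
    double cpu_s;  /* CPU time consumed by this process */
} time_sample;

/* Take a snapshot. Assumes clock() returns processor time (true on POSIX
 * systems, NOT on the Windows CRT, where it tracks wall-clock time). */
time_sample sample_now(void)
{
    struct timespec ts;
    time_sample s;

    timespec_get(&ts, TIME_UTC); /* C11 wall-clock timestamp */
    s.wall_s = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    s.cpu_s = (double)clock() / (double)CLOCKS_PER_SEC;
    return s;
}

/* Fraction of one CPU core the process used between two snapshots.
 * A value near 1.0 means almost a whole core stayed busy. */
double cpu_share(time_sample begin, time_sample end)
{
    double wall = end.wall_s - begin.wall_s;
    double cpu = end.cpu_s - begin.cpu_s;

    return wall > 0.0 ? cpu / wall : 0.0;
}
```

Call sample_now() before and after the GPU part of the app and pass both snapshots to cpu_share(). A result close to 1.0 for a GPU-bound run would correspond to the "almost a whole CPU core occupied" picture described above.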
Well, I tried under Catalyst 12.8, no good...
1. APP Profiler shows the correct flags now, both read and write (see picture).
2. But the incorrect timings are still here. See the picture: the line at the bottom describes the selected map operation.
One can see it's a true zero-copy one (as I expected it to be). But how long does it take? ~10 milliseconds.
Too long for zero copy, right? So I think this part of APP Profiler requires improvement too. The data is unreliable. Together with the unreliable timestamps for kernels executed in different queues, I'm afraid it makes the whole picture too unreliable to base any performance investigation on. Hope this will be vastly improved in the next releases.
For now I will send the binary along with the input data to you via your link and will try the SDK sample.