Well, I tried under Catalyst 12.8, no good...
1. APP Profiler shows the correct flags now, for both read and write (see picture).
2. But the timings are still incorrect. In the picture, the line at the bottom describes the selected map operation.
One can see it is a true zero copy map (as I expected it to be). But how long does it take? ~10 milliseconds.
Too long for zero copy, right? So I think this part of APP Profiler requires improvement too: the data is unreliable. Together with the unreliable time stamps for kernels executed in different queues, I'm afraid it makes the whole picture too unreliable to base any performance investigation on. I hope this will be vastly improved in the next releases.
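One way to cross-check the ~10 ms figure independently of the profiler is to time the map call on the host. Below is a minimal sketch, not what my application actually does. It assumes a POSIX system with `clock_gettime`, and the names `timed_op_fn`, `elapsed_ms`, `time_op_ms` and `example_op` are hypothetical helpers made up for illustration. The body of `example_op` is only a placeholder for where a blocking `clEnqueueMapBuffer` call (`blocking_map` set to `CL_TRUE`) would go.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: wraps the operation to be timed. */
typedef void (*timed_op_fn)(void *user_data);

/* Difference between two timespec values, in milliseconds. */
static double elapsed_ms(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) * 1e3
         + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Run op once and return its host wall-clock duration in milliseconds.
 * The operation must be blocking, otherwise only the enqueue cost is
 * measured and not the operation itself. */
static double time_op_ms(timed_op_fn op, void *user_data)
{
    struct timespec t0, t1;
    assert(op != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    op(user_data);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsed_ms(t0, t1);
}

/* Placeholder operation (assumption): a real test would issue the
 * blocking map of the zero copy buffer here. */
static void example_op(void *user_data)
{
    (void)user_data;
}
```

Calling `time_op_ms(example_op, NULL)` around the real map gives a host-side number to compare with the profiler's timeline. If the host sees far less than 10 ms for the blocking map, that would point at the profiler's time stamps rather than at the map itself.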
For now I will send the binary along with the input data to you via your link, and I will try the SDK sample.