I've been experimenting with OpenCL/OpenGL interop, and for some reason even the SimpleGL sample from the SDK is ridiculously slow (around 60-70 fps).
After some investigation with the APP Profiler, it seems that the kernel itself executes very quickly, in about 0.3 ms, but the ACQUIRE_GL_OBJECTS and RELEASE_GL_OBJECTS tasks (listed in the Data Transfer row) take about 5.6 ms each!
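To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: `frame_budget` is a made-up helper name, and it assumes the acquire, kernel and release steps run back to back with no overlap, ignoring rendering and the buffer swap entirely.

```python
def frame_budget(kernel_ms, acquire_ms, release_ms):
    """Per-frame compute cost, assuming the three steps run strictly in sequence."""
    interop_ms = acquire_ms + release_ms   # time spent only on acquire + release
    total_ms = kernel_ms + interop_ms      # rendering and swap are NOT included
    return {
        "total_ms": total_ms,
        "interop_share": interop_ms / total_ms,  # fraction of the time that is interop
        "max_fps": 1000.0 / total_ms,            # upper bound before any rendering
    }

# Timings as reported by the profiler above.
budget = frame_budget(kernel_ms=0.3, acquire_ms=5.6, release_ms=5.6)
print(f"total: {budget['total_ms']:.1f} ms, "
      f"interop share: {budget['interop_share']:.1%}, "
      f"fps ceiling: {budget['max_fps']:.0f}")
```

Under those assumptions, acquire plus release alone cap the frame rate at roughly 87 fps before anything is drawn, which fits with the 60-70 fps I'm seeing.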
Since the time consumed is proportional to the size of the texture, I have to assume that some kind of copying (possibly between CPU and GPU) takes place. The question is, why? I would have thought that avoiding these costly copies is the whole point of using interop.
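One rough way to sanity-check the copy theory is to turn a texture size and a measured acquire time into an effective throughput and compare it with the bandwidth of the bus in the system. The sketch below is hedged accordingly: `implied_throughput_gb_s` is a hypothetical helper, and the 1024x1024 RGBA8 texture is a placeholder, not the actual size used, so substitute the real dimensions.

```python
def implied_throughput_gb_s(width, height, bytes_per_pixel, transfer_ms):
    """Effective throughput in GB/s if the whole texture were copied once."""
    size_bytes = width * height * bytes_per_pixel
    seconds = transfer_ms / 1000.0
    return size_bytes / seconds / 1e9

# ASSUMPTION: placeholder texture size (1024x1024, 4 bytes per pixel).
# Only the 5.6 ms figure comes from the profiler.
rate = implied_throughput_gb_s(1024, 1024, 4, transfer_ms=5.6)
print(f"{rate:.2f} GB/s")
```

For that placeholder size the result is about 0.75 GB/s. A figure in the range of a host-device transfer, rather than on-card memory bandwidth, would be consistent with a copy crossing the bus, although it would not prove it.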
I'm using VS2010, APP SDK 2.5, Catalyst 11.8, Windows 7 x64 and a single Radeon HD 4850 video card.