Hi boys and girls,
I was reading the GCN ISA manual and came across the DS_SWIZZLE instruction, which can exchange data between threads without touching LDS memory.
However, the instruction is not exposed in the OpenCL language of the AMD APP SDK. So how can I use it?
It's a great feature: essentially AMD's counterpart to the "warp shuffle" feature of NVIDIA's Kepler cards, so I would really like to be able to use it.
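To make clear what kind of exchange I mean, here is a small CPU-side model in plain C. It is not GPU code, and swizzle32 is a hypothetical helper name of my own. It assumes the 32-lane mode works the way I read it in the manual, where each lane i picks its source lane as ((i & and_mask) | or_mask) ^ xor_mask; please correct me if I have misread that.

```c
#include <assert.h>

#define LANES 32

/* Hypothetical CPU-side model of one swizzle step over 32 lanes.
 * Assumption (my reading of the ISA manual): lane i reads the value
 * held by lane ((i & and_mask) | or_mask) ^ xor_mask.
 * `in` and `out` must be different arrays of LANES elements. */
static void swizzle32(const int *in, int *out,
                      unsigned and_mask, unsigned or_mask, unsigned xor_mask)
{
    /* Each mask is treated as a 5-bit field, one bit per lane-index bit. */
    assert(and_mask < LANES && or_mask < LANES && xor_mask < LANES);

    for (unsigned i = 0; i < LANES; ++i) {
        unsigned src = ((i & and_mask) | or_mask) ^ xor_mask;
        out[i] = in[src];   /* no shared/LDS buffer involved in the model */
    }
}
```

In this model, and_mask = 0x1F, or_mask = 0, xor_mask = 1 swaps the values of neighbouring lanes, and and_mask = 0, or_mask = 0, xor_mask = 0 broadcasts lane 0 to every lane. That is the behaviour I would like to get from an OpenCL kernel.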