sp314
Adept II

What's the best way to broadcast a uint32 from one thread in a wave to all other threads on an R9 290X?

Assume the local work size = 64, and I want to broadcast a uint32 from the thread with local id = 0 to all threads within the workgroup. How should I do it?


Those DS_BPERMUTE_B32 instructions on Vega etc. are certainly nice, but I don't see them mentioned in the Hawaii ISA doc. I've tried a simple test using work_group_broadcast() with OpenCL 2, and it generated about 150 lines of assembly with all sorts of ifs and branches, so I suspect there could be a better way.

What would you recommend that I do?

Thank you!

References -

http://developer.amd.com/wordpress/media/2013/12/Vega_Shader_ISA_28July2017.pdf

https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/work_group_broadcast.html

dipak
Big Boss

Hi,

Actually, DS-Permute instructions such as DS_BPERMUTE_B32 were introduced in GCN3, so they are not available on Hawaii (Sea Islands). I think the DS_SWIZZLE_B32 instruction could be used for this purpose, though. Here is a good article that describes its usage: https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/
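
To make the lane mapping concrete, here is a rough Python model of how I understand the bit-mask mode of DS_SWIZZLE_B32 from that article. It is only a sketch, not vendor code: the function names (ds_swizzle_bitmode, encode_offset) are made up for illustration, it assumes all 64 lanes are active, and it assumes the offset field packs and_mask in bits 4:0, or_mask in bits 9:5 and xor_mask in bits 14:10. Please check the encoding against the Sea Islands ISA document before relying on it.

```python
WAVE_SIZE = 64  # lanes per wavefront on GCN


def encode_offset(and_mask, or_mask, xor_mask):
    """Pack three 5-bit masks into the assumed bit-mask-mode offset layout."""
    return (xor_mask << 10) | (or_mask << 5) | and_mask


def ds_swizzle_bitmode(values, and_mask, or_mask, xor_mask):
    """Hypothetical model of a bit-mask swizzle over one wave, all lanes active."""
    assert len(values) == WAVE_SIZE
    out = []
    for lane in range(WAVE_SIZE):
        # The source lane is computed from the low 5 bits of the lane id.
        src = (((lane & 0x1F) & and_mask) | or_mask) ^ xor_mask
        # Bit 5 is carried over, so each lane reads from its own 32-lane half.
        src |= lane & 0x20
        out.append(values[src])
    return out


# Each lane starts with a distinct value so the result is easy to inspect.
values = [100 + lane for lane in range(WAVE_SIZE)]

# All masks zero: every lane reads the first lane of its 32-lane half.
result = ds_swizzle_bitmode(values, and_mask=0, or_mask=0, xor_mask=0)
```

In this model, lanes 0 to 31 all receive lane 0's value, but lanes 32 to 63 receive lane 32's value. If the model matches the hardware, a single bit-mask swizzle covers only half of a 64-wide workgroup, so the upper half would need an extra step.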

Hi dipak,

That's a very nice article! I believe DS_SWIZZLE_B32 will work.

Thank you very much!

sp
