The Vega Shader ISA doc (https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf) describes the S_WAKEUP instruction as follows (I quote):
"Allow a wave to 'ping' all the other waves in its threadgroup to force them to wake up immediately from an S_SLEEP instruction. The ping is ignored if the waves are not sleeping. This allows for efficient polling on a memory location. The waves which are polling can sit in a long S_SLEEP between memory reads, but the wave which writes the value can tell them all to wake up early now that the data is available. This is useful for fBarrier implementations (speedup). This method is also safe from races because if any wave misses the ping, everything still works fine (waves which missed it just completes their normal S_SLEEP)."
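To make sure I am reading the quote correctly, here is how I picture the polling pattern. This is only a CPU-side analogy in Python, not GCN code, and it rests on my own assumptions: a timed wait on a `threading.Condition` stands in for S_SLEEP, `notify_all()` stands in for S_WAKEUP, and the names `run_demo`, `ready`, and `ping` are made up for illustration.

```python
import threading
import time


def run_demo(num_waiters=4, sleep_s=0.2, use_wakeup=True):
    """CPU-side analogy only; none of these names come from the ISA doc."""
    ready = [False]                 # stands in for the polled memory location
    ping = threading.Condition()    # notify_all() stands in for S_WAKEUP
    finished = []                   # ids of waiters that saw the data

    def waiter(i):
        # Poll the flag, then "S_SLEEP" until the timeout expires or a ping arrives.
        # The check and the sleep are deliberately not atomic, so a ping that
        # lands between them is lost and the waiter sleeps for the full timeout.
        while not ready[0]:
            with ping:
                ping.wait(timeout=sleep_s)
        finished.append(i)

    threads = [threading.Thread(target=waiter, args=(i,)) for i in range(num_waiters)]
    start = time.monotonic()
    for t in threads:
        t.start()
    time.sleep(0.05)                # give the waiters time to start sleeping
    ready[0] = True                 # the writer publishes the data first ...
    if use_wakeup:
        with ping:
            ping.notify_all()       # ... then pings; ignored by anyone not waiting
    for t in threads:
        t.join()
    return time.monotonic() - start, len(finished)


if __name__ == "__main__":
    for use_wakeup in (True, False):
        elapsed, done = run_demo(use_wakeup=use_wakeup)
        print(f"use_wakeup={use_wakeup}: {done} waiters done in {elapsed:.3f}s")
```

In this analogy I would expect the waiters to finish sooner when the ping is sent, and every waiter still sees the data when it is not sent or is lost, because the flag is re-read after each sleep. Whether the hardware behaves the same way is exactly what I am unsure about.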
I understand the polling part and why the instruction is needed, but I have a few questions:
1. Is a threadgroup the same thing as an OpenCL workgroup?
2. The quote says that "if any wave misses the ping, everything still works fine (waves which missed it just completes their normal S_SLEEP)". Why would a wave miss the ping, and how often does that happen?
3. Are there any examples of using S_SLEEP/S_WAKEUP to implement a barrier or an fBarrier?
Thank you in advance. I found the answers to many of my questions on this forum, so I appreciate AMD setting it up, and I appreciate your help.