

Journeyman III

Question Regarding Flow Control

Regarding flow control, AMD's SDK documentation says the following:

"For example, if a work-item contains a branch with two paths, the wavefront first executes one path, then the second path. The total time to execute the branch is the sum of each path time. An important point is that even if only one work-item in a wavefront diverges, the rest of the work-items in the wavefront execute the branch. "

What is meant by EXECUTE here? Does it mean just checking the branch condition, or does it mean actually executing all the instructions within the branch?

Suppose, hypothetically, I am running an application with 64 threads through the following code:

if(tid < 64/2)
    printf("if");
else
    printf("else");

So in this case, do threads 32 to 63 execute the if part too, printing "if" first and then "else"? Of course that does not happen, so could someone please explain the execution sequence here?

with regards,


8 Replies

Well, "execute" means that all units execute all instructions in the branch. They are just masked off, so they don't change registers or read from or write to memory.


Originally posted by: nou Well, "execute" means that all units execute all instructions in the branch. They are just masked off, so they don't change registers or read from or write to memory.


Thanks for your reply, nou. Why is it necessary for the other threads not to change the memory state and registers? I mean, in what cases might a conflict occur?




I don't know exactly how it is implemented in HW.

But take this for example:

if(cond) a++;
else a--;

All work-items execute both the increment and the decrement instructions, but you need some mechanism to block one of these paths for each work-item to get the correct result.

How exactly it works is not important, though. What is important is that divergent paths are serialized.


You can't have the "threads" (which are NOT threads; this is the entire reason divergence is an issue!) changing the memory state, because if some of them updated data when they weren't meant to, the state of the computation would be wrong.

Think of it as transforming:

for each of the 64 lanes: if( a ) { read; compute; write; } else { read; compute; write; }

into:

read64; compute64; write64( data & mask ); read64; compute64; write64( data & ~mask );

It's a mask on each write instruction. 


Related to this, can somebody explain:

1. Why are ?: and select() better than if?

2. Is there a difference between ?: and select() for scalar values?



I don't know if the compiler will generate select for a ?:. It's certainly more likely to do it for that than for an if.

The usual reason to prefer select is that the compiler then knows it can predicate instead of generating control flow, partly because the semantics of select are that both operands must be evaluated before being passed to the function. With a ?: or an if block you can short-circuit, and the side effects may matter, so the compiler has to analyse whether it can simply predicate. In simple cases it shouldn't matter, since the compiler should be able to make them equivalent, but it's always nice to give compilers hints.

I think the spec defines ?: in terms of select, doesn't it? I remember something in there, because select has quirky semantics (the definition of truth is different for vector elements and scalars).


Thanks for your answer. The spec simply says that "result = c ? b : a" for scalars. This is characteristically vague and succinct (for the OpenCL spec) :). I took this to mean they are the same, but it could also simply mean that they produce the same result (but are implemented differently).

I am actually trying to figure out how "predicating" works, exactly, so that I can understand how it provides better performance than branching.


Well in most circumstances they are the same. I wouldn't guarantee it if you have side effects in there, though.

result = c ? b : {*bob = 7; a}; 

for example. If that's valid... 🙂 I haven't checked, it's just something to think about. My thinking is that select would guarantee that bob was 7, whereas ?: might not execute that code if c were true.

Other than that kind of possible oddity, I would say they're the same.