
cocular
Journeyman III

Cannot do the math: what is the number of cycles needed in a branch?

In the APP guide, there is a code snippet:


if (A>B) {
C += D;
} else {
C -= D;
}



It says:

In the first block of code, this translates into an IF/ELSE/ENDIF sequence of conditional code, each taking ~8 cycles. If divergent, this code executes in ~36 clocks; otherwise, in ~28 clocks. A branch not taken costs four cycles (one instruction slot); a branch taken adds four slots of latency to fetch instructions from the instruction cache, for a total of 16 clocks. Since the execution mask is saved, then modified, then restored for the branch, ~12 clocks are added when divergent, ~8 clocks when not.


Anyway, I cannot get 36 or 28. How many cycles does each line take in both cases?

0 Likes
6 Replies
roger512
Adept II

Hi,

I'm curious, how did you get the clock cycle count?

ty

0 Likes

AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.

Page 125.

0 Likes

I have asked some people about it. It is interesting to me too.

0 Likes

Any jump costs a minimum of 4 quad cycles (16 clocks) to fetch instructions from the instruction cache. With the if-else case there is a jump on either side of the branch, so you’re guaranteed a 32-clock penalty if the code is divergent. Since there’s an add on each side, that’s another 8 clocks. So 40 clocks, not counting the instructions to set up the conditional. If non-divergent, you get just 1 jump and 1 add, so you would have 20 clocks.

Courtesy: Jeff Golds
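
For anyone who wants to check the arithmetic, here is a minimal sketch in C that just adds up the figures from the explanation above. It assumes one quad cycle is 4 clocks (which is what makes two jumps come to 32 clocks) and that each add or subtract takes one quad cycle. The names branch_clocks and print_branch_costs are made up for illustration; this is a paper calculation, not a measurement or a model of any specific GPU.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed figures, taken from the explanation above (not measured). */
enum {
    CLOCKS_PER_QUAD_CYCLE = 4, /* assumption: one quad cycle = 4 clocks */
    JUMP_QUAD_CYCLES      = 4, /* minimum cost of any jump */
    ALU_QUAD_CYCLES       = 1  /* one add (C += D) or subtract (C -= D) */
};

/*
 * Clocks spent in the if/else body, not counting the instructions
 * that set up the conditional.
 * sides_executed: 2 if the wavefront diverges (both sides run),
 *                 1 if all work-items take the same side.
 */
int branch_clocks(int sides_executed)
{
    assert(sides_executed == 1 || sides_executed == 2);
    /* Each executed side pays for one jump plus one ALU instruction. */
    int per_side = (JUMP_QUAD_CYCLES + ALU_QUAD_CYCLES) * CLOCKS_PER_QUAD_CYCLE;
    return sides_executed * per_side;
}

/* Hypothetical helper that prints both cases. */
void print_branch_costs(void)
{
    printf("divergent:     %d clocks\n", branch_clocks(2));
    printf("non-divergent: %d clocks\n", branch_clocks(1));
}
```

Calling print_branch_costs() from main gives 40 clocks for the divergent case and 20 for the non-divergent one, matching the numbers in the reply.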

0 Likes

I think that's very clear. But the APP guide says 36 or 28. Is that wrong?

0 Likes

Unfortunately, yes. We will fix it there soon.

0 Likes