roger512 Jun 19, 2013 11:00 AM (in response to cocular)
Hi,
I'm curious: how did you get the clock cycle count?
Thanks.

cocular Jun 19, 2013 9:42 PM (in response to roger512)
AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide, page 125.

himanshu.gautam Jun 21, 2013 12:19 AM (in response to cocular)
I have asked some people about it. It is interesting to me too.

himanshu.gautam Jun 24, 2013 12:50 AM (in response to cocular)
Any jump costs a minimum of 4 quad cycles (16 clocks) to fetch instructions from the instruction cache. In the if/else case there is a jump on each side of the branch, so you're guaranteed a 32-clock penalty if the code is divergent. Since there's an add on each side, that's another 8 clocks. So that's 40 clocks, not counting the instructions that set up the conditional. If the code is non-divergent, you get just 1 jump and 1 add, so you would have 20 clocks.
Courtesy: Jeff Golds
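
The arithmetic above can be checked with a small sketch. This is only an illustrative cost model, not anything from the AMD tools: the function and constant names are hypothetical, and the per-add cost of 4 clocks and the 4 clocks per quad cycle are inferred from the totals quoted in the post (8 clocks for two adds, 32 clocks for two jumps).

```python
# Illustrative cost model only; all names here are hypothetical.
CLOCKS_PER_QUAD_CYCLE = 4  # assumed: two jumps of 4 quad cycles each total 32 clocks
JUMP_QUAD_CYCLES = 4       # minimum instruction-fetch cost of a jump, per the post
ADD_CLOCKS = 4             # assumed: the post counts 8 clocks for two adds


def estimate_clocks(divergent: bool) -> int:
    """Estimate clocks for an if/else with one add per side.

    The instructions that set up the conditional are not counted.
    """
    # A divergent wavefront executes both sides; otherwise only one side runs.
    sides_executed = 2 if divergent else 1
    jump_clocks = JUMP_QUAD_CYCLES * CLOCKS_PER_QUAD_CYCLE
    return sides_executed * (jump_clocks + ADD_CLOCKS)


if __name__ == "__main__":
    print("divergent:", estimate_clocks(True))
    print("non-divergent:", estimate_clocks(False))
```

Under these assumptions the model gives the same totals as the post: 40 clocks when divergent and 20 when not.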

cocular Jun 24, 2013 6:43 AM (in response to himanshu.gautam)
I think that's very clear. But the APP guide says 36 or 28. Is that wrong?

himanshu.gautam Jun 24, 2013 10:58 PM (in response to cocular)
Unfortunately, yes. We will fix it in the guide soon.



