Branching penalty

Discussion created by BarsMonster on Jan 5, 2009
Latest reply on Jan 13, 2009 by udeepta@amd

Has anyone noticed a huge branching penalty?

For a kernel about 500 VLIW instructions long, adding a single branch (taken with probability ~1/400,000,000, so nearly all threads follow the same path almost all the time) reduces speed by around 10-20%. Has anyone else noticed this?

That is quite unexpected after CUDA, where branches are almost free as long as all threads follow the same code path.