2 Replies Latest reply on Jan 13, 2009 6:47 AM by udeepta@amd

    Branching penalty

    BarsMonster

      Has anyone noticed a huge branching penalty?

      For a kernel about 500 VLIW instructions long, adding a single branch reduces speed by around 10-20%. The branch is taken with a probability of about 1 in 400,000,000, so nearly all threads follow the same path almost all of the time. Has anyone else noticed this?

      That is quite surprising coming from CUDA, where branches are almost free as long as all threads follow the same code path.
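
      To put a number on "everyone goes on the same branch", here is a rough back-of-the-envelope sketch. It assumes each thread takes the rare branch independently with probability 1/400,000,000, and that threads execute in lockstep groups of 32 or 64 (these group sizes are my assumption, not something measured here). The function name is just for illustration.

```python
import math

def divergence_probability(p_taken: float, group_size: int) -> float:
    """Probability that at least one thread in a lockstep group takes the rare branch.

    Assumes threads take the branch independently of each other.
    """
    # 1 - (1 - p)^n, computed via log1p/expm1 so it stays accurate for tiny p
    return -math.expm1(group_size * math.log1p(-p_taken))

P_TAKEN = 1 / 400_000_000  # branch probability quoted in the post

if __name__ == "__main__":
    # Group sizes are assumptions for illustration only
    for group_size in (32, 64):
        prob = divergence_probability(P_TAKEN, group_size)
        print(f"group of {group_size}: divergence probability ~ {prob:.3e}")
```

      Even for a group of 64 the result is on the order of 1e-7, so under these assumptions actual divergence is far too rare to explain a 10-20% slowdown. The cost would have to come from the branch instruction itself being present.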