sh2
Adept II

Ultra-Threaded Dispatch processor performance

There is a lot of information about ALU and TEX performance, but I can't find any information about the dispatch processor.

So my question is: how many control flow instructions can be executed per cycle?

0 Likes
3 Replies

A rough estimate is about 40 cycles per control flow instruction.
0 Likes

I think that 40 cycles is the CF instruction latency, not the throughput. I asked about throughput. In other words, how high can the CF/ALU ratio get before we become CF bound?

0 Likes

sh,

A control flow instruction just changes the clause of instructions being executed. It takes about 40 cycles before the new ALU clause can start. So you should be able to hide this latency by having a high ALU/CF ratio or by having a large number of wavefronts in a compute unit.

But when the CF instruction diverges, the CF latency is much higher and thus very difficult to hide.
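
To put rough numbers on the non-divergent case, here is a back-of-the-envelope sketch in Python. It is a toy model, not anything from AMD documentation: it assumes each wavefront alternates one CF instruction (about 40 cycles before the next clause can start, the figure quoted above) with one ALU clause, that only one ALU clause executes at a time, and it ignores TEX clauses and divergence entirely. The function names and the example clause lengths are made up for illustration.

```python
def min_wavefronts_to_hide(cf_latency_cycles, alu_clause_cycles):
    """Wavefronts needed so the ALU never idles (toy model, see assumptions).

    While one wavefront waits cf_latency_cycles for its next clause, the
    other (W - 1) wavefronts must supply at least that many ALU cycles:
        (W - 1) * alu_clause_cycles >= cf_latency_cycles
    """
    if cf_latency_cycles < 0 or alu_clause_cycles <= 0:
        raise ValueError("latency must be >= 0 and clause length must be > 0")
    # Integer ceiling division, avoids floating point.
    others = -(-cf_latency_cycles // alu_clause_cycles)
    return 1 + others


def alu_utilization(wavefronts, cf_latency_cycles, alu_clause_cycles):
    """Fraction of cycles the ALU is busy under the same toy model."""
    if wavefronts <= 0 or cf_latency_cycles < 0 or alu_clause_cycles <= 0:
        raise ValueError("invalid parameters")
    # One wavefront needs (clause + latency) cycles per iteration and keeps
    # the ALU busy for 'clause' of them; W wavefronts overlap their waits.
    period = alu_clause_cycles + cf_latency_cycles
    return min(1.0, wavefronts * alu_clause_cycles / period)


if __name__ == "__main__":
    CF_LATENCY = 40  # rough figure from this thread, not a measured value
    for clause in (10, 16, 40):  # hypothetical ALU clause lengths in cycles
        w = min_wavefronts_to_hide(CF_LATENCY, clause)
        print(clause, w, alu_utilization(w, CF_LATENCY, clause))
```

Under these assumptions, short ALU clauses need several resident wavefronts to cover the CF latency, while a clause about as long as the latency needs only two. A divergent CF instruction would break the model, since its latency is much higher.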

0 Likes