The user guide says that "in a thread processor, up to 4 threads can issue 4 VLIW instructions over 4 cycles. ... For example, the 16 thread processors execute the same instructions, with each thread processor processing 4 threads at a time, this appears as a 64-wide SIMD engine". Why can each thread processor process 4 threads at a time? What does "at a time" mean here? Obviously it cannot mean within a single clock cycle.