VLIW on Cypress and vector addition

Question asked by cadorino on Jul 2, 2012
Latest reply on Jul 5, 2012 by cadorino

Hi everybody.
I'm thinking about VLIW utilization on a Radeon HD 5870 (Cypress).

Suppose you have the following kernel:


__kernel void saxpy(const __global float * x, __global float * y, const float a)
{
    // Each work item updates exactly one element of y.
    uint guid = get_global_id(0);

    y[guid] = a * x[guid] + y[guid];
}


Each work item operates on a single element of the vector, and the kernel is not explicitly vectorized (no float4).
Is the compiler still able to pack instructions so that the 4 ALUs of each processing element are used?
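
For comparison, this is roughly what I mean by the explicitly vectorized variant. It is only a sketch: the name saxpy4 is made up, and it assumes that the vector length is a multiple of 4 and that the kernel is launched with one work item per float4 (a quarter of the original global size). The four multiply-adds are written out per component to show that they do not depend on each other. The #ifndef part at the top is just a host-side shim of my own (current_gid and the stand-in get_global_id are hypothetical helpers) so that the snippet can also be compile-checked as plain C; an OpenCL compiler skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-only shim (an assumption, not part of the kernel): lets the snippet
   compile as plain C. An OpenCL compiler defines __OPENCL_VERSION__ and
   therefore skips this part. */
#include <stddef.h>
#define __kernel
#define __global
typedef struct { float x, y, z, w; } float4;
static size_t current_gid;  /* hypothetical: id of the emulated work item */
static size_t get_global_id(unsigned int dim) { (void)dim; return current_gid; }
#endif

/* Hypothetical float4 variant of saxpy: each work item updates four elements. */
__kernel void saxpy4(const __global float4 *x, __global float4 *y, const float a)
{
    size_t gid = get_global_id(0);
    float4 xv = x[gid];
    float4 yv = y[gid];

    /* Four independent multiply-adds, one per component. */
    yv.x = a * xv.x + yv.x;
    yv.y = a * xv.y + yv.y;
    yv.z = a * xv.z + yv.z;
    yv.w = a * xv.w + yv.w;

    y[gid] = yv;
}
```

Launched over a quarter of the work items, this should compute the same result as the scalar kernel above; my question is whether the scalar version gets packed in a comparable way.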

Is there any tool to determine the way instructions are packed into VLIW?


Thank you very much!