0 Replies Latest reply on Jul 18, 2009 9:11 PM by Raistmer

    Memory access performance for 2D streams

      I need to horizontally add the components of two float4 elements to form one element of the output stream.
      Does it make any difference in performance which index is used to access sequential elements?

      That is, will variants 1) and 2) differ in performance?

      1) o.xy = inp[tID][i].xz   + inp[tID][i].yw;
         o.zw = inp[tID][i+1].xz + inp[tID][i+1].yw;

      2) o.xy = inp[i][tID].xz   + inp[i][tID].yw;
         o.zw = inp[i+1][tID].xz + inp[i+1][tID].yw;

      with

         kernel void k(float4 inp[][], out float4 o<>);
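
      To make the arithmetic unambiguous, here is a minimal CPU-side sketch in plain C of what both variants compute for one output element. It is only a reference for the swizzle math, not Brook+ code, and it says nothing about which index order is faster. The float4 struct and the name hadd_pair are hypothetical, introduced only for this illustration; a and b stand for the two sequential input elements (index i and i+1).

```c
/* Hypothetical CPU stand-in for the Brook+ float4 type (illustration only). */
typedef struct { float x, y, z, w; } float4;

/*
 * Reference for the per-element math used in both variants:
 *   o.xy = a.xz + a.yw;   ->  o.x = a.x + a.y,  o.y = a.z + a.w
 *   o.zw = b.xz + b.yw;   ->  o.z = b.x + b.y,  o.w = b.z + b.w
 * a and b are the two sequential input elements.
 */
static float4 hadd_pair(float4 a, float4 b)
{
    float4 o;
    o.x = a.x + a.y;  /* first pair of a  */
    o.y = a.z + a.w;  /* second pair of a */
    o.z = b.x + b.y;  /* first pair of b  */
    o.w = b.z + b.w;  /* second pair of b */
    return o;
}
```

      For example, a = (1, 2, 3, 4) and b = (5, 6, 7, 8) give o = (3, 7, 11, 15). The two variants differ only in which dimension of inp the sequential index i walks along, so the result is the same and only the memory access pattern changes.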