Access arbitrary element of a vector

Question asked by alariq on Nov 22, 2012
Latest reply on Nov 22, 2012 by alariq

Hello, all. I have a kernel that holds a uchar16, and I need to fetch a specific element of this vector based on the local id.

Suppose the work-group size is 64. Then I have the following code:

uchar16 data = // read data from a buffer  

// .... some code goes here

// .... some code goes here

// .... some code goes here


int i = get_local_id(0)%16; // 0 <= i <= 15

uchar16 new_data = ((uchar*)&data)[i];
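For readers who want to check the intended semantics off-device, the line above can be modeled in plain C. This is only a host-side sketch: `uchar16_model` and `splat_lane` are hypothetical names, not OpenCL types or built-ins, and the code says nothing about how a GPU compiler allocates registers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for OpenCL's uchar16. */
typedef struct { uint8_t s[16]; } uchar16_model;

/* Read lane i through a byte pointer (a runtime index), then
   broadcast that byte to all 16 lanes of the result. */
static uchar16_model splat_lane(uchar16_model data, int i)
{
    uchar16_model out;
    uint8_t v = ((const uint8_t *)&data)[i & 15]; /* keep index in [0,15] */
    memset(out.s, v, sizeof out.s);
    return out;
}
```

The runtime index `i` in the pointer dereference is the part that prevents the compiler from treating the vector as a set of independently named components.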


This of course works, but suddenly my kernel uses 132 scratch registers. I can explain this amount: the compiler just puts "data" in global memory, because I have a dependent read from it, "i" is not a compile-time constant, and I also use data in many other places.


So I tried two other ways:


1) Declare an additional array and copy the data into it:

uchar t[16];  t[0] = data.s0; ..... t[15] = data.sf; // components 10..15 are named sa..sf

then do: new_data = t[i];


But this is even worse: in addition to the 132 scratch registers, my kernel uses more VGPRs.

So I tried it this way:

2) Use shuffle to get the component and then splat it:

uchar2 tt = shuffle(data, (uchar2)(i,0)); // I use uchar2 because the function does not accept scalars

new_data = (uchar16)(tt.x); 
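As a reference for what this variant computes, here is a plain C model of a 16-lane shuffle with a 2-lane mask. It is a sketch under the assumption that only the 4 least significant bits of each mask element are used for a 16-element input; `shuffle16_to_2` is a hypothetical helper name, not an OpenCL function.

```c
#include <stdint.h>

/* Model of shuffle(uchar16 x, uchar2 mask): each output lane picks
   the input lane named by the low 4 bits of the matching mask lane. */
static void shuffle16_to_2(const uint8_t x[16], const uint8_t mask[2],
                           uint8_t out[2])
{
    for (int k = 0; k < 2; ++k)
        out[k] = x[mask[k] & 0x0F];
}
```

With a mask of (i, 0), the first output lane is element i of the input and the second is element 0, so only `out[0]` matters for the splat.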

Still the same bad results (even more registers, plus the 132 scratch registers).

However, if I use a compile-time constant in case 2), everything goes fine. I assume the compiler just throws away the function call.


Is there any way to do what I want efficiently with a dynamic "i", without a big overhead? I can guarantee that it will be in the [0,15] range :-)
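For comparison, one variant that touches each lane only through a constant index is a compare-and-select chain. The plain C sketch below models it on the host; `extract_lane_select` is a hypothetical name, and whether an equivalent kernel avoids scratch registers on a given compiler is an assumption this sketch does not verify.

```c
#include <stdint.h>

/* Select lane i without a runtime-indexed load: every lane is read
   with a loop index that becomes a constant once the loop is unrolled,
   and a mask keeps only the lane whose position equals i. */
static uint8_t extract_lane_select(const uint8_t lanes[16], int i)
{
    uint8_t result = 0;
    for (int k = 0; k < 16; ++k) {
        uint8_t mask = (uint8_t)-(k == i); /* 0xFF when k == i, else 0x00 */
        result |= (uint8_t)(lanes[k] & mask);
    }
    return result;
}
```

The trade-off is 16 compares and 16 AND/OR operations per extraction in exchange for having no data-dependent address.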










