maximum local variable usage

Question asked by lantis on Oct 7, 2013
Latest reply on Oct 7, 2013 by lantis

I'm trying to make maximum use of local variables (__local memory) as temporary storage for a computation, but judging by the results I'm getting, I seem to be hitting bank conflicts (at least, that's what I think my problem is).


What I'm trying to do is use a uint4 array of four elements for the computation, something like:


uint4 Y[4] = {some initialized values here};
uint gid = get_local_id(0);   // local id within the work-group
uint offset = gid*4;          // four uint4 slots per work-item

__local uint4*x;              // assumed to point at a __local buffer (allocation not shown)

for (uint i=3; --i; ){
    x[offset] = Y[0] + Y[1] + Y[2] + Y[3];
    x[offset+1] = x[offset] + x[offset];
    x[offset+2] = x[offset+1] * x[offset];
    x[offset+3] = x[offset+1] * x[offset+2];

    // copy the results back for the next iteration
    Y[0] = x[offset];
    Y[1] = x[offset+1];
    Y[2] = x[offset+2];
    Y[3] = x[offset+3];
}




First: is my offset correct? Multiplying the local id by four should give each work-item the four-element space it needs. Or should it be get_group_id(0) * get_local_id(0) * 4?
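
For reference, here is a minimal host-side C sketch of the indexing in question. It is not kernel code. It only models the arithmetic, on the assumption that x points at a __local buffer holding four uint4 slots per work-item. Because every work-group gets its own instance of __local memory, the work-group id (get_group_id(0) in OpenCL C) should not need to appear in the index. The names slot_base, slots_needed and SLOTS_PER_ITEM are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS_PER_ITEM 4  /* four uint4 slots per work-item, as in the post */

/* Index of the first slot owned by a work-item inside its work-group's
   __local buffer. Only the local id is used, because each work-group
   has its own copy of __local memory. */
size_t slot_base(size_t local_id)
{
    return local_id * SLOTS_PER_ITEM;
}

/* Number of uint4 slots the __local buffer must hold for one work-group. */
size_t slots_needed(size_t local_size)
{
    assert(local_size > 0);
    return local_size * SLOTS_PER_ITEM;
}
```

With a work-group size of 64, work-item 63 would own slots 252 to 255, and the buffer would need 256 slots, so consecutive work-items never overlap.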

Second: a uint4 is four unsigned ints, so 4 bytes * 4 = 16 bytes? I need a work size of at least 64. What work-group size do I need to make full use of the 32 KB of local memory while avoiding bank conflicts and out-of-bounds accesses?
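
The size arithmetic can be sketched in plain C as follows. This is a rough sketch that assumes 16 bytes per uint4 and four slots per work-item, as in the code above, and takes the 32 KB figure from this post. The real limits would have to be queried with clGetDeviceInfo (CL_DEVICE_LOCAL_MEM_SIZE and CL_DEVICE_MAX_WORK_GROUP_SIZE), and the device may cap the work-group size below what the memory budget alone allows. The helper names are hypothetical. Bank conflicts are not modelled here, since they depend on the hardware's bank layout.

```c
#include <assert.h>
#include <stddef.h>

enum {
    UINT4_BYTES    = 16,                          /* 4 components * 4 bytes */
    SLOTS_PER_ITEM = 4,                           /* uint4 slots per work-item */
    BYTES_PER_ITEM = UINT4_BYTES * SLOTS_PER_ITEM /* 64 bytes per work-item */
};

/* __local bytes one work-group needs for the given work-group size. */
size_t local_bytes_needed(size_t local_size)
{
    assert(local_size > 0);
    return local_size * BYTES_PER_ITEM;
}

/* Largest work-group size that fits in the given __local budget,
   ignoring any device limit on the work-group size itself. */
size_t max_items_for_budget(size_t local_mem_bytes)
{
    return local_mem_bytes / BYTES_PER_ITEM;
}
```

Under these assumptions a work-group of 64 needs 4096 bytes, and a 32 KB budget fits at most 512 work-items per work-group.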