Originally posted by: drstrip
The Brook code (mersenne_twister.br) contains the following set of lines, repeated for each output stream A1 ... A8:
A1.x = a.x ^ e.x ^ ((b.x >> thirteen) & mask11) ^ f.x ^ (r2.x << fifteen);
A1.y = a.y ^ e.y ^ ((b.y >> thirteen) & mask12) ^ f.y ^ (r2.y << fifteen);
A1.z = a.z ^ e.z ^ ((b.z >> thirteen) & mask13) ^ f.z ^ (r2.z << fifteen);
A1.w = a.w ^ e.w ^ ((b.w >> thirteen) & mask14) ^ f.w ^ (r2.w << fifteen);
where mask11, mask12, mask13, and mask14 are unsigned ints. Why can't this be written as
A1 = a ^ e ^ ((b >> thirteen) & mask) ^ f ^ (r2 << fifteen);
where thirteen is now uint4(13U, 13U, 13U, 13U),
mask = uint4(mask11, mask12, mask13, mask14) and
fifteen is now uint4(15U, 15U, 15U, 15U).
Am I missing an efficiency issue, or even worse, are they not equivalent?
Thank you for pointing this out. This sample was written before the int vector types were fully supported. Your vector version uses them efficiently. Let us know how much improvement you see after this change.
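For anyone who wants to sanity-check the equivalence on the host before touching the kernel, here is a minimal C++ sketch. It is not Brook code: `UInt4`, `update_componentwise`, `update_vectorized`, and `self_check` are hypothetical names made up for illustration, and the sketch assumes that the vector operators act independently on each component, which is the behavior the proposed rewrite relies on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 4-wide unsigned int vector; NOT the Brook uint4 type.
struct UInt4 {
    std::uint32_t x, y, z, w;
};

// Assumption: every operator works on each component independently.
inline UInt4 operator^(UInt4 a, UInt4 b) { return {a.x ^ b.x, a.y ^ b.y, a.z ^ b.z, a.w ^ b.w}; }
inline UInt4 operator&(UInt4 a, UInt4 b) { return {a.x & b.x, a.y & b.y, a.z & b.z, a.w & b.w}; }
inline UInt4 operator>>(UInt4 a, UInt4 s) { return {a.x >> s.x, a.y >> s.y, a.z >> s.z, a.w >> s.w}; }
inline UInt4 operator<<(UInt4 a, UInt4 s) { return {a.x << s.x, a.y << s.y, a.z << s.z, a.w << s.w}; }
inline bool operator==(UInt4 a, UInt4 b) {
    return a.x == b.x && a.y == b.y && a.z == b.z && a.w == b.w;
}

// Per-component form, mirroring the four lines quoted from the sample.
// mask.x .. mask.w play the role of mask11 .. mask14.
UInt4 update_componentwise(UInt4 a, UInt4 b, UInt4 e, UInt4 f, UInt4 r2, UInt4 mask) {
    UInt4 out;
    out.x = a.x ^ e.x ^ ((b.x >> 13U) & mask.x) ^ f.x ^ (r2.x << 15U);
    out.y = a.y ^ e.y ^ ((b.y >> 13U) & mask.y) ^ f.y ^ (r2.y << 15U);
    out.z = a.z ^ e.z ^ ((b.z >> 13U) & mask.z) ^ f.z ^ (r2.z << 15U);
    out.w = a.w ^ e.w ^ ((b.w >> 13U) & mask.w) ^ f.w ^ (r2.w << 15U);
    return out;
}

// Single-expression vector form, as proposed in the question.
UInt4 update_vectorized(UInt4 a, UInt4 b, UInt4 e, UInt4 f, UInt4 r2, UInt4 mask) {
    const UInt4 thirteen = {13U, 13U, 13U, 13U};
    const UInt4 fifteen = {15U, 15U, 15U, 15U};
    return a ^ e ^ ((b >> thirteen) & mask) ^ f ^ (r2 << fifteen);
}

// Compares both forms on arbitrary sample inputs (the values carry no meaning).
void self_check() {
    const UInt4 a = {0x12345678U, 0x9ABCDEF0U, 0x0F0F0F0FU, 0xFFFFFFFFU};
    const UInt4 b = {0xDEADBEEFU, 0xCAFEBABEU, 0x00000001U, 0x80000000U};
    const UInt4 e = {0x11111111U, 0x22222222U, 0x33333333U, 0x44444444U};
    const UInt4 f = {0x55555555U, 0x66666666U, 0x77777777U, 0x88888888U};
    const UInt4 r2 = {0x01020304U, 0x05060708U, 0x090A0B0CU, 0x0D0E0F10U};
    const UInt4 mask = {0x000000FFU, 0x0000FF00U, 0x00FF0000U, 0xFF000000U};
    assert(update_componentwise(a, b, e, f, r2, mask) == update_vectorized(a, b, e, f, r2, mask));
}
```

Calling `self_check()` from a small test program exercises both forms on the same inputs. This only shows that the two expressions agree under the per-component assumption; whether the vector form is actually faster on the GPU still has to be measured in the Brook kernel itself.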