1 Reply Latest reply on Dec 1, 2009 2:15 AM by genaganna

    Question about Mersenne Twister sample

    drstrip

      The brook code (mersenne_twister.br) contains the following set of lines, repeated for each output stream A1 ... A8 -

          A1.x = a.x ^ e.x ^ ((b.x >> thirteen) & mask11) ^ f.x ^ (r2.x << fifteen);
          A1.y = a.y ^ e.y ^ ((b.y >> thirteen) & mask12) ^ f.y ^ (r2.y << fifteen);
          A1.z = a.z ^ e.z ^ ((b.z >> thirteen) & mask13) ^ f.z ^ (r2.z << fifteen);
          A1.w = a.w ^ e.w ^ ((b.w >> thirteen) & mask14) ^ f.w ^ (r2.w << fifteen);

      where mask11, mask12, mask13 and mask14 are unsigned ints. Why can't this be written

      A1 = a ^ e ^ ((b >> thirteen) & mask) ^ f ^ (r2 << fifteen);

      where thirteen is now uint4(13U, 13U, 13U, 13U),

      mask = uint4(mask11, mask12, mask13, mask14) and

      fifteen is now uint4(15U, 15U, 15U, 15U).

      Am I missing an efficiency issue, or even worse, are they not equivalent?

        • Question about Mersenne Twister sample
          genaganna

           


          Thank you for pointing this out. This sample was written before integer vector types were fully supported. Your vectorized version is the more efficient way to write it. Let us know how much improvement you see after this change.
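
For anyone who wants to convince themselves that the two forms give the same result, here is a minimal host-side sketch in plain C. It is not Brook+ code: uint4_emul, scalar_form, vector_form and forms_agree are hypothetical names made up for this illustration, the input and mask values are arbitrary test numbers rather than the constants from mersenne_twister.br, and it assumes the vector operators on uint4 act on each component independently.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for a 4-component unsigned vector.
   This is NOT the Brook+ uint4 type, only an emulation for checking. */
typedef struct { uint32_t c[4]; } uint4_emul;

/* Per-component form, mirroring the sample: one statement per lane,
   each lane using its own scalar mask. */
static uint4_emul scalar_form(uint4_emul a, uint4_emul b, uint4_emul e,
                              uint4_emul f, uint4_emul r2,
                              const uint32_t mask[4])
{
    uint4_emul out;
    out.c[0] = a.c[0] ^ e.c[0] ^ ((b.c[0] >> 13U) & mask[0]) ^ f.c[0] ^ (r2.c[0] << 15U);
    out.c[1] = a.c[1] ^ e.c[1] ^ ((b.c[1] >> 13U) & mask[1]) ^ f.c[1] ^ (r2.c[1] << 15U);
    out.c[2] = a.c[2] ^ e.c[2] ^ ((b.c[2] >> 13U) & mask[2]) ^ f.c[2] ^ (r2.c[2] << 15U);
    out.c[3] = a.c[3] ^ e.c[3] ^ ((b.c[3] >> 13U) & mask[3]) ^ f.c[3] ^ (r2.c[3] << 15U);
    return out;
}

/* Vector form: a single expression applied to every lane, which is what
   a componentwise vector operator is assumed to do. */
static uint4_emul vector_form(uint4_emul a, uint4_emul b, uint4_emul e,
                              uint4_emul f, uint4_emul r2,
                              const uint32_t mask[4])
{
    uint4_emul out;
    for (int i = 0; i < 4; ++i) {
        out.c[i] = a.c[i] ^ e.c[i] ^ ((b.c[i] >> 13U) & mask[i]) ^ f.c[i] ^ (r2.c[i] << 15U);
    }
    return out;
}

/* Returns 1 if both forms agree on every lane for one set of arbitrary
   test inputs, 0 otherwise. */
int forms_agree(void)
{
    const uint32_t mask[4] = { 0x0000FFFFu, 0xFFFF0000u, 0x0F0F0F0Fu, 0xFFFFFFFFu };
    uint4_emul a  = {{ 0x12345678u, 0x9ABCDEF0u, 0x0BADF00Du, 0xDEADBEEFu }};
    uint4_emul b  = {{ 0xFFFFFFFFu, 0x00000001u, 0x80000000u, 0x13579BDFu }};
    uint4_emul e  = {{ 0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u }};
    uint4_emul f  = {{ 0xA5A5A5A5u, 0x5A5A5A5Au, 0x00FF00FFu, 0xFF00FF00u }};
    uint4_emul r2 = {{ 0x00000001u, 0xFFFFFFFFu, 0x0001FFFFu, 0xCAFEBABEu }};

    uint4_emul s = scalar_form(a, b, e, f, r2, mask);
    uint4_emul v = vector_form(a, b, e, f, r2, mask);

    for (int i = 0; i < 4; ++i) {
        if (s.c[i] != v.c[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling forms_agree() from a small test program checks one set of inputs only. It does not measure performance, so any speedup still has to be timed on the actual kernel.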