1 Reply Latest reply on Dec 15, 2011 5:35 PM by notzed

    Is there a way to "insert" a value into a vector with an index?

    doug65536

      I'm looking for a way to populate a vector with values so I can do one big vector store, instead of several little stores. I'm trying to avoid doing byte stores, which are not always supported (right?).

      for (int32_t v = 0; v < 8; ++v)
      {
        uchar8 r = 0;

        for (int32_t u = 0; u < 8; ++u)
        {
          float8 t = (float8)(0.0f);

          for (int32_t y = 0; y < 8; ++y)
          {
            t += foo(v, u, y);
          }
          t.s0123 += t.s4567;
          t.s01 += t.s23;
          t.s0 += t.s1;

          uchar i = convert_uchar(clamp(rint(t.s0), 0.0f, 255.0f));
          
          // ***this is the part I'm asking about***
          // Here I want to "insert" i (the rounded, clamped t.s0) into r at vector member u
        }
        // Store whole vector
        p[v] = r;
      }

      Is there a way to do it that is better than using a long and doing this:


      #if __ENDIAN_LITTLE__
      r |= convert_long(clamp(rint(t.s0), 0.0f, 255.0f)) << (u * 8);
      #else
      r |= convert_long(clamp(rint(t.s0), 0.0f, 255.0f)) << (56-(u * 8));
      #endif

      Thanks!

       

        • Is there a way to "insert" a value into a vector with an index?
          notzed

          The AMD media ops let you do some stuff like this, but of course they are not portable. See the programming guide, Appendix A, section A.8.4, e.g. amd_pack() or amd_bytealign() perhaps.

          Otherwise ... well, I'd just stick to using longs or ints; doing things in sets of 4 seems fairly optimal ALU-wise on current hardware.
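For what it's worth, here is a rough sketch of the "stick to longs" approach, written as plain C rather than OpenCL C so it can be compiled and checked on the host. The names insert_byte and pack8 are made up for illustration and are not part of any OpenCL or AMD API. It assumes lane 0 should end up in the least significant byte, which is the layout the little-endian branch in the question produces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: put `value` into byte lane `lane` (0..7) of a
   64-bit accumulator. Lane 0 is the least significant byte, so on a
   little-endian device one 64-bit store places lane 0 at the lowest
   address. */
uint64_t insert_byte(uint64_t packed, unsigned lane, uint8_t value)
{
    assert(lane < 8);
    unsigned shift = lane * 8u;
    packed &= ~((uint64_t)0xFF << shift);      /* clear the lane first */
    return packed | ((uint64_t)value << shift); /* then OR in the byte  */
}

/* Hypothetical helper: build one 64-bit word from eight bytes, so the
   caller can do a single wide store instead of eight byte stores. */
uint64_t pack8(const uint8_t bytes[8])
{
    uint64_t packed = 0;
    for (unsigned lane = 0; lane < 8; ++lane)
        packed = insert_byte(packed, lane, bytes[lane]);
    return packed;
}
```

Clearing the lane before the OR is only needed if a lane can be written more than once; when the accumulator starts at zero and each lane is written exactly once, as in the loop from the question, the plain shift-and-OR is enough.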

          Unless memory is an issue, I tend to just use floats for storage if multiple passes are involved and only convert to byte at the end for display/output, or use images and let the compiler/hardware do the packing to suit the data.
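A minimal sketch of the "keep floats until the end" idea, again as host-side C so it can be tested; float_to_byte and finalize_row are hypothetical names, not library functions. Note one deliberate difference from the kernel in the question: this rounds halves upward with + 0.5f, whereas rint() follows the current rounding mode (normally round-half-to-even), so results can differ by one for values exactly halfway between two integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp to [0, 255], then round to the nearest
   integer (halves round up). NaN and negative inputs map to 0. */
uint8_t float_to_byte(float x)
{
    if (!(x >= 0.0f))   /* true for negatives and for NaN */
        return 0;
    if (x > 255.0f)
        return 255;
    return (uint8_t)(x + 0.5f);
}

/* Hypothetical final pass: intermediate results stay in `src` as floats
   through every earlier pass, and are narrowed to bytes only once. */
void finalize_row(const float *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = float_to_byte(src[i]);
}
```

The trade-off is four times the memory traffic for intermediate buffers in exchange for no packing work and no repeated rounding between passes.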