doug65536
Journeyman III

Is there a way to "insert" a value into a vector with an index?

I'm looking for a way to populate a vector with values so I can do one big vector store, instead of several little stores. I'm trying to avoid doing byte stores, which are not always supported (right?).

for (int32_t v = 0; v < 8; ++v)
{
  uchar8 r = 0;

  for (int32_t u = 0; u < 8; ++u)
  {
    float8 t = (float8)(0.0f);

    for (int32_t y = 0; y < 8; ++y)
    {
      t += foo(v, u, y);
    }
    t.s0123 += t.s4567;
    t.s01 += t.s23;
    t.s0 += t.s1;

    uchar i = convert_uchar(clamp(rint(t.s0), 0.0f, 255.0f));
    
    // ***this is the part I'm asking about***
    // Here I want to "insert" i (the converted t.s0) into r at vector member u
  }
  // Store whole vector
  p = r;
}

Is there a better way to do it than declaring r as a long instead of a uchar8 and doing this:


#if __ENDIAN_LITTLE__
r |= convert_long(clamp(rint(t.s0), 0.0f, 255.0f)) << (u * 8);
#else
r |= convert_long(clamp(rint(t.s0), 0.0f, 255.0f)) << (56-(u * 8));
#endif

Thanks!

 

notzed
Challenger

The AMD media ops let you do some stuff like this, but of course they are not portable. See the programming guide, Appendix A, section A.8.4, e.g. amd_pack(), or perhaps amd_bytealign().

Otherwise ... well, I'd just stick to using longs or ints. Doing things in sets of 4 seems fairly optimal ALU-wise on current hardware.
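
To illustrate the "sets of 4" idea, here is a minimal sketch in plain C rather than OpenCL C, so it can be checked on a host compiler. It packs eight bytes into two 32-bit words and writes each word with one wide store. The names insert_byte32, host_is_little_endian and pack_row are hypothetical helpers made up for this example. In a kernel, uint and uchar would stand in for the stdint types, and the __ENDIAN_LITTLE__ macro from the question would replace the run-time probe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: OR `value` into the slot that ends up at byte
 * offset `index` (0..3) once the 32-bit word is stored whole.
 * Assumes that slot in `word` is still zero. */
static uint32_t insert_byte32(uint32_t word, uint8_t value,
                              unsigned index, int little_endian)
{
    assert(index < 4u);
    /* Little-endian: byte 0 is the least significant byte.
     * Big-endian: byte 0 is the most significant byte. */
    unsigned shift = little_endian ? index * 8u : 24u - index * 8u;
    return word | ((uint32_t)value << shift);
}

/* Run-time byte-order probe for the host. In OpenCL C the
 * __ENDIAN_LITTLE__ macro answers this at compile time. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Pack eight bytes as two sets of four, then do two wide stores
 * instead of eight byte stores. */
static void pack_row(const uint8_t src[8], uint8_t dst[8])
{
    const int le = host_is_little_endian();
    uint32_t lo = 0, hi = 0;

    for (unsigned u = 0; u < 4u; ++u) {
        lo = insert_byte32(lo, src[u], u, le);
        hi = insert_byte32(hi, src[u + 4u], u, le);
    }
    memcpy(dst, &lo, sizeof lo);      /* bytes 0..3 */
    memcpy(dst + 4, &hi, sizeof hi);  /* bytes 4..7 */
}
```

After pack_row(src, dst), dst should hold the same eight bytes as src in the same order on either byte order, which is the property the two #if branches in the question are there to preserve.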

Unless memory is an issue, I tend to just use floats for storage if multiple passes are involved and only convert to byte at the end for display/output, or use images and let the compiler/hardware do the packing to suit the data.
