14 Replies Latest reply on Jun 8, 2010 12:02 AM by bubu

    vector subscript

    bubu

      Is there any way to do this, pls?

       

      float4 v = (float4)(0.0f, 1.0f, 2.0f, 3.0f);

      const int idx = 2;

      v[idx] += 1.0f;

       

      ???

        • vector subscript
          MicahVillmow
          You need to cast the scalar to a vector first.
          So your code should look like this.
          v[idx] += (float4)(1.0f);

          • vector subscript
            malcolm3141

            No there isn't a way to do that.

            However:

             

            float v[4];

            will work, and

            float4 v;

            if(idx == 0)    v.s0 += 1.0f;

            if(idx == 1)    v.s1 += 1.0f;

            if(idx == 2)    v.s2 += 1.0f;

            if(idx == 3)    v.s3 += 1.0f;

            will also work (and will likely be quite fast due to predication).

             

            Malcolm

            • vector subscript
              Illusio

              Any of these methods acceptable?

               

              ---------------- Table lookup: ----------------

              float4 addOneTable[] = {
                  (float4)(1.0f, 0.0f, 0.0f, 0.0f),   // idx == 0 adds to .s0
                  (float4)(0.0f, 1.0f, 0.0f, 0.0f),   // idx == 1 adds to .s1
                  (float4)(0.0f, 0.0f, 1.0f, 0.0f),   // idx == 2 adds to .s2
                  (float4)(0.0f, 0.0f, 0.0f, 1.0f),   // idx == 3 adds to .s3
              };
              v += addOneTable[idx];

              ------------------------- Vectorized test on idx -------------------------

              int4 indices = (int4)(0, 1, 2, 3);
              // The vector comparison sets all bits (-1) in the lane that matches idx, so subtract instead of adding
              v -= convert_float4(indices == (int4)(idx));
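If you want to sanity-check both tricks outside a kernel, here is a rough host-side C sketch. It is not OpenCL: `vec4`, `add_one_table` and `add_one_mask` are made-up names, and the per-lane loops only stand in for what a real `float4` operation would do in one go.

```c
#include <assert.h>

/* Hypothetical host-side stand-in for OpenCL's float4 (assumption, not a real API). */
typedef struct { float s[4]; } vec4;

/* Table lookup: add the unit vector selected by idx (0..3). */
static vec4 add_one_table(vec4 v, int idx) {
    static const float table[4][4] = {
        {1.0f, 0.0f, 0.0f, 0.0f},
        {0.0f, 1.0f, 0.0f, 0.0f},
        {0.0f, 0.0f, 1.0f, 0.0f},
        {0.0f, 0.0f, 0.0f, 1.0f},
    };
    assert(idx >= 0 && idx < 4);
    for (int i = 0; i < 4; ++i)
        v.s[i] += table[idx][i];
    return v;
}

/* Vectorized test: compare idx against every lane index and add the result.
   A C comparison yields 1 for true; an OpenCL vector comparison yields -1,
   which is why the kernel version subtracts instead. */
static vec4 add_one_mask(vec4 v, int idx) {
    assert(idx >= 0 && idx < 4);
    for (int i = 0; i < 4; ++i)
        v.s[i] += (float)(i == idx);
    return v;
}
```

With `v = {0, 1, 2, 3}` and `idx = 2`, both functions should leave every lane alone except the third, which goes from 2 to 3.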

                • vector subscript
                  bubu

                  Yep, I could do it using a trick... but the point is that vector subscripts should be supported directly. After all, you support swizzling and things like .s0123456789abcdef, so why not v[idx] (where idx is not a constant)?

                    • vector subscript
                      eduardoschardong

                      Exactly because it's not a constant...

                      float4 is a structure, not an array, but OpenCL allows pointer casting like you did in the third reply, and also:

                      float va[4];

                      *(float4*)va = (float4)(1, 2, 3, 4);

                       

                        • vector subscript
                          bubu

                           

                          Originally posted by: eduardoschardong Exactly because it's not a constant...

                          Yep, but the point is: if that's possible to do, why must I write a function to cast pointers manually? Why don't they simply add a feature to the OpenCL compiler to support vector subscripts? Please add that to the CL 1.1 spec!

                    • vector subscript
                      MicahVillmow
                      bubu,
                      Vector subscripts are not supported because at the hardware level, you cannot index into a vector. Vectors are not arrays; they are native data types of vector machines/units. Just as you cannot index into an integer because it is a fundamental data type, you cannot index into a vector for the same reason. You can write a function to do indexing into a vector, but it is an O(N^2) function on the length of the vector. Something like this:
                      // Vector arguments are passed by value, so return the updated copy.
                      float4 vec_index4(float4 A, int idx, float B) {
                          if (idx == 0) {
                              A.x = B;
                          } else if (idx == 1) {
                              A.y = B;
                          } else if (idx == 2) {
                              A.z = B;
                          } else {
                              A.w = B;   // last element
                          }
                          return A;
                      }

                      Not very efficient, and definitely not the kind of code you want to generate.
                        • vector subscript
                          bubu

                           

                          Originally posted by: MicahVillmow bubu, Vector subscripts are not supported because at the hardware level, you cannot index into a vector


                          And how is this operation done in hardware, then?

                          float4 val = (float4)(1.0f, 2.0f, 3.0f, 4.0f);

                          float three = ((float*)&val)[j]; /* j is not constant; for example, it is read from a texture */


                          Using private memory?

                        • vector subscript
                          MicahVillmow
                          In this case, the vector is pushed onto the stack, and then the scalar value is read back from memory.
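To make the "push to the stack, read back" idea concrete, here is a small host-side C sketch. It is only an illustration under assumptions: `float4_host` and `read_lane_via_memory` are invented names, and a real GPU compiler decides for itself where the spilled copy lives.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical host-side stand-in for OpenCL's float4 (assumption, not a real API). */
typedef struct { float x, y, z, w; } float4_host;

/* Dynamic read: store the whole vector to a scratch array in memory,
   then load a single scalar from it at a run-time offset. */
static float read_lane_via_memory(float4_host v, int idx) {
    float spill[4];
    assert(idx >= 0 && idx < 4);
    assert(sizeof v == sizeof spill);   /* four packed floats, no padding expected */
    memcpy(spill, &v, sizeof spill);    /* the "push onto the stack" step */
    return spill[idx];                  /* the scalar read back from memory */
}
```

The cost is in the round trip: four stores plus one dependent load, instead of an operation that stays in registers.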
                            • vector subscript
                              malcolm3141

                              To the best of my knowledge, subscripts on vector data types are not compliant OpenCL. The AMD implementation allowed them, although I believe that is changing (I haven't tried lately).

                              Remember there is a big difference between a constant index and a dynamic index. With regard to constant memory, and often in other cases on AMD hardware, a dynamic index usually means the compiler has to resort to really slow tricks such as the one Micah mentioned.

                              Ironically, the vec_index4() function Micah illustrated is usually the fastest way to do dynamic indexing of a vector. In your specific case the second trick given by Illusio may be faster, but it is not general.

                              My advice is don't be afraid of simple branches, so long as you check the shader analyser and confirm the compiler is using predication for them (they'll turn into CMOV... instructions). I have not had issues convincing the compiler to use predication in these cases.

                               

                              Malcolm
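As a rough picture of what the predicated form amounts to, here is a host-side C sketch that writes the chain as one select per lane. The names `float4_host`, `set_lane_select` and `get_lane_select` are made up for illustration, and whether a given compiler actually emits conditional moves for this is something you would still need to confirm in its output.

```c
#include <assert.h>

/* Hypothetical host-side stand-in for OpenCL's float4 (assumption, not a real API). */
typedef struct { float x, y, z, w; } float4_host;

/* Dynamic write: every lane either takes b or keeps its old value. */
static float4_host set_lane_select(float4_host v, int idx, float b) {
    assert(idx >= 0 && idx < 4);
    v.x = (idx == 0) ? b : v.x;
    v.y = (idx == 1) ? b : v.y;
    v.z = (idx == 2) ? b : v.z;
    v.w = (idx == 3) ? b : v.w;
    return v;
}

/* Dynamic read: start from lane 0 and overwrite the result if a later lane matches. */
static float get_lane_select(float4_host v, int idx) {
    float r = v.x;
    assert(idx >= 0 && idx < 4);
    r = (idx == 1) ? v.y : r;
    r = (idx == 2) ? v.z : r;
    r = (idx == 3) ? v.w : r;
    return r;
}
```

Every lane is touched regardless of idx, so the work is fixed and there is no divergent control flow to worry about.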

                                • vector subscript
                                  bubu

                                  Originally posted by: malcolm3141 Ironically, the vec_index4() function Micah illustrated is usually the fastest way to do dynamic indexing of a vector. In your specific case the second trick given by Illusio may be faster, but it is not general.

                                  And where does the *(float*)&v approach rank? It's performing very well for me at the moment...