10 Replies Latest reply on Oct 2, 2009 6:22 AM by riza.guntur

    Swap

    VST_RT
      exchange values by index

      Hi,

      What I want to do is a swap. I am using Brook+.

      I call the kernel with a data stream. In the kernel I want to change some specific values of this stream according to their index.

      To get access to the index I can't use a stream, so I use a gather array:

      kernel void
      ABC(int dir, int m0, float x[],...

      But a gather array is read-only, so I can't change the values.

      Is there a built-in function that I missed?

      Local arrays are also forbidden. A local stream might help, but I don't know how to fill it within the kernel function.

      Thank you for helping.

       

        • Swap
          ryta1203

          You can't change the values of the incoming gather stream, but you can assign the values of an outgoing stream and then use that.

          This is a streaming environment, so yes, it's like a one-way street. I'm also confused about how a local array would help you here.

          Maybe you could post some simple pseudocode of what you are trying to accomplish.
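A minimal sketch of the idea in this reply, written as plain Python on the CPU rather than Brook+. The function name `swap_pass` and the use of lists as "streams" are assumptions made for illustration, not Brook+ API. Each output element decides which input index to gather from, and the input is never modified.

```python
def swap_pass(src, i, j):
    """Return a new 'output stream' equal to src with elements i and j exchanged.

    src plays the role of the read-only gather array; the returned list
    plays the role of the write-only output stream.
    """
    dst = [None] * len(src)
    for k in range(len(src)):      # one iteration stands in for one kernel instance
        if k == i:
            dst[k] = src[j]        # instance i gathers from index j
        elif k == j:
            dst[k] = src[i]        # instance j gathers from index i
        else:
            dst[k] = src[k]        # every other instance copies straight through
    return dst


data = [10.0, 11.0, 12.0, 13.0]
swapped = swap_pass(data, 1, 3)
```

No instance ever reads what another instance wrote, so no temporary variable is needed for the exchange.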

            • Swap
              VST_RT

              Hi and thanks again.

              The master plan is a VST_PLUGIN. Now I thought I could implement an FFT on the GPU.

              I would do something like this:

              fft(in,out)

              ...

              temp = in[i]

              in[i] = in[j]

              in[j] = temp

              ...

               

              Local arrays or local pointers would make it easier to get some C++ code "converted" to Brook code: in this case, copying the input to a local array and doing the swap on the local array, or something like that.

              Bye

                • Swap
                  VST_RT

                  Hi,

                  Now I am doing it this way:

                  kernel void FFT(float p[][], out float data[][], int width)
                  {
                      int2 index = instance().xy;
                      ix = index.x;
                      iy = index.y;   
                      data[iy][ix] = p[iy][ix];

                   . . .

                                  temp = data[width-i-1][ix];
                                  data[width-i-1][ix] = data[width-j-1][ix]; //IS THIS ALLOWED?
                                  data[width-j-1][ix] = temp;

                   . . .

                   

                  }

                   

                  It compiles, but it's not working: after an IFFT the sound output is not OK.

                  I think I didn't really get the gather/scatter thing...

                    • Swap
                      gaurav.garg

                      Reading from a scatter stream is not allowed.

                      • Swap
                        ryta1203

                        Let me know if I've missed something, since I really don't know what's going on inside the "...." code lines.

                        kernel void FFT(float p[][], out float data[][], int width)
                        {
                            int2 index = instance().xy;
                            ix = index.x;
                            iy = index.y;
                            data[iy][ix] = p[iy][ix];

                         . . .

                            temp = p[width-i-1][ix];
                            data[width-i-1][ix] = p[width-j-1][ix]; //IS THIS ALLOWED?
                            data[width-j-1][ix] = temp;

                         . . .

                        }

                        Why not that?

                        • Swap
                          youplaboom

                           

                          Originally posted by: VST_RT

                          I think I didn't really get the gather/scatter thing...

                          gather: read-only

                          scatter: write-only

                            • Swap
                              VST_RT

                              Thanks for the replies!

                              OK, so it's read or not read... if that's all, I've got it.

                              @ryta: argh, I see I posted the wrong piece of code. It was:

                              data[][] = data[][];

                              and that's not nice, but it compiled without error.

                              I really think that an FFT isn't a good piece of work to do on a GPU, because the data is "indexed", so there is no data independence, and isn't that a problem for the whole SIMD idea!?

                                • Swap
                                  ryta1203

                                  FFTs work fine on GPUs... personally I just think you are having a problem grasping the "streaming" concept still.
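To make that claim concrete, here is a hedged CPU-side sketch in Python (not Brook+) of a radix-2 FFT written in streaming form. It assumes the input length is a power of two, and all function names are hypothetical. It is not the code elided in the kernels above. The point is only that every pass computes each output element from a read-only input buffer, so the "indexed" access is a gather and the elements of one pass are independent of each other.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def reorder_pass(src):
    """Bit-reversal reordering as a gather: output k reads input bit_reverse(k)."""
    n = len(src)
    bits = n.bit_length() - 1
    return [src[bit_reverse(k, bits)] for k in range(n)]


def butterfly_pass(src, m):
    """One butterfly stage with half-size m; reads only src, writes only dst."""
    n = len(src)
    dst = [0j] * n
    for k in range(n):                      # stands in for one kernel instance
        pos = k % (2 * m)                   # position inside the current block
        base = k - pos                      # first index of the current block
        j = pos % m                         # twiddle index
        a = src[base + j]
        b = src[base + j + m] * cmath.exp(-2j * cmath.pi * j / (2 * m))
        dst[k] = a + b if pos < m else a - b
    return dst


def fft_streaming(x):
    """Radix-2 FFT as a sequence of gather-only passes (len(x) a power of two)."""
    buf = reorder_pass(list(x))
    m = 1
    while m < len(x):
        buf = butterfly_pass(buf, m)        # output of one pass feeds the next
        m *= 2
    return buf


spectrum = fft_streaming([1, 0, 0, 0])
```

For the unit impulse in the usage line, every bin of `spectrum` equals 1, as expected for a DFT. On a GPU each pass would be one kernel call whose output stream becomes the gather input of the next call.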

                                    • Swap
                                      VST_RT

                                      Hm, OK, so maybe you can enlighten me? Is there some good paper to read?

                                      I really appreciate your help, thank you very much.

                                       

                                        • Swap
                                          riza.guntur

                                          Just remember: a variable without out is only for reading, and an out variable is write-only; it can't be read at all. I mean, don't do it!

                                          A stream environment is like flowing water: once the water has gone down, you need a pump to bring it up again.

                                          If the pump is a variable, then we need extra memory to copy the data back to where it began.

                                          A is swapped to become A'.

                                          In CPU terms, we can write the result back to the exact same place with the help of one register (remember that a register may become a bottleneck).

                                          In a streaming environment, A is swapped into another piece of memory, which we allocate as B.

                                          So for any algorithm you should consider the "extra space" needed to write the output.

                                          For a single-index swap, the kernel call would be EXTREMELY costly, so you should consider whether copying the data back to the CPU is worth it; if not, then do the swap in a kernel.
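The "extra space" idea above can be sketched as a pair of buffers whose roles alternate between passes. This is a plain Python illustration, not Brook+; `run_passes` and the two example pass functions are hypothetical names. Buffer A is only read, buffer B is only written, and swapping the two references after each pass plays the part of the pump.

```python
def run_passes(data, passes):
    """Apply each pass in turn, reading from one buffer and writing the other.

    Each pass is a function f(src, k) that computes output element k from
    the read-only source buffer.
    """
    a = list(data)            # buffer A: holds the current input
    b = [None] * len(a)       # buffer B: the "extra space" for the output
    for f in passes:
        for k in range(len(a)):    # one iteration stands in for one kernel instance
            b[k] = f(a, k)         # read only from A, write only to B
        a, b = b, a                # swap roles: B's contents feed the next pass
    return a


def swap_0_and_2(src, k):
    """Gather-style swap of elements 0 and 2."""
    return src[{0: 2, 2: 0}.get(k, k)]


def double(src, k):
    """Element-wise pass: double every value."""
    return 2 * src[k]


result = run_passes([1, 2, 3, 4], [swap_0_and_2, double])
```

Only two buffers are allocated no matter how many passes run, which is why the extra memory cost stays fixed while the per-pass launch cost is what makes a lone single-index swap expensive.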