23 Replies Latest reply on Jul 19, 2008 3:07 PM by ryta1203

    Problems with gather

    ryta1203
      Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:


      kernel void foo(float4 before_in[][], float4 out before_out[][])
      {
          float4 temp[100];
          int x=0;
          if (indexof(before_out) < 3)
          {
              // transfer to temp
              for (x=0;x<100;x++)
              {
                  temp[x] = before_in[indexof(before_out)][x];
              }
              // do some work on temp (reads/writes) here

              // transfer to before_out
              for (x=0;x<100;x++)
              {
                  before_out[indexof(before_out)][x] = temp[x];
              }
          }
      }
        • Problems with gather
          Remotion

          Hi,

          I think that for now gather is allowed only on 1D streams.

          kernel void foo(float4 before_in[][], float4 out before_out[])

          Access to a 2D stream must look like this:

          float2 id = float2(x,y);

          stream2d[id] = 0.0;

          It is also not like C++ [][] indexing.


            • Problems with gather
              ryta1203
              Remotion,

              Thanks. Unfortunately, this did not solve my gather problem with the float4 temp. For instance, I am still getting compiler errors for

              ....... = temp[x];

              due to the temp[x];

              Error is "Semantic Check found 2 errors", both for the .....=temp[x] lines.
            • Problems with gather
              eduardoschardong
              Originally posted by: ryta1203

              Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:


              The problem is the scatter, which seems not to be supported yet. If you change the out from [][] to <> you won't get this error (but you need to check other parts of the code for other errors).

              kernel void foo(float4 before_in[][], float4 out before_out<>)

                • Problems with gather
                  ryta1203
                  Originally posted by: eduardoschardong

                  Originally posted by: ryta1203

                  Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:

                  The problem is the scatter, which seems not to be supported yet. If you change the out from [][] to <> you won't get this error (but you need to check other parts of the code for other errors).

                  kernel void foo(float4 before_in[][], float4 out before_out<>)

                  This is not the problem. Also, if I change [][] to <>, the code has all kinds of problems, since you can't do random access on a stream, so it really wouldn't be what I want anyway.

                  If I change "temp[100]" to just "temp" and "temp[x]" to just "temp", the problem goes away, but obviously this is not what I want logically speaking.

                  Why would the local array be causing these problems? Are local arrays not supported? Seems odd if they are not.

                  My kernel now looks like:

                  kernel void foo(float4 before_in[][], float4 out before_out[][])
                  {
                      float4 temp[100];
                      float2 pos;
                      int x=0;
                      if (indexof(before_out) < 3)
                      {
                          pos.x = indexof(before_out);
                          // transfer to temp
                          for (x=0;x<100;x++)
                          {
                              pos.y = x;
                              t_pos = x;
                              temp[x] = before_in[pos];
                          }
                          // do some work on temp (reads/writes) here

                          // transfer to before_out
                          for (x=0;x<100;x++)
                          {
                              pos.y=x;
                              t_pos=x;
                              before_out[pos] = temp[x];
                          }
                      }
                  }


                  Also, I thought scatter was supported, just the current massive limitation being that you have to scatter out 128 bits at a time (float4 or double2) which is what I am doing here.
                    • Problems with gather
                      Remotion

                      It looks like the problem is your temporary array; Brook+ probably just does not support such things yet.

                      float4 temp[100];

                       

                        • Problems with gather
                          ryta1203
                          Originally posted by: Remotion

                          It looks like the problem is your temporary array; Brook+ probably just does not support such things yet.

                          float4 temp[100];

                          Yes, it seems that this might be the case; however, it would be great to have someone from AMD verify that.

                          Although, unless this is a hardware limitation, it doesn't make any sense. You could create multiple single variables, so why not be able to create an array of them?
                    • Problems with gather
                      MicahVillmow
                      Ryta, this is correct: there are no temporary arrays in Brook+ yet. However, they are available via CAL/IL and CAL/AMDHLSL. There is a CAL sample called scratch_buffer_IL that shows how to use a temp array.
                      • Problems with gather
                        MicahVillmow
                        I'm not 100% sure if it is in the current SDK, but if it is, there should be a sample located in the samples\languages\hlsl10 directory.

                        This should show how it is used within CAL.
                          • Problems with gather
                            ryta1203
                            Micah,

                            Are temp arrays going to be available in the near future? This would be a very nice addition, considering CAL is much more time consuming than Brook+ and the setup overhead is quite enormous compared to some other GPGPU alternatives, such as CUDA. For example, it would take me no time at all to code and run that kernel in CUDA. I'm not trying to plug CUDA, but to be competitive with Nvidia don't you think this SDK should, at the very least, have very simple functionality like local C arrays?
                              • Problems with gather
                                jski

                                I tried something as simple as:

                                kernel void foo( float4 out b4_in[][], float4 out b4_out[][] )
                                {
                                   float2 pos = float2( 1.0f, 1.0f );
                                   float4 tmp;

                                   tmp = b4_in[pos];
                                   b4_out[pos] = tmp;
                                }

                                This resulted in a compilation error because of the indexed assignment to b4_out: b4_out[pos] = tmp.

                                Are assignments to scatter streams allowed in this fashion?  I.e., indexed assignments?

                                ---jski

                                  • Problems with gather
                                    ryta1203
                                      • Problems with gather
                                        jski

                                        That's what happens in the wee hours of the night when you're experimenting, but it didn't clear up the compilation bug!  I still get the same compile-time error.  And if I comment out the assignment b4_out[pos] = tmp, it compiles just fine.

                                        ---jski

                                          • Problems with gather
                                            ryta1203
                                            Micah, it doesn't do what I want to do because I need a bidirectional array in the kernel, which is not yet supported, however you are correct on the 1D scatter stream being supported, I should have caught that in the release notes earlier.

                                            Apparently according to Micah maybe 2D scatter assignments are not working either. I think I actually might have seen this in the release notes.

                                            Yes, this is included in the Mar-08 release notes:

                                            Scatter
                                            -------

                                            Scatter to 1-dimensional targets is supported. The syntax is similar to gather
                                            operations, in that the stream is bound using square brackets instead of angle
                                            brackets and elements are accessed in an array-like fashion.

                                            So it seems scatter is only supported on 1D targets, while gather is supported on multiple dimensions. How does this work with address translation and stream size limitations? Are there plans to support 2D+ streams? What about local arrays?
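
On the address-translation question, here is a minimal sketch, assuming a row-major layout, of how a 2D position could be folded into a single index for a 1D scatter target. It is plain C run on the CPU, not Brook+ kernel code, and `flatten_2d` is a hypothetical helper rather than anything from the SDK. Whether the runtime does its own translation this way is not confirmed anywhere in this thread.

```c
#include <assert.h>

/* Hypothetical helper (not an SDK call): fold a 2D position (x, y) into a
   row-major linear index, as one might do before writing to a 1D scatter
   target. width is the number of elements per row. */
static int flatten_2d(int x, int y, int width)
{
    assert(width > 0);
    assert(x >= 0 && x < width && y >= 0);
    return y * width + x;
}
```

For example, with a row width of 4, position (3, 1) lands at linear index 7.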

                                  • Problems with gather
                                    MicahVillmow
                                    ryta, not sure why yours is not compiling, but assignment between float4s works fine. If you check the scatter example in the Brook+ SDK, the following example does exactly what you want to do plus a little bit more. The thing that I think is different is that in the scatter example, the scatter stream is a 1D stream, whereas you have it specified as a 2D stream.

                                    kernel void scatter(float4 a[][], float4 b<>, float width, out float4 c[])
                                    {
                                        // Get the position in the stream of the current thread
                                        float idx = (indexof(c)).x;
                                        float2 apos = {idx % width, floor(idx / width) };

                                        // Write out to the scatter buffer
                                        c[idx] = a[apos] + b;
                                    }
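
The `apos` line in the kernel above turns a linear index into a 2D gather position. A rough CPU-side sketch of the same arithmetic in plain C follows, assuming non-negative integer indices and a row-major layout. `pos2` and `unflatten_index` are hypothetical names for illustration; the kernel itself works on floats and uses `floor`.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's float2 position, using ints. */
typedef struct { int x; int y; } pos2;

/* Recover the 2D position of a row-major linear index. */
static pos2 unflatten_index(int idx, int width)
{
    pos2 p;
    assert(width > 0 && idx >= 0);
    p.x = idx % width;   /* column: remainder within the row */
    p.y = idx / width;   /* row: integer division, same as floor here */
    return p;
}
```

For example, with a width of 4, linear index 7 maps to column 3 of row 1.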


                                      • Problems with gather
                                        jski

                                        Micah,

                                        I added your code (listed below) to an existing project, simple_matmult, just to see if it compiled.

                                        kernel void scatter(float4 a[][], float4 b<>, float width, out float4 c[])
                                        {
                                           // Get the position in the stream of the current thread
                                           float idx = (indexof(c)).x;
                                           float2 apos = {idx % width, floor(idx / width) };

                                           // Write out to the scatter buffer
                                           c[idx] = a[apos] + b;
                                        }

                                        And got:

                                        WARNING: ASSERT(GetResultSymbol().IsValid() + mDataTypeValue.IsValid() >= 1) failed
                                        While processing <buffer>:66
                                        In compiler at ResolveSymbols()[astdelayedlookup.cpp:139]
                                          *mName = c
                                        Message: unknown symbol

                                        ERROR: ASSERT(errorCount==0) failed
                                        While processing <buffer>:115
                                        In compiler at CompileShaderToStream()[astroot.cpp:157]
                                          errorCount = 1
                                        Message: Unknown Symbols exist
                                        Aborting...
                                        Problem with compiling built_d/simple_matmult_simple_matmult.hlsl
                                        mkdir -p built_d... 

                                        ---jski

                                          • Problems with gather
                                            marcr

                                            Ok, this is going to sound a little silly, but I tinkered with the scatter example, and it seems that if you have multiple scatter kernels in the same file, the scatter target parameter has to be of the same name in all kernels (i.e. all have to be called "c[]"). Looks like a brcc bug. -- marcr
                                              • Problems with gather
                                                eduardoschardong
                                                And I can't compile any scatter, even the scatter.br sample fails
                                                The error:
                                                Argument to indexof not a stream

                                                And when commenting all indexof:
                                                Output is not a stream: out float4 c[].
                                                  • Problems with gather
                                                    ryta1203
                                                    Originally posted by: eduardoschardong

                                                    And I can't compile any scatter, even the scatter.br sample fails
                                                    The error:
                                                    Argument to indexof not a stream

                                                    And when commenting all indexof:
                                                    Output is not a stream: out float4 c[].

                                                    Can you post your code? That seems odd, I don't really have a problem running their scatter sample. What hardware are you using? R600 does not support scatter as far as I am aware, you must have HD38xx+ (or of course 9170, 9250).
                                                      • Problems with gather
                                                        eduardoschardong
                                                        Originally posted by: ryta1203

                                                        Originally posted by: eduardoschardong
                                                        Can you post your code? That seems odd, I don't really have a problem running their scatter sample. What hardware are you using? R600 does not support scatter as far as I am aware, you must have HD38xx+ (or of course 9170, 9250).

                                                        The code is just the scatter.br...
                                                        I know I can't run it on the older GPU, but I expected it to compile and run in CPU mode, right?
                                                          • Problems with gather
                                                            ryta1203
                                                            I would think it should run either way (CPU or GPU). Can you post the code just in case? Is it the scatter sample unmodified? Maybe your environment is not set up properly. Are you having problems running any other samples?
                                                              • Problems with gather
                                                                eduardoschardong
                                                                Originally posted by: ryta1203

                                                                I would think it should run either way (CPU or GPU). Can you post the code just in case? Is it the scatter sample unmodified? Maybe your environment is not set up properly. Are you having problems running any other samples?


                                                                I checked the environment variables and found old values from the alphas in the user variables (the new ones were already in the system variables). I deleted them and now it works fine. Thank you for the help.
                                                    • Problems with gather
                                                      ryta1203
                                                      Are int kernel parameters not yet supported? Is this a planned implementation? I have problems with my kernels every time I try to use an int as a kernel parameter, even if I don't use the parameter at all in the kernel. Any ideas?

                                                      I don't see why this is a problem since all non-stream inputs are considered constant anyways.