17 Replies Latest reply on Apr 28, 2008 7:36 PM by marcr

    Scatter stream base type has to be 128 bit

    nberger
      When trying to scatter write to a 2D stream I get the error message "Scatter stream base type has to be 128 bit" from brook. Does this imply that I can only do scatter to float4 and double2 streams? Is this a bug or a feature, and is it documented somewhere? Is it going to stay with us for the 1.0 release?

      Thanks

      Nik
        • Scatter stream base type has to be 128 bit
          MicahVillmow
          Nik,
          It is correct that a scatter target currently needs to be a float4 or double2; however, this is not because of Brook. It is a limitation imposed at the IL/CAL level, and we are currently looking at ways around it. The restriction exists mainly for performance reasons: a float4 export runs at 22 GB/s, a float2 export at 5 GB/s, and a float export at 1 GB/s. You can see this in the global_exp_IL example in the CAL SDK. Although you need a 128-bit export space, it is currently possible to write out float2s or floats by using write masking. So, if you want to write float2s, you could create a float4 stream and scatter using g[index].xy = someValue; this would write only the first two components, therefore doing a float2 write.
            • Scatter stream base type has to be 128 bit
              nberger
              Hi again!
              I tried float4 and double2 streams for scatter and ended up with the same error message. For my application I found a way around the scatter, but I suppose you should look into this in some more detail.

              Cheers

              Nik
              • Scatter stream base type has to be 128 bit
                ryta1203
                Originally posted by: MicahVillmow

                Nik,

                It is correct that a scatter target currently needs to be a float4 or double2; however, this is not because of Brook. It is a limitation imposed at the IL/CAL level, and we are currently looking at ways around it. The restriction exists mainly for performance reasons: a float4 export runs at 22 GB/s, a float2 export at 5 GB/s, and a float export at 1 GB/s. You can see this in the global_exp_IL example in the CAL SDK. Although you need a 128-bit export space, it is currently possible to write out float2s or floats by using write masking. So, if you want to write float2s, you could create a float4 stream and scatter using g[index].xy = someValue; this would write only the first two components, therefore doing a float2 write.


                Is there an example of using:

                g[index].xy = someValue;

                somewhere? My br file does not produce a cpp file when I try this.

                For example:


                kernel void kern(float index<>, float d[], float a[], float b[], float size, out float4 c[])
                {
                    c[index].x = size;
                }
                  • Scatter stream base type has to be 128 bit
                    michael.chu
                    Hi ryta1203,

                    You have to scatter to entire 128-bit chunks. Hence, c[index].x isn't going to work.

                    However, is it possible for you to calculate 4 float values per iteration of your kernel instead of a single float value?

                    Michael.
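
                    A minimal CPU-side sketch of the suggestion above, in plain C rather than Brook+: group four scalar results into one 128-bit element so that every write covers a whole element. The float4 struct, the pack_float4 name, and the pad parameter are hypothetical stand-ins chosen for illustration; none of them come from the Brook+ SDK.

```c
#include <stddef.h>

/* Stand-in for a 128-bit stream element (four 32-bit floats).
   This is an illustrative type, not the Brook+ float4. */
typedef struct { float x, y, z, w; } float4;

/* Pack n scalar results into whole float4 elements.
   dst must have room for (n + 3) / 4 elements. Lanes of the last
   element that have no source value are filled with pad.
   Returns the number of float4 elements written. */
size_t pack_float4(const float *src, size_t n, float4 *dst, float pad)
{
    size_t count = (n + 3) / 4;  /* round up to whole elements */
    for (size_t i = 0; i < count; ++i) {
        size_t base = 4 * i;
        dst[i].x = (base + 0 < n) ? src[base + 0] : pad;
        dst[i].y = (base + 1 < n) ? src[base + 1] : pad;
        dst[i].z = (base + 2 < n) ? src[base + 2] : pad;
        dst[i].w = (base + 3 < n) ? src[base + 3] : pad;
    }
    return count;
}
```

                    With this layout, scalar result k lives in element k / 4, lane k % 4, so a kernel that computes four values per iteration can write each element in one piece.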
                      • Scatter stream base type has to be 128 bit
                        ryta1203
                        Originally posted by: michael.chu@amd.com

                        Hi ryta1203,

                        You have to scatter to entire 128-bit chunks. Hence, c[index].x isn't going to work.

                        However, is it possible for you to calculate 4 float values per iteration of your kernel instead of a single float value?

                        Michael.


                        Oh, I was just wondering because Micah says that's possible in his post:

                        So, if you want to write float2's, you could create a float4 stream and scatter using g[index].xy = someValue; As this would only write the first two values, therefore doing a float2 write

                        If you can't use float4 indexing (.xyzw) then all four values would have to be the same, correct?

                        I only have the problem with the float4 indexing (.xyzw) when using [] for the out, not when using <>.

                  • Scatter stream base type has to be 128 bit
                    MicahVillmow
                    Nik,
                    That's good. In most cases, if you can write the app with the stream model and without scatter, you will get better performance. For this problem, do you have a small example that reproduces it, so that we can get it fixed and add it to our testing?

                    Thanks
                      • Scatter stream base type has to be 128 bit
                        nberger
                        I just found (trying to come up with a simple example) that my problem might also be one of notation; whereas

                        kernel void test(float2 index, out float4 output[])
                        {
                        output[index] = 1.0f;
                        }

                        compiles ok

                        kernel void test(float2 index, out float4 output[][])
                        {
                        output[index] = 1.0f;
                        }

                        does not, and produces the 128 bit error message. Does this imply that I do not have to indicate the dimensionality of the stream with the brackets?
                        Sorry for the confusion
                        Nik
                          • Scatter stream base type has to be 128 bit
                            ryta1203
                            Originally posted by: nberger

                            I just found (trying to come up with a simple example) that my problem might also be one of notation; whereas

                            kernel void test(float2 index, out float4 output[])
                            {
                            output[index] = 1.0f;
                            }

                            compiles ok

                            kernel void test(float2 index, out float4 output[][])
                            {
                            output[index] = 1.0f;
                            }

                            does not, and produces the 128 bit error message. Does this imply that I do not have to indicate the dimensionality of the stream with the brackets?

                            Sorry for the confusion

                            Nik

                            Hey nberger, did the one that compiled run fine for you? I have the same thing: it compiled fine but then crashed on execution.
                        • Scatter stream base type has to be 128 bit
                          MicahVillmow
                          Well, the scatter operation targets a 1D surface, so using double brackets ([][]) should be an error. The global buffer that Brook uses to implement scatter can be thought of as a huge 1D array. I'll pass this on to the Brook compiler people so that they can hopefully generate a more meaningful error message.
                          • Scatter stream base type has to be 128 bit
                            MicahVillmow
                            Nik, I just talked to one of the Brook developers. There are no size limits on the 1D arrays, as they are all internally address-translated to 2D streams. The syntax for scatter is 1D and the memory needs to be allocated as a 1D stream, but the internal representation can be different. I've been told we have tested up to 8 million elements in a 1D scatter stream.
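
                            As a rough illustration of what translating a 1D address to a 2D surface can look like, here is a plain C sketch. It assumes a simple row-major mapping with a fixed surface width; the actual translation Brook performs is not described in this thread, and the names coord2, linear_to_2d, and rows_needed are hypothetical.

```c
#include <stddef.h>

/* A 2D coordinate inside the backing surface (illustrative type). */
typedef struct { size_t x, y; } coord2;

/* Map a linear element index onto a row-major 2D surface.
   ASSUMPTION: row-major layout with a caller-chosen width; this is
   not taken from the Brook+ implementation. */
coord2 linear_to_2d(size_t i, size_t width)
{
    coord2 c;
    c.x = i % width;  /* column within the row */
    c.y = i / width;  /* row number */
    return c;
}

/* Number of rows needed to hold n elements at the given width. */
size_t rows_needed(size_t n, size_t width)
{
    return (n + width - 1) / width;  /* round up */
}
```

                            Under these assumptions, with a width of 8192, a stream of 8,000,000 elements would occupy 977 rows, and linear index 8192 would land at column 0 of row 1.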
                            • Scatter stream base type has to be 128 bit
                              MicahVillmow
                              Ryta,
                              What Michael is saying is correct. I was confusing what is possible with what is implemented. The only example we have of masking writes to the global buffer is at the CAL/IL level, not at the Brook level. So, doing what you want would require coding at the IL level or patching the IL code that Brook generates.
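
                              To make the idea of write masking concrete, here is a CPU-side emulation in plain C. It is neither IL nor Brook+ code, only a sketch of the semantics: the address always selects a whole 128-bit element, and a mask decides which of its four components are overwritten. The float4 struct, the MASK_* constants, and masked_store are hypothetical names for illustration.

```c
#include <stddef.h>

/* Stand-in for one 128-bit global-buffer element (illustrative type). */
typedef struct { float x, y, z, w; } float4;

/* One bit per component; set bits select the components to write. */
enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8 };

/* Store value into buf[index], touching only the masked components.
   Components whose mask bit is clear keep their previous contents. */
void masked_store(float4 *buf, size_t index, float4 value, unsigned mask)
{
    if (mask & MASK_X) buf[index].x = value.x;
    if (mask & MASK_Y) buf[index].y = value.y;
    if (mask & MASK_Z) buf[index].z = value.z;
    if (mask & MASK_W) buf[index].w = value.w;
}
```

                              Calling masked_store with MASK_X | MASK_Y corresponds to the g[index].xy = someValue form discussed earlier in the thread: the first two components change and the last two are left alone.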
                                • Scatter stream base type has to be 128 bit
                                  ryta1203
                                  Micah and Michael thanks.

                                  I think I understand. It's not possible and you have to use something like:

                                  c[index] = float4(.., ..., ..., ...);

                                  Unfortunately, this is still crashing. If this syntax, or something like it, is correct, how would you go about writing this stream back out to a 1D array in main(), for example?
                                    • Scatter stream base type has to be 128 bit
                                      michael.chu
                                      Hi ryta1203,

                                      I'll just write the same thing in this post as I did in the other post in case someone misses the other post... :-)

                                      Can you try using an int instead of a float for index?

                                      Michael.
                                        • Scatter stream base type has to be 128 bit
                                          ryta1203
                                          Michael,

                                          Since ints are supported, I'm not sure how to go about doing that. I have tried several different things:

                                          using a constant int for index (like [0]), this doesn't work
                                          using a function parameter int (like func(..,..,..,.., int j) and then [j]), this doesn't work
                                          declaring an int inside the kernel and using that, this doesn't work

                                          Is there an example I can look at? I didn't find one in the ..\samples\test or ..\samples\apps folders that shows how scatter (like this one) works.

                                          I'm sorry to be a pain about this; any help is appreciated. I'm just stuck: since scatter (like this) is supported, I would like to be able to implement at least a simple example (which is all I am trying to do).

                                          EDIT:: If I were working with arrays of the same size, I could use a bunch of "if" statements such as if (indexof(c) == somevalue), if (indexof(c) == someOtherValue), and so on, and then write "c = something + something", and that would work. However, since I am working with arrays of different sizes, I can't, because I want to use an offset for "c", something like c[indexof(d)+H*L*W], where d is a 2D array and c is a 3D array.

                                          EDIT2:: When I try to use an int anywhere in the kernel, brcc compiles with no errors but generates the cpp file only up to the point where the word "int" appears. The output stops right before it, so of course I get compilation errors when I compile my cpp file.
                                    • Scatter stream base type has to be 128 bit
                                      nberger
                                      Sorry, I actually did not try - I found a way to solve my problem without any scatter. I suppose if it crashes on execution with you, it will do the same with me...
                                        • Scatter stream base type has to be 128 bit
                                          marcr
                                          Hi,

                                          There is now a scatter example available at:

                                          ftp://streamcomputing:streamcomputing@ftp-developer.amd.com/samples

                                          This code reflects the current scatter limitations (1D scatter target stream, 128-bit element size).

                                          Simply drop the scatter directory onto your desktop and build.

                                          -- marcr