6 Replies Latest reply on Jun 18, 2008 9:09 PM by udeepta@amd

    Crashes of reduction kernels for certain stream lengths

    nberger
      Hi!
      I have a simple reduction kernel that just sums up the values in a stream. For certain stream lengths the result is as expected; for others the program crashes. The first few non-working stream lengths are
      11, 13, 17, 19, 22, 23, 26, 29, 31, 33...

      Cheers

      Nik
        • Crashes of reduction kernels for certain stream lengths
          udeepta@amd

          Nik,

          There are constraints on the stream size for a reduction kernel -- the stream size may have only 2, 3, 5 and 7 as prime factors. Sizes that are divisible by any prime of 11 or greater will not work in reduction.

          We are improving our documentation -- this will be properly noted in there.

          udeepta

          • Crashes of reduction kernels for certain stream lengths
            nberger
            So if this is supposed to be a "feature", is there any way to pad a stream with zeros without copying it to a new stream?
            Would it be possible to do something like that automatically, as doing a prime factorization every time I create a stream seems a bit painful?
              • Crashes of reduction kernels for certain stream lengths
                udeepta@amd

                Nik,

                Currently, it is the user's responsibility to verify that the stream size satisfies the above constraint when the stream is used in a reduction kernel. One can either test the constraint during stream creation, or copy to a padded stream just before calling the reduction.

                Which padding value to use depends on the user's reduction kernel. For example, for a reduction that calculates the minimum value {b = min(b,a);}, we might want to pad with a very large number instead of zero.
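                The choice of padding value can be sketched on the host side in plain C. This is only an illustration, not part of the Brook+ API: the `reduce_op` enum and the `pad_value` and `pad_tail` helpers are hypothetical names, and the buffer is assumed to be an ordinary host array that is later written to the stream. The idea is to pad with the value that leaves the reduction result unchanged.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical reduction kinds, named here for illustration only. */
typedef enum { REDUCE_SUM, REDUCE_PRODUCT, REDUCE_MIN, REDUCE_MAX } reduce_op;

/* Return the value that leaves the result of `op` unchanged. */
static float pad_value(reduce_op op)
{
    switch (op) {
    case REDUCE_SUM:     return 0.0f;      /* x + 0 == x            */
    case REDUCE_PRODUCT: return 1.0f;      /* x * 1 == x            */
    case REDUCE_MIN:     return FLT_MAX;   /* min(x, FLT_MAX) == x  */
    case REDUCE_MAX:     return -FLT_MAX;  /* max(x, -FLT_MAX) == x */
    }
    return 0.0f; /* not reached for the values listed above */
}

/* Fill buf[used .. padded-1] with the neutral value for `op`.
 * Assumes buf has room for at least `padded` elements. */
static void pad_tail(float *buf, size_t used, size_t padded, reduce_op op)
{
    const float v = pad_value(op);
    for (size_t i = used; i < padded; ++i)
        buf[i] = v;
}
```

                For a sum reduction this pads with zeros, as suggested earlier in the thread; for a minimum reduction it pads with the largest finite float.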

                Creating a small function that checks whether there is any prime factor > 7 shouldn't be difficult -- I will create one when I find some time. :)
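                A minimal sketch of such a check in plain C might look like the following. The function name `is_valid_reduction_size` is made up for this example and is not something shipped with the SDK. It divides out every factor of 2, 3, 5 and 7 and accepts the size only if nothing is left over. Treating a size of 1 as valid is an assumption; the thread does not say how length-1 streams behave.

```c
#include <stdbool.h>

/* Returns true if n has no prime factor greater than 7, i.e. if n can be
 * written as 2^a * 3^b * 5^c * 7^d. A size of 0 is rejected; a size of 1
 * is accepted (an assumption, not confirmed for real reduction kernels). */
static bool is_valid_reduction_size(unsigned int n)
{
    static const unsigned int allowed[] = {2u, 3u, 5u, 7u};

    if (n == 0u)
        return false;

    /* Strip every allowed prime factor from n. */
    for (int i = 0; i < 4; ++i) {
        while (n % allowed[i] == 0u)
            n /= allowed[i];
    }

    /* Anything left over is a product of primes >= 11. */
    return n == 1u;
}
```

                With this check, the lengths reported at the top of the thread (11, 13, 17, 19, 22, 23, 26, 29, 31, 33) are all rejected, while lengths such as 8, 10, 12 or 14 are accepted.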

              • Crashes of reduction kernels for certain stream lengths
                nberger
                Is there an easy way to do the padding without copying the stream to main memory and then copying it element-wise into a new, long enough stream?
                • Crashes of reduction kernels for certain stream lengths
                  Ceq
                  Certainly it would be handy to have a better way than copying the whole vector again...
                  I think some kind of substream selection command would do, for example:

                  float A<8>;
                  float subA(A, 1, 6);

                  subA would 'point' to the same stream but without the first and last elements, which is useful in some kernels where you would otherwise have to write a boundary check that could slow down computation.
                  This way, instead of copying to a new padded stream, you can use a bigger stream with a valid size for reduction, and just use the selection for standard kernels.

                    • Crashes of reduction kernels for certain stream lengths
                      udeepta@amd

                      Exactly. You can use the domain feature -- A.domain(1,7) is a stream from A(1) to A(6).

                      float A<8>;
                      float B<6>;
                      mycopy(A.domain(1,7), B);

                      //kernel void mycopy( float a<>, out float b<> ) { b=a;}

                      You do not need to copy the stream in order to pad it. If you declare a "large enough" stream to begin with, you can work on only the relevant part of it (using domain) until you need the padding -- at which point you pass the whole stream to the kernel.
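                      One way to pick the "large enough" size up front is to round the number of elements you actually need up to the next length whose only prime factors are 2, 3, 5 and 7. The host-side C helper below is a sketch under that assumption; `next_reduction_size` is a hypothetical name, not an SDK function. It uses a simple linear search, which should be adequate because such sizes are fairly dense for typical stream lengths.

```c
/* Returns the smallest size >= n whose only prime factors are 2, 3, 5
 * and 7. A request for 0 elements is treated as 1. Returns 0 if the
 * search wraps around without finding a size that fits in an
 * unsigned int. */
static unsigned int next_reduction_size(unsigned int n)
{
    if (n == 0u)
        n = 1u;

    for (; n != 0u; ++n) {
        unsigned int r = n;

        /* Strip the allowed prime factors from a copy of n. */
        while (r % 2u == 0u) r /= 2u;
        while (r % 3u == 0u) r /= 3u;
        while (r % 5u == 0u) r /= 5u;
        while (r % 7u == 0u) r /= 7u;

        if (r == 1u)
            return n; /* no prime factor >= 11 remains */
    }
    return 0u; /* wrapped around: no valid size found */
}
```

                      For example, 11 data elements would call for a stream of length 12, and 33 elements for a stream of length 35. The ordinary kernels would then run on the domain covering the real data, and only the reduction would see the whole padded stream.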