4 Replies Latest reply on Jun 19, 2009 4:54 PM by Sternenprinz

    [solved] Brook+ 1.4 - Address virtualization? Reduce Streams?

    Sternenprinz

      Hi,

      I ran into new problems while experimenting with Stream:

      1)

      If I create a kernel that just writes data (e.g. zeros) to a 1D stream, and the stream size exceeds the usual 8192, the output data is sometimes corrupted. Usually a single value at a varying position is wrong (it looks like a bus error, often with just one bit flipped); sometimes all values from index 8192 onward are corrupted or simply not transferred.

      On the other hand, if I create a 2D stream of the same total size (e.g. 8192x8 instead of 65536), everything works fine.

      Any ideas what my noobishness could have done wrong? (I can't provide a meaningful demo, because the demo surprisingly works*.)

       

      * Exactly(!) the same code as the buggy one; the problem persists even if both versions (working and non-working) are in the same build. Strange...
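The 2D workaround from problem 1 can be sketched on the host side as follows. This is only an illustration: `Dims2D`, `splitInto2D`, and `linearTo2D` are hypothetical helpers, not part of Brook+, and the 8192 row limit is taken from the post. Which entry of the Brook+ dimension array holds the width is not shown here and would need checking against the SDK.

```cpp
// Hypothetical helper type (not part of Brook+): dimensions of a 2D layout.
struct Dims2D {
    unsigned int width;
    unsigned int height;
};

// Splits `total` elements into rows of at most `maxWidth` elements,
// rounding the row count up so that every element has a slot.
inline Dims2D splitInto2D(unsigned int total, unsigned int maxWidth) {
    Dims2D d;
    d.width  = total < maxWidth ? total : maxWidth;
    d.height = d.width ? (total + d.width - 1) / d.width : 0;
    return d;
}

// Maps a linear element index to (x, y) within a layout of the given width.
inline void linearTo2D(unsigned int index, unsigned int width,
                       unsigned int& x, unsigned int& y) {
    x = index % width;
    y = index / width;
}
```

For 65536 elements and a row limit of 8192 this yields the 8192x8 layout mentioned above. If the total is not a multiple of the width, the last row contains unused padding slots that the kernel or the host code has to ignore.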

      2) Reduce streams...

      The following demo produces several different results, depending on 1D/2D stream, single value/output stream, size of the input stream, input data (0s or 1s), whether the count depends on the input data or simply counts iterations, and so on...

      main.cpp:

      #include "brook/Device.h"
      #include "brook/Stream.h"
      #include <iostream>
      #include "brookgenfiles/count.h"

      int main(int,char**)
      {
          #define    USE_REDUCESTREAM 0
          typedef unsigned int TYPE;
          static const unsigned int AMD_TEXTURE_SIZE    = 8192*16;
          static const unsigned int BUFFERSIZE        = 65536;
          static const unsigned int D1                = (BUFFERSIZE % AMD_TEXTURE_SIZE) ? (BUFFERSIZE % AMD_TEXTURE_SIZE) : AMD_TEXTURE_SIZE;
          static const unsigned int D2                = (BUFFERSIZE / AMD_TEXTURE_SIZE);
          static const unsigned int D                    = D2 ? 2 : 1;
         
          unsigned int streamSize[] = { D1 , D2 };
          TYPE zeroes[BUFFERSIZE];
          brook::Stream< TYPE >* zeroStream = new brook::Stream< TYPE >(D,streamSize);
          streamSize[0] = 1;
          streamSize[1] = 1;
          brook::Stream< TYPE >* reduceStream    = new brook::Stream< TYPE >(D,streamSize);

          for (unsigned int i = 0; i < BUFFERSIZE; i++)
              zeroes[i] = (TYPE)1;

              TYPE symbolCount = (TYPE)-1;
              zeroStream->read(zeroes);
      #if (USE_REDUCESTREAM)
                  countNonNull    (*zeroStream,*reduceStream);
                  reduceStream->write(&symbolCount);
      #else
                  countNonNull    (*zeroStream,symbolCount);
      #endif
              if (symbolCount == (TYPE)-1)
                  std::cout << "not done" << std::endl; else
                  std::cout << "result: "    << symbolCount;

          delete zeroStream;
          delete reduceStream;
          return 0;
      }

      count.br:

      reduce void countNonNull(uint data<>,reduce uint nonNullCount<>)
      {
          if (data > (uint) 0)
              nonNullCount += (uint)1;
      }

      // MSVC 9.0 SP1 - Stream 1.4 Beta - Catalyst 9.6 - HD4650 - 790GX Chipset
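Because the demo reports different results from run to run, a host-side reference count gives a fixed value to compare against. The sketch below is an assumption about how one might check the GPU result; `countNonNullReference` is a hypothetical helper and uses no Brook+ calls.

```cpp
#include <cstddef>

typedef unsigned int TYPE;

// Host-side reference: counts the non-zero elements of `data` sequentially.
inline TYPE countNonNullReference(const TYPE* data, std::size_t size) {
    TYPE count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] > 0)
            ++count;
    }
    return count;
}
```

In the demo above, the value returned for `zeroes` and `BUFFERSIZE` could be compared with `symbolCount` after the kernel call.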

        • Brook+ 1.4 - Address virtualization? Reduce Streams?
          gaurav.garg

          Catalyst 9.5 and later have a regression with 1D streams larger than 8192 elements. If you check the stream for errors after the kernel call, Brook+ should report one.

          The reduction kernel you posted will not work properly because its result depends on the input data order. Look at the last paragraph of section A.4.1.2 of the Stream Computing User Guide.

            • Brook+ 1.4 - Address virtualization? Reduce Streams?
              Sternenprinz

              Hi gaurav,

              I am sorry, but I can't see why the kernel (2nd problem) depends on the input data order*. Furthermore, the problem persists if the kernel only counts its invocations (i.e. reduce_var += 1). Maybe both problems are connected to the regression you mentioned. I will check that.

              Thanks for your help!

              * Since it uses an addition, it has no explicit dependency on invocation order.

                • Brook+ 1.4 - Address virtualization? Reduce Streams?
                  gaurav.garg

                  Did you read the section I mentioned? Are your input data values only 0 or 1?

                  If yes, the reduction kernel is similar to:

                  reduce void countNonNull(uint data<>,reduce uint nonNullCount<> )
                  {
                          nonNullCount += data;
                  }

                  and it should work fine. But if the values are not only 0 or 1, the reduction kernel will not work because of the constraints given in the section mentioned above.
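One way to see why the two kernels can differ is to model the reduction on the CPU. The sketch below rests on an assumption that the thread does not confirm: that the runtime reduces in several passes and feeds the partial results of one pass back into the same kernel body as `data`. `multiPassReduce`, `countStep`, `sumStep`, and the group size `fan` are hypothetical names for this illustration only.

```cpp
#include <cstddef>
#include <vector>

typedef unsigned int Value;

// Modelled on the original kernel: adds 1 whenever the input is non-zero.
inline Value countStep(Value acc, Value data) { return data > 0 ? acc + 1 : acc; }

// Modelled on the kernel from the reply: plain addition.
inline Value sumStep(Value acc, Value data) { return acc + data; }

// Assumed multi-pass scheme: each pass folds groups of `fan` values into one
// partial result, and the next pass treats those partial results as input.
template <typename Step>
Value multiPassReduce(std::vector<Value> values, std::size_t fan, Step step) {
    if (fan < 2)
        fan = 2;  // a group size below 2 would never shrink the input
    while (values.size() > 1) {
        std::vector<Value> next;
        for (std::size_t i = 0; i < values.size(); i += fan) {
            Value acc = 0;
            for (std::size_t j = i; j < i + fan && j < values.size(); ++j)
                acc = step(acc, values[j]);
            next.push_back(acc);
        }
        values.swap(next);
    }
    return values.empty() ? 0 : values[0];
}
```

Under this model, sixteen ones reduced with `countStep` give 16 in a single pass (`fan` = 16) but only 4 with `fan` = 4, because the second pass counts the four non-zero partial results instead of adding them. `sumStep` gives 16 for either group size, which matches the point that a plain addition over 0/1 data is safe.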

                    • Brook+ 1.4 - Address virtualization? Reduce Streams?
                      Sternenprinz

                      Hi,

                      I see the problem now.

                      Section 4.1.2 says:

                      "The requirement that the operation be associative and commutative means that the result is independent of evaluation order, modulo, any issues due to limited floating point precision."

                      What it does not say is that even the constant '1' is implicitly the result of a function f(x) = 1 that takes the current stream element as its argument, which is forbidden in reduce kernels (see the end of 4.1.2).

                      So much for 'not so well documented' :-/

                      BTW: Problem solved, it was the regression thing...