20 Replies Latest reply on Jun 16, 2009 7:32 AM by Raistmer

    Kernel Execution : Error with input streams

    Raistmer
      Is it possible to get more information about this error?

      The app runs OK with the CPU backend but fails on the CAL backend.

      errorLog() returned "Kernel Execution : Error with input streams".

      What are the typical causes of this?

       ADDON:

      The kernel in question:

      kernel void GPU_fetch_array_kernel(float src[],int src_offset,out float dest<>)
      {
      dest+=src[src_offset+instance().x];
      }

       

       

        • Kernel Execution : Error with input streams
          gaurav.garg

          What are your stream dimensions? I have seen with recent drivers that large 1D streams (> 8192 elements) fail and show the same error. Try checking for errors on the streams right after declaring them; that should give a better errorLog.

          If you are using 1D streams > 8192, change your Catalyst to 9.2 and see if it works.

            • Kernel Execution : Error with input streams
              Raistmer

               

              Originally posted by: gaurav.garg What are your stream dimensions? I have seen with recent drivers that large 1D streams (> 8192 elements) fail and show the same error. Try checking for errors on the streams right after declaring them; that should give a better errorLog.

              If you are using 1D streams > 8192, change your Catalyst to 9.2 and see if it works.

              Thank you for the hint.

              The stream is indeed 1D and its size is > 8192.

              But there are no errors on stream creation or when filling the stream from the host memory buffer.

              The first error occurs only in the kernel that uses that stream as an input parameter.

              // Load the fold buffer data into GPU memory (into a stream)
              unsigned int stream_size = n_bins;
              #if 1
              fprintf(stderr, "Requested data stream size %u\n", stream_size);
              #endif
              brook::Stream<float> gpu_data(1, &stream_size);
              // Check for errors right after the stream is declared
              if (gpu_data.error())
                  fprintf(stderr, "ERROR in gpu_data (declaration): %s\n", gpu_data.errorLog());
              gpu_data.read(data);
              // Wait until the stream reports that it is synchronized
              while (!gpu_data.isSync()) Sleep(0);
              if (gpu_data.error())
                  fprintf(stderr, "ERROR in gpu_data: %s\n", gpu_data.errorLog());

              Output is:
              Requested data stream size 65536
              No errors are reported from this fragment.

              I will try older Catalyst drivers.

               

            • Kernel Execution : Error with input streams
              Gipsel

               

              Originally posted by: Raistmer

              kernel void GPU_fetch_array_kernel(float src[],int src_offset,out float dest<>)
              {
                 dest+=src[src_offset+instance().x];
              }



              AFAIK such a read-write access to the output stream is not allowed in Brook. I just tested it, and what actually happens is that

              dest = 0.0f + src[src_offset+instance().x];

              gets executed. At least that is what the StreamKernelAnalyzer tells me.

                • Kernel Execution : Error with input streams
                  Raistmer

                  Thanks, it seems you are right. It should accumulate the signal, but it seems it doesn't. I know that the test dataset contains a few signals above the threshold, but when running on the CAL backend the app finds no signals.

                  Again, the same app running on the CPU backend found all the signals that the CPU version detected. It seems the CPU backend is far less useful for checking an app than ATI advertises in its manuals...   :/

                  (BTW, this forum engine is seriously buggy. I tried to edit a message and it got reparsed into something I didn't intend to say.)

                    • Kernel Execution : Error with input streams
                      Raistmer

                      From the "Stream Computing User Guide" (they prohibit copying text from the PDF document; for what reason???):

                      "

                      2.6.1.1 Dynamic Stream Management

                      Brook, BrookGPU, and the legacy version of Brook+ use a statically allocated stream graph and prohibit streams that are bound for simultaneous read and write. At the C++ API level, there are no such restrictions ...

                      "

                      Now the error from the kernel:

                      Kernel Execution : Input stream is same as output stream.
                      Binding kernels read-write is prohibited.

                      What the hell ??

                       

                        • Kernel Execution : Error with input streams
                          Ceq

                          Well, just in case you didn't know, you can rewrite it as follows:

                          kernel void GPU_fetch_array_kernel(float src[], int src_offset, float destI<>, out float dest< > ) {
                          dest = destI + src[ src_offset+instance().x ];
                          }

                          And call it with the same parameter for dest and destI:

                          GPU_fetch_array_kernel(src, offset, dest, dest);

                           

                          Note that while doing this you can't perform gather/scatter operations on "dest", only streaming, as it would result in race conditions and undefined behaviour. If you get a runtime error about using the same parameter as input and output in the kernel, set the environment variable BRT_PERMIT_READ_WRITE_ALIASING = 1.
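The streaming-versus-gather distinction above can be illustrated on the CPU. This is only a minimal sketch in plain C++, not Brook+ code: the function names are hypothetical, and the sequential loop stands in for whatever execution order the GPU happens to pick. The streaming update touches only its own element, so aliasing input and output is harmless; the gather update reads a neighbour, so running it in place gives a result that depends on the order of execution.

```cpp
#include <cstddef>
#include <vector>

// Streaming update: element i reads and writes only index i, so running it
// with input and output aliased gives the same result in any order.
// Requires src.size() >= src_offset + dest.size().
std::vector<float> accumulate_in_place(std::vector<float> dest,
                                       const std::vector<float>& src,
                                       std::size_t src_offset) {
    for (std::size_t i = 0; i < dest.size(); ++i)
        dest[i] = dest[i] + src[src_offset + i];
    return dest;
}

// Gather update with separate input and output: element i adds its left
// neighbour (element 0 adds itself). All reads see the original values.
std::vector<float> gather_separate(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = in[i] + in[i == 0 ? 0 : i - 1];
    return out;
}

// The same gather with input and output aliased. In ascending order every
// read of the left neighbour sees an already-updated value, so the result
// differs from gather_separate and would change with another order.
std::vector<float> gather_aliased(std::vector<float> buf) {
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = buf[i] + buf[i == 0 ? 0 : i - 1];
    return buf;
}
```

For the input {1, 2, 3, 4}, gather_separate returns {2, 3, 5, 7}, while gather_aliased returns {2, 4, 7, 11} with this particular loop order.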

                            • Kernel Execution : Error with input streams
                              Gipsel

                               

                              Originally posted by: Ceq

                              If you get a runtime error about using the same parameter as input and output in the kernel, set the environment variable BRT_PERMIT_READ_WRITE_ALIASING = 1.



                              The problem is that you normally have no control over the environment variables on the system the app is running on, at least if you intend to distribute it to a lot of people, as Raistmer wants to do (think of applications for Distributed Computing projects like SETI). Okay, you could deliver a setup script that sets the variable, but I would prefer another solution.

                                • Kernel Execution : Error with input streams
                                  Ceq

                                  If you don't like using a startup script, you can set it inside the program with the putenv function, which sets environment variables in a running program. Example:

                                  int main(int argc, char *argv[]) {
                                  putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");
                                  ...

                                    • Kernel Execution : Error with input streams
                                      Raistmer

                                       

                                      Originally posted by: Ceq If you don't like using a startup script, you can set it inside the program with the putenv function, which sets environment variables in a running program. Example:

                                      int main(int argc, char *argv[]) { putenv("BRT_PERMIT_READ_WRITE_ALIASING=1"); ...

                                      Thanks for the hint, I will keep it in mind; it may be useful too.

                                    • Kernel Execution : Error with input streams
                                      Raistmer

                                      LoL

                                      Yes, that's exactly the case.

                                      I had already arrived at creating an additional accumulator stream too, thanks.

                                       

                                        • Kernel Execution : Error with input streams
                                          Gipsel

                                           

                                          Originally posted by: Raistmer LoL

                                          Yes, that's exactly the case.



                                          I know. Btw., I've chosen Milkyway@home because this much smaller project is a better fit for my limited time.

                                          You should be glad SETI works only with float values. Using doubles for MW forced me to basically write the kernels in IL assembly. I used Brook only for prototyping. I ran into some quite severe SDK bugs that made the "repair" at the IL level necessary. But I was amazed to see that some of them (like a mixed-up ordering of arguments in the constant cache of the GPU when using gather arrays) only apply if you are working with doubles.

                                            • Kernel Execution : Error with input streams
                                              Raistmer

                                              Yes, doubles are used in only a few places; most of the processing is done in float.

                                              BTW,

                                              putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");

                                              unfortunately didn't work (that is, the CAL error remains). Setting the environment variable at the system level works, though.

                                               

                                                • Kernel Execution : Error with input streams
                                                  Gipsel

                                                  Originally posted by: Raistmer

                                                  putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");

                                                  unfortunately didn't work (that is, the CAL error remains). Setting the environment variable at the system level works, though.



                                                  I guess the brook runtime is initialized (and reads the environment variable) at startup of the program, so it is too late to change it within the program.

                                                    • Kernel Execution : Error with input streams
                                                      Raistmer

                                                      Seems so.

                                                      The runtime lives in brook.dll, so it is loaded before main() is called.

                                                      Perhaps its initialization functions are called before main() too.
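The timing problem guessed at above can be sketched in plain C++. This is a hypothetical illustration, not how the Brook runtime is actually implemented: CachedFlag stands in for any component that reads an environment variable once when it is initialized, set_env is a small helper of my own, and DEMO_PERMIT_ALIASING in the usage note is a made-up variable name. If the read happens before the program changes the variable, the change is never seen.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a runtime that reads its configuration exactly
// once, at construction time, and never looks at the environment again.
class CachedFlag {
public:
    explicit CachedFlag(const char* name) : enabled_(read(name)) {}

    // Value captured when the object was constructed.
    bool enabled() const { return enabled_; }

    // Fresh lookup: true only if the variable is currently set to "1".
    static bool read(const char* name) {
        const char* value = std::getenv(name);
        return value != nullptr && std::strcmp(value, "1") == 0;
    }

private:
    bool enabled_;
};

// Small portability helper (an assumption of this sketch): set a variable
// in the current process, overwriting any existing value.
inline void set_env(const char* name, const char* value) {
#ifdef _WIN32
    _putenv_s(name, value);
#else
    setenv(name, value, 1);
#endif
}
```

Constructing CachedFlag early("DEMO_PERMIT_ALIASING") and only then calling set_env("DEMO_PERMIT_ALIASING", "1") leaves early.enabled() false, while a CachedFlag constructed after the call sees the new value. Setting the variable at system level works in either case because it is in place before the process starts.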

                                                        • Kernel Execution : Error with input streams
                                                          Ceq

                                                          That is strange; as far as I know, the Brook+ runtime reads that variable the first time you define a stream. Maybe it is system dependent. I'm using WinXP x64, MSVC 2005, Brook+ 1.4 and Catalyst 9.5.

                                                          Try the following code:

                                                          File "ker.br"

                                                          kernel void inc(float in1< >, out float out1< > ) { out1 = in1 + 1.0f; }

                                                           

                                                          File "main.cpp"

                                                          #include <cstdio>
                                                          #include <cstdlib>
                                                          #include "brook/Stream.h"
                                                          #include "built/ker.h"

                                                          using namespace std;
                                                          using namespace brook;

                                                          int main(int argc, char** argv) {

                                                              unsigned int i, SIZE = 1 << 4;

                                                              // Memory arrays
                                                              float* v = (float*)malloc(SIZE * sizeof(float));

                                                              // Set environment variable
                                                              putenv("BRT_PERMIT_READ_WRITE_ALIASING=1"); // *********

                                                              // Init
                                                              for(i = 0; i < SIZE; ++i)
                                                                  v[i] = (float)i;

                                                              {
                                                                  // Stream arrays
                                                                  Stream<float > s(1, &SIZE);
                                                                  
                                                                  // Load
                                                                  s.read(v);

                                                                  // Kernel
                                                                  inc(s, s);

                                                                  // Save
                                                                  s.write(v);
                                                              }

                                                              // Print the first 8 elements
                                                              for(i = 0; i < 8; i++)
                                                                  printf("v[%u] = (%7.3f);\n", i, v[i]);

                                                              // Release the host buffer
                                                              free(v);
                                                              return 0;
                                                          }

                                                            • Kernel Execution : Error with input streams
                                                              Raistmer
                                                              Thanks, will try:

                                                              1)
                                                              .\built\ker.cpp(11) : error C2005: #line expected a line number, found '-'
                                                              I have already encountered this error.
                                                              It appears when the code in the .br file starts on the first line.
                                                              Perhaps at ATI counting starts from -1, not from zero as in the rest of the world ;)
                                                              (or the big-corporation guys can't imagine a source file without copyright/copyleft comments in the first few dozen lines...)

                                                              Fixed by adding one blank line at the beginning of the .br file.

                                                              2) App output:

                                                              Kernel Execution Error: Input stream is same as output stream.
                                                              Binding kernels read-write is prohibited.
                                                              Environment variable BRT_PERMIT_READ_WRITE_ALIASING can be used to allow input-output aliasing.
                                                              But the results can be unpredictable.
                                                              v[0] = ( 0.000);
                                                              v[1] = ( 1.000);
                                                              v[2] = ( 2.000);
                                                              v[3] = ( 3.000);
                                                              v[4] = ( 4.000);
                                                              v[5] = ( 5.000);
                                                              v[6] = ( 6.000);
                                                              v[7] = ( 7.000);

                                                              OS is Vista x86 SP1, compiler - VC2005 SP1
                                                              Driver: Catalyst 9.2 (sorry, can't use 9.5 - it can't handle arrays of size I need).
                                                              SDK&RT: Brook 1.4 beta
                                                                • Kernel Execution : Error with input streams
                                                                  youplaboom

                                                                  I had this error too after installing the latest Catalyst driver, 9.6 (I skipped 9.5).

                                                                  It took me some time and testing to realize that the size of 1D streams was now limited to 8192. Go figure... The error message does not help in understanding what exactly is going on, either.

                                                                  Just so you know, you can use Catalyst 9.4; the size limitation appeared in version 9.5.

                                              • Kernel Execution : Error with input streams
                                                Gipsel

                                                 

                                                Originally posted by: Raistmer Thanks, it seems you are right. It should accumulate the signal, but it seems it doesn't. I know that the test dataset contains a few signals above the threshold, but when running on the CAL backend the app finds no signals.


                                                My standard solution to this problem is to use two streams (one input and one for the accumulated output) and switch them between consecutive kernel calls.
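The two-stream approach can be sketched on the CPU as follows. This is a minimal illustration in plain C++ rather than Brook+ code: accumulate_step is a hypothetical stand-in for the accumulating kernel, fold is a made-up driver name, and std::vector plays the role of a stream. The point is only the buffer swap between consecutive calls, so that no call ever reads and writes the same buffer.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// CPU stand-in for the accumulating kernel: acc_out = acc_in + src[offset + i].
// acc_in and acc_out must be different buffers of the same size.
void accumulate_step(const std::vector<float>& src, std::size_t src_offset,
                     const std::vector<float>& acc_in,
                     std::vector<float>& acc_out) {
    for (std::size_t i = 0; i < acc_out.size(); ++i)
        acc_out[i] = acc_in[i] + src[src_offset + i];
}

// Sums `passes` consecutive windows of `width` elements from src, switching
// the two accumulator buffers between calls.
// Requires src.size() >= width * passes.
std::vector<float> fold(const std::vector<float>& src, std::size_t width,
                        std::size_t passes) {
    std::vector<float> a(width, 0.0f), b(width, 0.0f);
    std::vector<float>* in = &a;
    std::vector<float>* out = &b;
    for (std::size_t p = 0; p < passes; ++p) {
        accumulate_step(src, p * width, *in, *out);
        std::swap(in, out);  // the latest output becomes the next input
    }
    return *in;  // after the final swap, `in` holds the latest result
}
```

With src = {1, 2, 3, 4, 5, 6}, width 2 and 3 passes, fold returns {9, 12}. Swapping pointers rather than copying data keeps the cost of the switch independent of the buffer size.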

                                                 

                                                Originally posted by: Raistmer

                                                (BTW, this forum engine is seriously buggy. I tried to edit a message and it got reparsed into something I didn't intend to say.)



                                                You aren't telling me anything new