App runs OK with the CPU backend but fails to run on the CAL backend.
errorLog() returned "Kernel Execution : Error with input streams".
What typical reasons could lead to such a situation?
ADDON:
The kernel in question:
kernel void GPU_fetch_array_kernel(float src[],int src_offset,out float dest<>)
{
dest+=src[src_offset+instance().x];
}
What are your stream dimensions? I have seen with recent drivers that large 1D streams (> 8192 elements) fail and show the same error. Try checking for errors on the streams right after declaring them; it should give a better errorLog.
If you are using 1D streams > 8192, change your Catalyst to 9.2 and see if it works.
Originally posted by: gaurav.garg What are your stream dimensions? I have seen with recent drivers that large 1D streams (> 8192 elements) fail and show the same error. Try checking for errors on the streams right after declaring them; it should give a better errorLog.
If you are using 1D streams > 8192, change your Catalyst to 9.2 and see if it works.
Thank you for the hint.
The stream is 1D and its size is indeed > 8192.
But there are no errors on stream creation, nor on filling the stream from a host memory buffer.
The first error occurs only in the kernel that uses that stream as an input parameter.
//Loading fold buffer data into GPU memory (into stream)
unsigned int stream_size = n_bins;
#if 1
fprintf(stderr, "Requested data stream size %u\n", stream_size);
#endif
brook::Stream<float> gpu_data(1, &stream_size);
if (gpu_data.error())
    fprintf(stderr, "ERROR in gpu_data (declaration): %s\n", gpu_data.errorLog());
gpu_data.read(data);
while (!gpu_data.isSync()) Sleep(0);
if (gpu_data.error())
    fprintf(stderr, "ERROR in gpu_data: %s\n", gpu_data.errorLog());
Output is:
Requested data stream size 65536
And no errors are reported from this fragment.
I will try an older Catalyst driver.
Changed Catalyst 9.5 to 9.2 and this error disappeared!
Thanks again for the good advice.
I hope ATI/AMD will improve its software in the next release, not degrade it...
Originally posted by: Raistmer
kernel void GPU_fetch_array_kernel(float src[],int src_offset,out float dest<>)
{
dest+=src[src_offset+instance().x];
}
AFAIK such read-write access to the output stream is not allowed in Brook. I just tested it, and what actually happens is that
dest = 0.0f + src[src_offset+instance().x];
gets executed. At least that is what the StreamKernelAnalyzer tells me.
Thanks, it seems you are right. It should accumulate the signal but apparently doesn't. I know that the test dataset contains a few signals above threshold, but running on the CAL backend the app finds no signals.
Again, the same app running on the CPU backend found all the signals that the CPU version detected. It seems the CPU backend is far less useful for checking an app than ATI advertises in its manuals... 😕
(BTW, this forum engine is quite buggy. I tried to edit a message and it got reparsed into something I didn't intend to express.)
From the "Stream computing user guide" (they prohibit the copy operation on the PDF document, for what reason???):
"
2.6.1.1 Dynamic Stream Management
Brook, BrookGPU, and the legacy version of Brook+ use a statically allocated stream graph and prohibit streams that are bound for simultaneous read and write. At the C++ API level, there are no such restrictions ...
"
Now the error from the kernel:
Kernel Execution : Input stream is same as output stream.
Binding kernels read-write is prohibited.
What the hell??
Well, just in case you didn't know, you can rewrite it as follows:
kernel void GPU_fetch_array_kernel(float src[], int src_offset, float destI<>, out float dest<>)
{
    dest = destI + src[src_offset + instance().x];
}
And call it with the same parameter for dest and destI:
GPU_fetch_array_kernel(src, offset, dest, dest);
Note that while doing this you can't perform gather/scatter operations on "dest", only streaming, as it would result in race conditions and undefined behaviour. If you get a runtime error about using the same parameter as input and output in the kernel, set the environment variable BRT_PERMIT_READ_WRITE_ALIASING = 1.
Originally posted by: Ceq
If you get a runtime error about using the same parameter as input and output in the kernel, set the environment variable BRT_PERMIT_READ_WRITE_ALIASING = 1.
The problem is that you normally have no control over environment variables on the system the app is running on, at least if you intend to distribute it to a lot of people, as Raistmer wants to do (think of applications for Distributed Computing projects like SETI). Okay, you could deliver a setup script that sets the variable, but I would prefer another solution.
If you don't like using a startup script, you can change it inside the program: the putenv function sets environment variables in a running process. Example:
int main(int argc, char *argv[]) {
    putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");
    ...
Originally posted by: Ceq If you don't like using a startup script, you can change it inside the program: the putenv function sets environment variables in a running process. Example:
int main(int argc, char *argv[]) { putenv("BRT_PERMIT_READ_WRITE_ALIASING=1"); ...
Thanks for the hint; I'll keep it in mind, maybe it will be useful too.
LoL
Yes, it's exactly that case.
I already came to creating an additional accumulator stream too, thanks.
Originally posted by: Raistmer LoL
Yes, it's exactly that case.
I know. Btw., I've chosen Milkyway@home, as this much smaller project fits my limited time resources better.
You should be glad SETI works only with float values. Using doubles for MW forced me to basically write the kernels in IL assembly; I used Brook only for prototyping. I experienced some quite severe bugs in the SDK which made the "repair" at the IL level necessary. And I was amazed to see that some of them (like a mixed-up ordering of arguments in the constant cache of the GPU when using gather arrays) only apply when you are working with doubles.
Yes, doubles are used only in a few places; most of the processing is in float.
BTW,
putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");
didn't work, unfortunately (that is, the CAL error remains). Setting the env variable at the system level works, though.
Originally posted by: Raistmer
putenv("BRT_PERMIT_READ_WRITE_ALIASING=1");
didn't work, unfortunately (that is, the CAL error remains). Setting the env variable at the system level works, though.
I guess the brook runtime is initialized (and reads the environment variable) at startup of the program, so it is too late to change it within the program.
Seems so.
The runtime exists as brook.dll, so it is loaded before main() is called.
Its initialization functions are perhaps called before that too.
That is strange; as far as I know, the Brook+ runtime reads that variable the first time you define a stream. Maybe it is system dependent. I'm using WinXP x64, MSVC 2005, Brook+ 1.4 and Catalyst 9.5.
Try the following code:
File "ker.br"
kernel void inc(float in1<>, out float out1<>) { out1 = in1 + 1.0f; }
File "main.cpp"
#include <cstdio>
#include <cstdlib>
#include "brook/Stream.h"
#include "built/ker.h"

using namespace std;
using namespace brook;

int main(int argc, char** argv) {
    unsigned int i, SIZE = 1 << 4;
    // Memory arrays
    float* v = (float*)malloc(SIZE * sizeof(float));
    // Set environment variable
    putenv("BRT_PERMIT_READ_WRITE_ALIASING=1"); // *********
    // Init
    for (i = 0; i < SIZE; ++i)
        v[i] = (float)i;
    {
        // Stream arrays
        Stream<float> s(1, &SIZE);
        // Load
        s.read(v);
        // Kernel
        inc(s, s);
        // Save
        s.write(v);
    }
    // Print
    for (i = 0; i < 8; i++)
        printf("v[%u] = (%7.3f);\n", i, v[i]);
}
I had this error too after installing the latest Catalyst driver 9.6 (I skipped 9.5).
It took me some time and tests to realize that the size of 1D streams is now limited to 8192. Go figure... The error message does not help either to understand what's going on exactly.
Just so you know, you can use Catalyst 9.4; the size limitation appeared in version 9.5.
Originally posted by: youplaboom
Just so you know, you can use Catalyst 9.4; the size limitation appeared in version 9.5.
Originally posted by: Raistmer Thanks, it seems you are right. It should accumulate signal but seems doesn't. I know that test dataset contains few signals above threshold but running on CAL backend app founds no signals.
My standard solution to this problem is to use two streams (one input and one for the accumulated output) and switch them between consecutive kernel calls.
Originally posted by: Raistmer
(BTW, this forum engine is quite buggy. I tried to edit a message and it got reparsed into something I didn't intend to express.)
You aren't telling me anything new