Sternenprinz
Journeyman III

short2 in Stream 1.4 Beta supported? Bug?

Hi,

While taking my first steps with Stream I ran into the following problem:

Changing the copy kernel from the samples from 'float' to 'short' works, but changing it to 'short2' crashes every time: the program hangs and GPU-Recover tries to bring the GPU back up. After 3-4 crashes of this kind, all GPU acceleration, including 2D desktop acceleration, is gone until the system is rebooted.

Working:

kernel void copy(short i<>, out short o<>)
{
    o = i;
}

Not working:

kernel void copy(short2 i<>, out short2 o<>)
{
    o = i;
}

Any ideas?

 

System Details:

Driver Packaging Version    8.612-090428a-080257C-ATI (current Catalyst 9.5)

Graphics Chipset    ATI Radeon HD 4650

Stream Distribution: 1.4 Beta (acquired 6.15 -> yesterday)

OS: WinXP SP3 32bit

 

0 Likes
9 Replies
gaurav_garg
Adept I

I did the same change in the sample under \samples\CPP\tutorials\SimpleKernel and it works fine for me. I have driver 8.62 (Catalyst 9.6) on my system.

0 Likes

Hi Sternenprinz.

short2 works fine for me too. Maybe something is wrong with the memory allocation and the stack gets corrupted, or something like that. Just to make sure, test the following code:

 

File "test.br":

#include <stdio.h>
#include <stdlib.h>

kernel void
copy(short2 in1<>, out short2 out1<>) {
    out1 = in1;
}


int main(int argc, char* argv[]) {
    const int SIZE = 1 << 8;
    int i;  // signed, to match SIZE and the %i format below

    // Memory arrays
    short vIn1[SIZE];
    short vOut[SIZE];

    // Init
    for(i = 0; i < SIZE; ++i) {
        vIn1[i] = (short) i;
        vOut[i] = 0;
    }

    {
        // Each short2 element holds two shorts, so the stream is half as long
        int halfSIZE = SIZE / 2;
        // Stream arrays
        short2 sIn1<halfSIZE>;
        short2 sOut<halfSIZE>;
        // Load
        streamRead(sIn1, vIn1);
        // Kernel
        copy(sIn1, sOut);
        // Save
        streamWrite(sOut, vOut);
    }

    // Check
    for(i = 0; i < SIZE; i++)
        if(vIn1[i] != vOut[i]) break;
    if(i == SIZE) printf("Test OK\n");
    else printf("Error : vIn1[%i] = %i, vOut[%i] = %i\n",
        i, vIn1[i], i, vOut[i]);

    return 0;
}

 

// WinXP 64, MSVC 2005, Radeon 4850, Brook+ 1.4, Catalyst 9.6

0 Likes

Hi,

Thanks a lot for your answers. I guess my problem lies elsewhere, but it is definitely tied to the difference between 'short' and 'short2'.

In general I want to work on a lot of data (e.g. 100 MB), so my first attempt was a trivial loop that iterates over constant-sized blocks. In each iteration I do:

inputStream.read(ibuf);

copy(inputStream,outputStream);

outputStream.write(obuf);

But after a few hundred iterations it crashes, regardless of the stream size.

Here is a fairly compact program that shows this behaviour:

main.cpp:

#include "brook/Device.h"
#include "brook/Stream.h"
#include "brookgenfiles/copy.h"

int main(int,char**)
{

    int iterations = 1000; // crashes here - 500 works fine
    int size       = 256;     // does not matter - same with 8192
    short* buf  = new short[size]; // content does not matter

    unsigned int streamSize[] = { size/2 };
    brook::Stream< short2 > inputStream (1, streamSize);
    brook::Stream< short2 > outputStream(1, streamSize);
    for (int i=0;i<iterations;i++)
    {
        inputStream.read(buf);
        copy(inputStream,outputStream);
        outputStream.write(buf);
    }

    delete[] buf; // release the host buffer
    return 0;
}

copy.br:

kernel void copy(short2 i<>, out short2 o<>)
{
    o = i;
}

If I change the type from 'short2' to 'short' and streamSize from 'size/2' to 'size', everything works fine, even for several thousand iterations.
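For comparison, here is a host-only sketch of the block-wise loop described above, which can help to rule out mistakes in the offset and size arithmetic before blaming the driver. It does not use Brook+ at all: Short2, copyBlock and processInBlocks are hypothetical names made up for this illustration, and the two memcpy calls only stand in for inputStream.read and outputStream.write. It assumes the total number of shorts and the block size are both even.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for Brook+'s short2: two packed 16-bit values.
struct Short2 { short x, y; };

// Stand-in for the copy kernel: out[i] = in[i] for every element.
static void copyBlock(const Short2* in, Short2* out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) out[i] = in[i];
}

// Copies `total` shorts from src to dst in blocks of at most `blockShorts`
// shorts. Assumes `total` and `blockShorts` are even. Returns the block count.
static std::size_t processInBlocks(const short* src, short* dst,
                                   std::size_t total, std::size_t blockShorts) {
    // Staging buffers play the role of the input and output streams.
    std::vector<Short2> in(blockShorts / 2), out(blockShorts / 2);
    std::size_t blocks = 0;
    for (std::size_t offset = 0; offset < total; offset += blockShorts) {
        // The last block may be shorter than the others.
        std::size_t n = std::min(blockShorts, total - offset);
        std::size_t pairs = n / 2;  // one Short2 holds two shorts
        std::memcpy(in.data(), src + offset, pairs * sizeof(Short2));   // ~ inputStream.read
        copyBlock(in.data(), out.data(), pairs);                        // ~ copy(...)
        std::memcpy(dst + offset, out.data(), pairs * sizeof(Short2));  // ~ outputStream.write
        ++blocks;
    }
    return blocks;
}
```

If this host-only version handles the full data set correctly, the indexing is probably not the culprit, and the stream transfers or the driver become the more likely suspects.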

// MSVC 9.0 SP1 - Catalyst 9.6 - Stream Dist. 1.4

0 Likes

Have you disabled VPU recover?

0 Likes

No! And without VPU-Recover it works, but only sometimes.

0 Likes

Gaurav, running it only once works fine for me too, but if I run it several times my system ends up crashing.

I did another test: I removed the copy kernel call, raised the number of iterations to something like 10000, and ran it several times. The system still ends up crashing, just from copying data. With VPU Recover enabled the GPU is recovered, even though there was no kernel call at all.

Looks like it could be related to CPU-GPU memory copy routines.

0 Likes

Exactly the same here!

It worked a few times for me, so I edited the topic. But further testing showed the same behaviour that Ceq describes: it works 2-3 times, then either

- the system crashes with a BSOD (without VPU Recover; WinXP reports that the graphics driver stopped responding because of a live-lock)

or

- the app hangs and the GPU gets recovered (with VPU Recover). After a few rounds of this, all graphics acceleration, including 2D, is gone (as mentioned earlier).

But only for 'short2', not for 'short'.

0 Likes

Yes, somehow "short" works correctly. However, "short2" isn't the only affected type: "float" and "float2" appear to have the same issue. I didn't test more types because after the third VPU Recover you lose hardware acceleration.

0 Likes

It looks as if this is a bug in the Catalyst 9.5/9.6 driver...

In the end even the 'PCIe Speedtest' program crashed here, so I went back to Catalyst 9.4, which works fine.

0 Likes