Hi,
During my first steps with Stream I ran into the following problem:
Modifying the 'Copy-Kernel' from the samples from 'float' to 'short' works, but changing it to 'short2' crashes every time (the program hangs and GPU-Recover tries to bring the GPU back up). After 3-4 such crashes, all GPU acceleration, including 2D desktop acceleration, is gone until the system is rebooted.
Working:
kernel void copy(short i<>, out short o<>)
{
    o = i;
}
Not-Working:
kernel void copy(short2 i<>, out short2 o<>)
{
    o = i;
}
Any ideas?
System Details:
Driver Packaging Version 8.612-090428a-080257C-ATI (current Catalyst 9.5)
Graphics Chipset ATI Radeon HD 4650
Stream Distribution: 1.4 Beta (acquired 6.15 -> yesterday)
OS: WinXP SP3 32bit
I did the same change in the sample under \samples\CPP\tutorials\SimpleKernel and it works fine for me. I have driver 8.62 (Catalyst 9.6) on my system.
Hi Sternenprinz.
Short2 works fine for me too. Maybe something is wrong with the memory allocation and the stack gets corrupted, or something like that. Just to make sure, test the following code:
File "test.br":
#include <stdio.h>
#include <stdlib.h>
kernel void
copy(short2 in1<>, out short2 out1<>) {
    out1 = in1;
}
int main(int argc, char* argv[]) {
    const int SIZE = 1 << 8;
    int i;  // signed, so it matches SIZE and the %i format below
    // Memory arrays
    short vIn1[SIZE];
    short vOut[SIZE];
    // Init
    for (i = 0; i < SIZE; ++i) {
        vIn1[i] = (short) i;
        vOut[i] = 0;
    }
    {
        int halfSIZE = SIZE / 2;
        // Stream arrays
        short2 sIn1<halfSIZE>;
        short2 sOut<halfSIZE>;
        // Load
        streamRead(sIn1, vIn1);
        // Kernel
        copy(sIn1, sOut);
        // Save
        streamWrite(sOut, vOut);
    }
    // Check
    for (i = 0; i < SIZE; i++)
        if (vIn1[i] != vOut[i]) break;
    if (i == SIZE) printf("Test OK\n");
    else printf("Error : vIn1[%i] = %i, vOut[%i] = %i\n",
                i, vIn1[i], i, vOut[i]);
    return 0;
}
// WinXP 64, MSVC 2005, Radeon 4850, Brook+ 1.4, Catalyst 9.6
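The test above reads SIZE shorts into a stream of SIZE / 2 short2 elements. A minimal host-only sketch of that layout follows, assuming consecutive host values form the two components of one element. `Short2`, `pack_pairs` and `unpack_pairs` are hypothetical stand-ins for illustration, not part of Brook+.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a two-component short vector.
struct Short2 {
    short x;
    short y;
};

// Group a flat array into pairs: element k holds flat[2k] and flat[2k + 1].
std::vector<Short2> pack_pairs(const std::vector<short>& flat) {
    assert(flat.size() % 2 == 0);  // an odd count cannot be split into pairs
    std::vector<Short2> pairs(flat.size() / 2);
    for (std::size_t k = 0; k < pairs.size(); ++k) {
        pairs[k].x = flat[2 * k];
        pairs[k].y = flat[2 * k + 1];
    }
    return pairs;
}

// Inverse of pack_pairs: write the components back in their original order.
std::vector<short> unpack_pairs(const std::vector<Short2>& pairs) {
    std::vector<short> flat;
    flat.reserve(pairs.size() * 2);
    for (std::size_t k = 0; k < pairs.size(); ++k) {
        flat.push_back(pairs[k].x);
        flat.push_back(pairs[k].y);
    }
    return flat;
}
```

Under that assumption the host buffer always holds twice as many shorts as the stream has elements, which is why the stream is declared with halfSIZE.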
Hi,
Thanks a lot for your answers. I guess my problem lies elsewhere, but it is definitely tied to the difference between 'short' and 'short2'.
In general I want to process a lot of data (e.g. 100 MB), so my first attempt was a trivial loop that iterates over constant-sized blocks. In each iteration I do:
inputStream.read(ibuf);
copy(inputStream,outputStream);
outputStream.write(obuf);
After a few hundred iterations it crashes, regardless of the stream size.
A rather compact program that shows this behaviour:
main.cpp:
#include "brook/Device.h"
#include "brook/Stream.h"
#include "brookgenfiles/copy.h"
int main(int, char**)
{
    int iterations = 1000;           // crashes here - 500 works fine
    const unsigned int size = 256;   // does not matter - same with 8192
    short* buf = new short[size];    // content does not matter
    unsigned int streamSize[] = { size / 2 };
    brook::Stream< short2 > inputStream (1, streamSize);
    brook::Stream< short2 > outputStream(1, streamSize);
    for (int i = 0; i < iterations; i++)
    {
        inputStream.read(buf);
        copy(inputStream, outputStream);
        outputStream.write(buf);
    }
    delete[] buf;                    // release the host buffer
    return 0;
}
copy.br:
kernel void copy(short2 i<>, out short2 o<>)
{
    o = i;
}
If I change the type from 'short2' to 'short' and streamSize from 'size/2' to 'size', everything works fine (even for several thousand iterations).
// MSVC 9.0 SP1 - Catalyst 9.6 - Stream Dist. 1.4
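For reference, the block-wise loop described in this post can be sketched on the host alone, without any Brook+ calls. This is only an illustration under assumptions: `process_in_blocks`, `BlockFn` and `copy_block` are hypothetical names, and `copy_block` merely stands in for the read / kernel / write sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical callback: handles one block given as input, output and element count.
using BlockFn = std::function<void(const short* in, short* out, std::size_t count)>;

// Walk over `input` in blocks of at most `block_size` elements.
// Returns the number of blocks that were handed to `fn`.
std::size_t process_in_blocks(const std::vector<short>& input,
                              std::vector<short>& output,
                              std::size_t block_size,
                              const BlockFn& fn) {
    assert(block_size > 0);  // a zero block size would never advance
    output.resize(input.size());
    std::size_t blocks = 0;
    for (std::size_t offset = 0; offset < input.size(); offset += block_size) {
        // The last block may be shorter than block_size.
        const std::size_t count = std::min(block_size, input.size() - offset);
        fn(input.data() + offset, output.data() + offset, count);
        ++blocks;
    }
    return blocks;
}

// Stand-in for the copy kernel: copies one block on the CPU.
void copy_block(const short* in, short* out, std::size_t count) {
    std::copy(in, in + count, out);
}
```

With a two-component element type such as short2, the block size would additionally have to be even so that no pair is split across two blocks.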
Have you disabled VPU recover?
No! And without VPU-Recover it works, but only sometimes.
Gaurav, running it only once works fine for me too, but if I run it several times my system ends up crashing.
I did another test: I removed the copy kernel call, used a larger number of iterations (e.g. 10000), and ran it several times. The system ends up crashing while just copying data. With VPU recover enabled the GPU is recovered, even though no kernel was ever called.
Looks like it could be related to the CPU-GPU memory copy routines.
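A transfer-only test like this can be structured as a small round-trip check. The sketch below is an assumption-laden illustration: `Upload`, `Download` and `first_bad_round_trip` are hypothetical names, and the two hooks would have to be wired to the actual stream read and write calls.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical transfer hooks: host -> device and device -> host.
using Upload = std::function<void(const std::vector<short>&)>;
using Download = std::function<void(std::vector<short>&)>;

// Repeats upload + download without any kernel in between.
// Returns the index of the first iteration whose downloaded data differs
// from what was uploaded, or -1 if every round trip matched.
int first_bad_round_trip(int iterations, std::size_t size,
                         const Upload& upload, const Download& download) {
    assert(iterations >= 0);
    std::vector<short> sent(size), received(size);
    for (int it = 0; it < iterations; ++it) {
        // Use a different pattern per iteration so stale data is detected.
        for (std::size_t k = 0; k < size; ++k)
            sent[k] = static_cast<short>((it + k) & 0x7fff);
        upload(sent);
        received.assign(size, 0);
        download(received);
        if (received != sent) return it;
    }
    return -1;
}
```

A hang would never return from this function, so printing the iteration index before each transfer helps to see how far the loop got.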
Exactly the same here!
It worked a few times for me, so I edited the topic. But further testing showed the same behaviour as for Ceq: it works 2-3 times, then either
- the system crashes with a BSOD (without VPU-Recover; WinXP reports that the graphics driver stopped responding due to a live-lock)
or
- the app hangs and the GPU gets recovered (with VPU-Recover). After a few cycles of that, all graphics acceleration including 2D vanishes (as mentioned earlier).
But only for 'short2', not for 'short'.
Yes, somehow "short" works. However, "short2" isn't the only affected type: it looks like "float" and "float2" have the same issue. I didn't test more types because after the third VPU recover you lose hardware acceleration.
It looks as if this is a bug in the Catalyst 9.5/9.6 driver...
In the end even the 'PCIe Speedtest' program crashed here, so I went back to Catalyst 9.4, which works fine.