
kartik
Journeyman III

Brook+ help

Kernel cannot create temporary linear streams

Hi, I've just started coding in Brook+. I have a 4870 installed in my PC and I tried to run a simple convolution kernel on the GPU, but it returns BR_KERNEL_ERROR. Stepping through with the debugger, I see this error string:

+        _errorString    "Kernel Execution : Failed to create temporary linear stream\n"    std::basic_string<char,std::char_traits<char>,std::allocator<char> > *

This same code runs correctly on another 4870, so I'm not sure why it's failing on my card. This is my kernel:

kernel void convolute(float a[][], float b[][], out float c[][],int J,int K)
{
    int j,k;
    int2 ind = instance().xy;
   
    float val = 0.0f;
   
    for (j=0;j<J;j++)
    {
        for (k=0;k<K;k++)
        {
            val = (((ind.x-j)<0) || ((ind.y-k)<0)) ? 0.0f : a[ind.x-j][ind.y-k];
            c[ind.x][ind.y] += b[j][k]*val;
        }
    }
}

Any help on this would be most appreciated. Thanks

Kartik

 

*UPDATE: some of the Brook+ sample apps, like matrixmult, cause my GPU to hang, and VPU Recover has to reset the GPU. Has anyone else experienced this before?

5 Replies
ryta1203
Journeyman III

1) You can disable VPU Recover

2) You can't use an "out" scatter stream (or a regular out stream) as both input and output, which is what the "+=" operation does. You may need to use a reduce kernel there, or accumulate into a temp variable and write the result once. On top of that, why does "c" need to be a scatter stream? It looks like you could just use a regular stream there: "out float c<>".
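
As a rough illustration of the temp-variable suggestion, here is a CPU-side reference in plain C (not Brook+, and not the poster's actual code). It accumulates each output element in a local variable and stores it once, so the output is never read back. The function name, the row-major layout, and the use of b[j][k] as the filter tap are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference sketch (plain C, not Brook+).
 * a and c are rows x cols, b is J x K, all stored row-major.
 * Assumption: the filter tap is b[j][k]. */
void convolve_reference(const float *a, const float *b, float *c,
                        size_t rows, size_t cols, size_t J, size_t K)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t x = 0; x < rows; ++x) {
        for (size_t y = 0; y < cols; ++y) {
            /* Accumulate in a local variable instead of reading c. */
            float sum = 0.0f;
            for (size_t j = 0; j < J; ++j) {
                for (size_t k = 0; k < K; ++k) {
                    /* Taps that fall outside a contribute zero. */
                    if (j <= x && k <= y) {
                        sum += b[j * K + k] * a[(x - j) * cols + (y - k)];
                    }
                }
            }
            /* Single write per output element. */
            c[x * cols + y] = sum;
        }
    }
}
```

The same shape should carry over to a kernel: keep the running sum in a local float and assign it to the output once, after the loops.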

Thanks ryta1203. Yes, c doesn't have to be a scatter stream. I figured out the reason for the hang: apparently 'float a[][]' needs to be at least 64x64. I was using 10x10, and I guess this is not allowed.

 


Originally posted by: kartik Thanks ryta1203. Yes, c doesn't have to be a scatter stream. I figured out the reason for the hang: apparently 'float a[][]' needs to be at least 64x64. I was using 10x10, and I guess this is not allowed.


There is no limit on stream sizes, so a size of 10x10 should work fine. However, the scatter implementation requires creating temporary linear CAL buffers, and creation of this buffer fails on your card (maybe you didn't have enough memory). One workaround is to avoid the scatter stream and use a regular output stream instead. Since in your case you are writing only to the kernel's instance position, a regular output stream should work fine.


Hi - I found this problem similar to mine, but I do need scatter streams on input and output, as I need to index both the input and output streams.

Is there a way that I could avoid this error? Thanks


SDK 1.3 requires the scatter width to be a multiple of 64. This has been fixed in 1.4.
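
If upgrading is not an option, one possible workaround (an assumption, not something confirmed in this thread) is to pad the scatter stream width up to the next multiple of 64 on the host side and ignore the extra columns. A minimal sketch of that arithmetic in plain C, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Round width up to the next multiple of `multiple`. */
size_t round_up_to_multiple(size_t width, size_t multiple)
{
    assert(multiple > 0);
    return ((width + multiple - 1) / multiple) * multiple;
}

/* Hypothetical helper: padded width for a scatter stream under the
 * multiple-of-64 constraint described for SDK 1.3. */
size_t padded_scatter_width(size_t width)
{
    return round_up_to_multiple(width, 64);
}
```

For example, a 10-wide stream would be allocated 64 wide, and a 65-wide stream 128 wide; whether the padded columns are acceptable depends on the application.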

 
