foxx1337
Adept I

*SOLVED* About the maximum addressable size for a 1D stream


Hello.

 

I've written a simple reduction kernel (by the way, there's a bug in the Stream_Computing_User_Guide.pdf shipped with Brook+ 1.2.1 beta, in the section about reduction kernels; I think that syntax isn't supported anymore), which looks like this:

 

reduce void GpuSum(float a<>, reduce float s)
{
    s += a;
}

float Sum(int n, float *a)
{
    float sa<n>;
    float r;

    streamRead(sa, a);
    GpuSum(sa, r);

    return r;
}

 

The main function calling this looks like:

int main()
{
    const int SIZE = 100 * 1024;
    float a[SIZE];

    for (int i = 0; i < SIZE; ++i)
        *(a + i) = static_cast<float>(i);
    float r = Sum(SIZE, a);
    printf("%f\n", r);
    return 0;
}

The thing is, I can't get SIZE to be 1024 * 1024 (the program simply crashes). 100 * 1024 works fine, and so does 224 * 1024. 228 * 1024 prints "Failed to find usable kernel fragment to implement requested reduction." as if no address virtualization were available, while 256 * 1024 makes my app crash outright.

Is this normal? The documentation states that 1D arrays of up to 8192 * 8192 = 64M elements can be accessed when the Brook+ code is compiled without -r, so what am I doing wrong here?

 

4850 with 8.10 whql
2008 server x64
brook+ and cal beta 1.2.1 x64
visual studio 2008 with x64 Release build.

 

Later edit:

OK, by simply changing main to

    const int SIZE = 1024 * 1024;
    float *a = new float[ SIZE ];

it seems to work now, but when SIZE is 228 * 1024 the program still gives the "Failed to find usable kernel fragment to implement requested reduction." error.

2 Replies
Ceq
Journeyman III

Hi foxx1337, by default Visual Studio uses a small stack, so things like:
const int SIZE = 1024 * 1024;
float a[SIZE];
won't work unless you change the default stack size:
Project -> Properties -> Linker -> System -> Stack Reserve Size
(Using new works because it allocates heap memory.)
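
To make the stack point concrete, here is a rough C++ sketch of the arithmetic together with a heap-backed buffer. It assumes the 1 MB default stack reservation of the Visual C++ linker and 4-byte floats; kAssumedDefaultStackBytes, localArrayExceedsStack and makeInput are names made up for this illustration, not part of Brook+ or CAL.

```cpp
#include <cstddef>
#include <vector>

// Assumption: default stack reservation of 1 MB (the Visual C++ linker default).
const std::size_t kAssumedDefaultStackBytes = 1024 * 1024;

// True when a local float array with `count` elements would, on its own,
// fill or exceed the assumed stack reservation.
bool localArrayExceedsStack(std::size_t count)
{
    return count * sizeof(float) >= kAssumedDefaultStackBytes;
}

// Builds the input 0, 1, 2, ... in heap memory instead of on the stack.
std::vector<float> makeInput(std::size_t count)
{
    std::vector<float> a(count);
    for (std::size_t i = 0; i < count; ++i)
        a[i] = static_cast<float>(i);
    return a;
}
```

Under these assumptions 224 * 1024 floats take 896 KB and still fit, while 256 * 1024 floats already take the full 1 MB, which would match the crashes reported above. The vector's data() pointer can be passed wherever Sum expects a float *.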

As for the error with a 228 * 1024 input, there is a restriction that the size can only have 2, 3, 5 and 7 as prime factors:
http://forums.amd.com/devforum...id=96153&enterthread=y
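
A quick way to test a size against that restriction before launching the reduction could look like the sketch below. hasOnlySmallPrimeFactors is a hypothetical helper written for this post, not a Brook+ function, and it only encodes the rule as stated here.

```cpp
#include <cstdint>

// Returns true when n is positive and has no prime factors other than 2, 3, 5 and 7.
bool hasOnlySmallPrimeFactors(std::uint64_t n)
{
    if (n == 0)
        return false;

    const std::uint64_t primes[] = {2, 3, 5, 7};
    for (std::uint64_t p : primes)
        while (n % p == 0)
            n /= p;   // strip every occurrence of this prime

    // Anything left over is a product of primes larger than 7.
    return n == 1;
}
```

For the sizes in this thread: 100 * 1024 = 2^12 * 5^2 and 224 * 1024 = 2^15 * 7 pass, while 228 * 1024 = 2^12 * 3 * 19 fails because of the factor 19.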

Spot on!

Thank you, and keep it up.
