jean-claude

Bug report for stream domain indexes

Discussion created by jean-claude on Oct 5, 2008
Latest reply on Oct 9, 2008 by MicahVillmow
Stream domain addressing

Hi,

I would like to bring to your attention the following bug related
to stream domain addressing.

Could ATI have a look? Thanks.


Here is the simplified code for illustration:


//==========================================================================
// two (very) simple kernel definitions
//==========================================================================
kernel void copy1( float input<>, out float output<> )
{ output = input; }

kernel void copy2( float input<>, out float out1<>, out float out2<> )
{ float v=input; out1=v ; out2=v; }


//==========================================================================
// test
//==========================================================================
extern void test(float *out_dat, float *in_dat, unsigned int siz_dat) {
    unsigned int i;   // unsigned, to match siz_dat in the loops below
   
    float A<siz_dat>;
    float B<siz_dat>;
    float C<siz_dat>;
   
    // alloc working memory
    float *buff = (float *) malloc(siz_dat*sizeof(float));
    if( buff == NULL ) {
        printf( "Insufficient memory available\n" ); return;
    }

    // copy input values into buffer & fill A
    // --------------------------------------
    memcpy((void *)buff, (void *)in_dat, siz_dat*sizeof(float));
    streamRead(A, buff);

    // fill B & C
    // ----------
    for (i=0; i<siz_dat; i++) buff[i] =  0.0f;
    streamRead(B, buff);
    for (i=0; i<siz_dat; i++) buff[i] = -1.0f;
    streamRead(C, buff);

    // test simple copy : this works OK!
    // ----------------------------------
    copy1( A.domain(5, 15), B.domain(3, 13) );
    streamWrite(A, buff);        print_buff(buff,siz_dat);
    streamWrite(B, buff);        print_buff(buff,siz_dat);

    // test double copy (well not so good ;-))
    // ---------------------------------------
    copy2( A.domain(4, 14), B.domain(2, 12), C.domain(3, 13) );
    streamWrite(A, buff);        print_buff(buff,siz_dat);
    streamWrite(B, buff);        print_buff(buff,siz_dat);
    streamWrite(C, buff);        print_buff(buff,siz_dat);

    memcpy((void *)out_dat, (void *)buff, siz_dat*sizeof(float));
    free (buff);
}



Here is the result :

at the beginning

A =  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
B =  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
C = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1


simple copy: A.domain(5, 15) -> B.domain(3, 13)

A =  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
B =  0  0  0  5  6  7  8  9 10 11 12 13 14  0  0  0

that's OK

THEN: dual copy: A.domain(4, 14) -> B.domain(2, 12) and A.domain(4, 14) -> C.domain(3, 13)

A =  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
B =  0  0  4  5  6  7  8  9 10 11 12 13 14  0  0  0
C = -1 -1  4  5  6  7  8  9 10 11 12 13 -1 -1 -1 -1

C should have been
C = -1 -1 -1  4  5  6  7  8  9 10 11 12 13 -1 -1 -1

Apparently the domain of the second output (C) was replaced by the domain
of the first output (B) for this copy, which effectively results in:
A.domain(4, 14) -> B.domain(2, 12)      which is correct
and A.domain(4, 14) -> C.domain(2, 12)  which is a BUG !!!
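
To make the expected versus observed contents of C explicit, here is a small plain C model of the two cases. It is not Brook+ code and says nothing about where the bug lives. It assumes that domain(start, end) covers indices start to end-1, which is what the printed results above suggest. The names domain_copy and simulate_c are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 16   /* stream length used in the report */

/* Copy src[src_lo .. src_hi-1] into dst[dst_lo .. dst_hi-1].
   Half-open ranges are an assumption inferred from the printed results. */
static void domain_copy(const float *src, size_t src_lo, size_t src_hi,
                        float *dst, size_t dst_lo, size_t dst_hi)
{
    size_t k;
    assert(src_hi - src_lo == dst_hi - dst_lo);  /* domains must have equal length */
    assert(src_hi <= N && dst_hi <= N);          /* stay inside the streams */
    for (k = 0; k < src_hi - src_lo; k++)
        dst[dst_lo + k] = src[src_lo + k];
}

/* Fill A with 0..15 and C with -1, then model
   A.domain(4, 14) -> C.domain(c_lo, c_lo + 10). */
static void simulate_c(size_t c_lo, float C[N])
{
    float A[N];
    size_t i;
    for (i = 0; i < N; i++) {
        A[i] = (float)i;
        C[i] = -1.0f;
    }
    domain_copy(A, 4, 14, C, c_lo, c_lo + 10);
}
```

Calling simulate_c(3, C) models the requested C.domain(3, 13) and gives the expected line above, while simulate_c(2, C) models the domain borrowed from B and gives the line that was actually printed.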


I have not fully tested 2D domains, but I suspect a similar error exists there.

I suspect the Brook compiler may be the culprit...

Could AMD please fix this issue or provide a quick workaround?

Many thanks

Jean-Claude
