Stream domain addressing
Hi,
I would like to report a bug related to stream domain addressing.
Could ATI have a look? Thanks.
Here is simplified code for illustration:
//==========================================================================
// two (very) simple kernel definitions
//==========================================================================
kernel void copy1( float input<>, out float output<> )
{ output = input; }
kernel void copy2( float input<>, out float out1<>, out float out2<> )
{ float v=input; out1=v ; out2=v; }
//==========================================================================
// test
//==========================================================================
extern void test(float *out_dat, float *in_dat, unsigned int siz_dat) {
unsigned int i; // unsigned to match siz_dat in the loop comparisons
float A<siz_dat>;
float B<siz_dat>;
float C<siz_dat>;
// alloc working memory
float *buff = (float *) malloc(siz_dat*sizeof(float));
if( buff == NULL ) {
printf( "Insufficient memory available\n" ); return;
}
// copy input values into buffer & fill A
// --------------------------------------
memcpy((void *)buff, (void *)in_dat, siz_dat*sizeof(float));
streamRead(A, buff);
// fill B & C
// ----------
for (i=0; i<siz_dat; i++) buff[i] = 0.0f;
streamRead(B, buff);
for (i=0; i<siz_dat; i++) buff[i] = -1.0f;
streamRead(C, buff);
// test simple copy : this works OK!
// ----------------------------------
copy1( A.domain(5, 15), B.domain(3, 13) );
streamWrite(A, buff); print_buff(buff,siz_dat);
streamWrite(B, buff); print_buff(buff,siz_dat);
// test double copy (well not so good ;-))
// ---------------------------------------
copy2( A.domain(4, 14), B.domain(2, 12), C.domain(3, 13) );
streamWrite(A, buff); print_buff(buff,siz_dat);
streamWrite(B, buff); print_buff(buff,siz_dat);
streamWrite(C, buff); print_buff(buff,siz_dat);
memcpy((void *)out_dat, (void *)buff, siz_dat*sizeof(float));
free (buff);
}
Here is the result.
At the beginning:
A = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
B = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
simple copy: A.domain(5, 15) -> B.domain(3, 13)
A = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
B = 0 0 0 5 6 7 8 9 10 11 12 13 14 0 0 0
that's OK
Then, dual copy: A.domain(4, 14) -> B.domain(2, 12) and A.domain(4, 14) -> C.domain(3, 13)
A = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
B = 0 0 4 5 6 7 8 9 10 11 12 13 14 0 0 0
C = -1 -1 4 5 6 7 8 9 10 11 12 13 -1 -1 -1 -1
C should have been:
C = -1 -1 -1 4 5 6 7 8 9 10 11 12 13 -1 -1 -1
Apparently C.domain was aligned to B.domain for this copy, so the call
effectively performed:
A.domain(4, 14) -> B.domain(2, 12), which is correct,
and A.domain(4, 14) -> C.domain(2, 12), which is a BUG!
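To make the expected behaviour concrete, here is a minimal CPU-side sketch in plain C (not Brook+) that replays the two kernel calls with each output stream using its own offset. It assumes domain(start, end) selects the half-open range [start, end), which is what the results above suggest. The names domain_copy, print_stream, model_test and N are hypothetical helpers for illustration only, not part of any Brook+ API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define N 16  /* stream length used in the results above */

/* Hypothetical CPU model of a 1D domain copy (not a Brook+ API).
 * Assumes domain(start, end) means the half-open range [start, end).
 * Copies src[src_start .. src_end) into dst, starting at dst_start. */
void domain_copy(float *dst, size_t dst_start,
                 const float *src, size_t src_start, size_t src_end)
{
    size_t count, k;
    assert(src_start <= src_end && src_end <= N);
    count = src_end - src_start;
    assert(dst_start + count <= N);
    for (k = 0; k < count; k++)
        dst[dst_start + k] = src[src_start + k];
}

/* Print one stream on a single line, in the same layout as the results. */
void print_stream(const char *name, const float *s)
{
    size_t k;
    printf("%s =", name);
    for (k = 0; k < N; k++)
        printf(" %g", s[k]);
    printf("\n");
}

/* Replay the test: initial fill, then copy1, then both outputs of copy2. */
void model_test(float *A, float *B, float *C)
{
    size_t k;
    for (k = 0; k < N; k++) {
        A[k] = (float)k;
        B[k] = 0.0f;
        C[k] = -1.0f;
    }
    domain_copy(B, 3, A, 5, 15);  /* copy1: A.domain(5,15) -> B.domain(3,13) */
    domain_copy(B, 2, A, 4, 14);  /* copy2, out1: A.domain(4,14) -> B.domain(2,12) */
    domain_copy(C, 3, A, 4, 14);  /* copy2, out2: A.domain(4,14) -> C.domain(3,13) */
}
```

Calling model_test on three float arrays of length N and then print_stream("C", C) gives the C line I expected above. Changing the destination offset of the last domain_copy from 3 to 2 reproduces the C line that Brook+ actually returned, which is why I think the second output is using the first output's domain.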
I have not fully tested 2D domains, but I suspect a similar error exists there.
I suspect the Brook compiler may be the culprit...
Could AMD please fix this issue or provide a quick workaround?
Many thanks
Jean-Claude