cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

genchan
Journeyman III

Decrement Loop doesn't work correctly.

Hello.

I made a program including loop which calls a simple function.

I had made it with "decrement" loop at first, but it didn't work correctly.

I remade it with "increment" loop, it worked.

I think this is a bug of optimizer.

Sample of bad case is here:

void Replace( unsigned short a[], unsigned short To, unsigned short From  )

{

   // simple load, and, not and, or, store procedures.

}

void Sample( __local unsigned short img[128*128] )

{

unsigned short L[64]; unsigned short R[64];

// after initialization of L, R.....

unsigned int m;

for( m=63; m<64; --m ){     // <- This is the point. An increment loop works correctly.

     unsigned short To = L;

     unsigned short From = R;

     if( From == To || !From || !To ){ continue; }

     // some function calls......

     Replace( R, To, From );

     // subsequent processes.....

}

}

My environment:

     Driver Package: 9.012-121219a-152192C-ATI

     Catalyst Version: 13.1

     Radeon HD 7970

0 Likes
7 Replies
LeeHowes
Staff

for( m=63; m<64; --m ){

For m is 63, while m < 64, subtract one from m? It's going to loop forever, the bounds make no sense. Is that a typo in your post or an actual bug in your code?

You could do:

for( unsigned int m = 63; m > 0; --m )

but don't do:

for( unsigned int m = 63; m >= 0; --m )

with an unsigned int, whatever you do, because that also will loop forever

0 Likes

Hello, Lee. Thank you for your reply.

Sorry, I'm careless of my explanation.

I mean the result of replacement with "decrement loop" is not equal what I expected.

After change into "increment loop", it worked as I expected.

The replacement code is like this. (simple replacement)

void Replace( unsigned short dst[], const unsigned short To, const unsigned short From )

{

     const ushort8 vFrom = (ushort8)( From );

     const ushort8 vTo = (ushort8)( To );

     size_t i;

     const size_t ip = 64 >> 3;

     ushort8 flag, val;

     for( i=0; i<ip; ++i ){

          val = vload8( i, dst );

          flag = convert_ushort8( val == vFrom );

          vstore8( ( vTo & flag ) | ( val & ~flag ), i, dst );

     }

}

-------------------------------------------------------

You know, m is "unsigned int".

So, it is between 0 and 4294967295.

When m = 0, m goes to 4294967295 after decrement.

for( unsigned int m=63; m<64; --m ) doesn't loop forever, and I confirmed that.

0 Likes

I'm still not entirely sure I understand your question. The new loop only goes from 0 to 16 (>>3). What exactly is the difference in the iteration spaces of the two loops? Have you tried just writing i out and seeing what the values you get are? You could try using printf if it doesn't change the way the compiler generates the code too significantly.

0 Likes

I tried these 3 loops.

The combination "Decremental Loop" and "Function Call" didn't worked correctly.

So that I thought "Optimizer is something wrong."

1. Good :

for( unsigned int m=63; m<64; --m ){    // <- Decremental Loop

     unsigned short To = L;

     unsigned short From = R;

     if( From == To || !From || !To ){ continue; }

     // replacement without function cal

     for( unsigned int mm = m; mm<64; --mm ){ R[mm] = R[mm] == R ? L : R[mm] }

}

2. Bad :

for( unsigned int m=63; m<64; --m ){    // <- Decremental Loop

     unsigned short To = L;

     unsigned short From = R;

     if( From == To || !From || !To ){ continue; }

      Replace( R, To, From );               // <- Replacement Function Call

}

3. Good :

for( unsigned int m=0; m<64; ++m ){     // <- Incremental Loop

     unsigned short To = L;

     unsigned short From = R;

     if( From == To || !From || !To ){ continue; }

      Replace( R, To, From );                   // <- Replacement Function Call

}

0 Likes

Your for loops are  quite crazy . Why to use unsigned integers for loops, when ints can do the job.

also I do not see "to" and "from" variables used in case 1. Also in case 2 & 3, I see the same set of 64 values being read, processed and written back, for every iteration for variable "m". Surely going backward or forward may give different results.

Please attach a zip file with the whole code.

0 Likes

The problem happens in Combine* kernel, WorkCombineH and WorkCombineV functions.

(attached file is already modified to incremental loop.)


1. LabelTo128x128

2. Add128x256

3. Border128x256

4. Combine128x256

5. Add256x256

6. Border256x256

7. Combine256x256

8. Add256x512

9. Border256x512

10. Combine256x512

11. Add256x1024

12. Border256x1024

13. Combine256x1024

14. Add256x2048

15. Border256x2048

16. Combine256x2048

17. Add256x4096

18. Border256x4096

19. Combine256x4096

20. InitInfo

21. CreateInfo

22. CreateLrRef

23. LrCombine

-----------------------------------------------------------------------------------------------------------

I always use unsigned int for array access.

The reasons are here:

1. It never point negative (illegal) array index.

    Unsigned value is always 0 or larger than 0.

2. Both increment loop and decrement loop can use same loop condition, "iterator < array length".

    for( unsigned int ui = 0; ui < ARRAY_LENGTH; ++ui )                             // incremental

    for( unsigned int ui = ARRAY_LENGTH - 1; ui < ARRAY_LENGTH; --ui )  // decremental

0 Likes

Can you post a narrowed down version of the testcase. The attached code is quite complicated and it is hard to say , if the issue is in Runtime or the application itself. Anyways It is hard to run these kernels without the host code.

0 Likes