3 Replies Latest reply on Nov 29, 2012 2:32 AM by cvazquezb

    Possible compiler bug with Catalyst 12.10

    cvazquezb

      Tested with a Radeon HD 7970 on 64-bit Windows 7, with the driver updated to Catalyst 12.10.

       

      The relevant code:

       

      #define PIXELS 20
           for(int dy=0;dy<PIXELS;dy++)
           {
              (...)
              if(dy>0)
              {
                  w=weights[(PIXELS-1-dy)*VOX_SLICE*workspace->xMaxVoxels];
                  c=evaluate(lines[-dy+PIXELS-1].z,lines[-dy+PIXELS-1].w,slice)+yChunk;

                  for(int y=dy;y<PIXELS;y++)
                  {
                      if((c>=0)&&(c<CHUNK_SIZE))
                          chunk[c*VOX_ROW]+=w;

                      barrier(CLK_LOCAL_MEM_FENCE);
                      c+=Z_SLOPE;
                  }
              }
           }

       

      If I replace the seventh line (the assignment to w) with the equivalent:


                  w=weights[(-dy+PIXELS-1)*VOX_SLICE*workspace->xMaxVoxels];

       

      The generated IL changes a lot from register renaming, and the first version has a few more instructions. Worse still, after the change the results are completely wrong. The code works fine either way on a variety of Nvidia platforms. I'm afraid I can't provide the full code without an NDA, but I'd be happy to help in any other way.
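
      For anyone who wants to rule out the arithmetic itself, here is a minimal host-side sketch in plain C, not taken from the original kernel. It evaluates both index expressions for every dy the kernel uses. The function names and the vox_slice and x_max_voxels parameters are hypothetical stand-ins for VOX_SLICE and workspace->xMaxVoxels, whose real types and values are not shown in the post.

```c
#include <stddef.h>

#define PIXELS 20

/* Index as written in the first version of the kernel line.
   vox_slice and x_max_voxels are hypothetical stand-ins for
   VOX_SLICE and workspace->xMaxVoxels. */
static size_t index_original(int dy, size_t vox_slice, size_t x_max_voxels)
{
    return (size_t)(PIXELS - 1 - dy) * vox_slice * x_max_voxels;
}

/* Index as written in the reordered version of the kernel line. */
static size_t index_reordered(int dy, size_t vox_slice, size_t x_max_voxels)
{
    return (size_t)(-dy + PIXELS - 1) * vox_slice * x_max_voxels;
}

/* Counts the dy values in [1, PIXELS) for which the two forms disagree.
   The range matches the kernel, which only reads weights when dy > 0. */
static int count_index_mismatches(size_t vox_slice, size_t x_max_voxels)
{
    int mismatches = 0;
    for (int dy = 1; dy < PIXELS; dy++) {
        if (index_original(dy, vox_slice, x_max_voxels) !=
            index_reordered(dy, vox_slice, x_max_voxels)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

      If count_index_mismatches reports zero for the sizes in use, the two source forms address the same element, so any difference in kernel output has to come from somewhere other than the index arithmetic.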

        • Re: Possible compiler bug with Catalyst 12.10
          binying

          How about with Catalyst 12.11 beta?

          • Re: Possible compiler bug with Catalyst 12.10
            yurtesen

            Did you try printf to see if the values are what you expect? I would think it unlikely for the compiler to make a mistake in (PIXELS-1-dy) vs (-dy+PIXELS-1). It probably replaces PIXELS-1 with 19, so it would end up with 19-dy or -dy+19. I am not sure AMD can replicate the problem from this alone.

             

            Did you try different cards, and also the CPU device, to see whether you get the same results? Perhaps you can put your kernel through KernelAnalyzer[1,2] to see what they report.
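
            A cross-device comparison could be done on the host along these lines. This is only a sketch: it assumes the results are float buffers that have already been read back from each device, and the function name and tolerance parameter are made up for illustration. An absolute tolerance is used because accumulation order may differ between devices.

```c
#include <stddef.h>

/* Counts elements whose absolute difference exceeds tol.
   reference and candidate are assumed to be result buffers read back
   from two different devices (for example GPU and CPU). */
static size_t count_result_mismatches(const float *reference,
                                      const float *candidate,
                                      size_t n, float tol)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        float diff = reference[i] - candidate[i];
        if (diff < 0.0f) {
            diff = -diff;
        }
        /* Written as !(diff <= tol) so that a NaN difference,
           which fails every comparison, is counted as a mismatch. */
        if (!(diff <= tol)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

            A count of zero between the CPU device and the GPU would suggest the kernel logic is consistent; a large count on only one device points toward that device's compiler or driver.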

             

            With so little information, there is not much that can be said.

              • Re: Possible compiler bug with Catalyst 12.10
                cvazquezb

                I'm sorry about the lack of details; I'm very limited (in a legal sense) in what kind of information I can provide publicly. That aside, I did try to printf the results, and something very strange happened: the printed results were fine, but the kernel slowed to a crawl, even after I limited the printout to a couple of lines. What normally takes <10 seconds was still only halfway done after three hours!

                 

                Anyway, I did what binying proposed and went with 12.11 beta. It worked like a charm, with no modification to the code necessary. So it does look like a code generation bug. I'm glad they have sorted it out already, but it certainly doesn't make me feel very confident about the robustness of OpenCL on AMD platforms. If I hadn't had other platforms to test the code on, I would have spent a lot of time trying to fix a non-existent problem on my side.

                 

                Thank you both for taking the time to answer.
