9 Replies Latest reply on Jun 19, 2014 1:44 AM by pinform

    clBuildProgram segfaults and other problems.

    daniel

      I have a bug in my OpenCL code in which a computation produces NaN. While tracking down the source of this bug, I added the following code:

       

      if (isnan(var)) {
           debug_info[0] = 1;
      }
      

       

      After adding this code the bug disappeared and the code functioned as expected. To me, this points to a compiler optimization error. To test that theory I added -cl-opt-disable to the build options, and now clBuildProgram segfaults.
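A NaN check that does not go through `isnan` can sometimes help when the check itself seems to perturb the optimizer. The following is only a minimal host-side C sketch, assuming IEEE 754 single precision; `is_nan_bits` and `flag_if_nan` are hypothetical names, and `debug_info` stands in for the kernel's debug buffer. In a kernel the same bit test could be written with `as_uint(var)` instead of `memcpy`.

```c
#include <math.h>    /* NAN and INFINITY macros, used in the usage note */
#include <stdint.h>
#include <string.h>

/* Bit-level NaN test for an IEEE 754 binary32 value:
   all exponent bits set and a nonzero mantissa. */
static int is_nan_bits(float v) {
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);  /* well-defined way to read the bits */
    return (bits & 0x7f800000u) == 0x7f800000u &&
           (bits & 0x007fffffu) != 0u;
}

/* Hypothetical stand-in for the debug write in the post above. */
static void flag_if_nan(float var, int *debug_info) {
    if (is_nan_bits(var)) {
        debug_info[0] = 1;
    }
}
```

Calling `flag_if_nan(NAN, debug_info)` sets `debug_info[0]` to 1, while finite values and `INFINITY` leave it untouched.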

      I am using AMD-APP-SDK-v2.9-lnx64 and compiling for the HD 7970. Unfortunately I cannot post my code publicly.

       

      Any help is appreciated.

        • Re: clBuildProgram segfaults and other problems.
          sudarshan

          Hi,

          It is difficult to make out where the error could be just from the information you have posted.

          If you can reproduce the same bug in a simplified piece of code that you can post, it would be of great help.

           

          Thanks,

            • Re: Re: clBuildProgram segfaults and other problems.
              daniel

              I have tracked down the compiler segfault to a piece of code that looks like this:

               

              if(x == 1){
                  stuff;
              } else if(x == 2){
                  other stuff;
              }
              

               

              If that is changed to

              if(x == 1){
                  stuff;
              }
              if(x == 2){
                  other stuff;
              }
              

               

              then the code compiles without error. I am working on a simplified version of the code that I can post.

               

              With that change I could compile the code without optimizations, and the NaN bug disappeared, which confirms my hypothesis. I am not sure how to build a simple version that reproduces the optimization bug: it is part of a complex piece of code, and any attempt to track the bug makes it disappear. Any advice in this area would be helpful.
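One caveat about the workaround above: splitting an `else if` into two separate `if` statements is only equivalent when the first branch leaves `x` unchanged. The C sketch below illustrates the difference; the function names and the branch bodies are made up for illustration and are not taken from the original code.

```c
/* Original shape: once the first branch runs, the second test is skipped. */
static int with_else_if(int x) {
    int branches_taken = 0;
    if (x == 1) {
        x = 2;               /* hypothetical "stuff" that modifies x */
        branches_taken++;
    } else if (x == 2) {
        branches_taken++;    /* hypothetical "other stuff" */
    }
    return branches_taken;
}

/* Workaround shape: the second test always runs, so a change to x
   in the first branch can make both branches execute. */
static int with_two_ifs(int x) {
    int branches_taken = 0;
    if (x == 1) {
        x = 2;
        branches_taken++;
    }
    if (x == 2) {
        branches_taken++;
    }
    return branches_taken;
}
```

With `x == 1` on entry, `with_else_if` runs one branch and `with_two_ifs` runs two; for any input where the first branch does not touch `x`, the two forms behave the same.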

              • Re: Re: clBuildProgram segfaults and other problems.
                daniel

                OK, here is the simplest kernel I could make that still reproduces the bug:

                 

                __kernel void foo(__global const int* mem)
                {
                    int work_id = 1;
                
                    int y = mem[1];
                    
                    
                    while (true){
                
                            int old_y_value = 1;
                
                            if( old_y_value == y)
                            {
                                int other_id = 1;
                                
                                    bool loop_two_done = false;
                                    
                                    while (!loop_two_done){
                                        
                                        if(1 == work_id){
                                            
                                            if (other_id == 0){
                                                
                                            }else if(1 == other_id){
                                                loop_two_done = true;
                                            }
                                        }
                                        
                                    }
                                
                            }
                    }
                    
                }
                
                

                If you replace the "else if" on line 24 of the kernel with "if", then it compiles fine.
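As a reading aid for the reproducer, here is a host-side C port of the inner loop's control flow. It is only a sketch: `inner_loop_iterations` is a hypothetical name, and `max_iters` is a safety cap added for the port that does not exist in the kernel. It shows that the `else if` branch is the only path that ends the inner loop.

```c
#include <stdbool.h>

/* Port of the inner while loop from the kernel above.
   Returns how many passes ran before the loop ended or hit the cap. */
static int inner_loop_iterations(int work_id, int other_id, int max_iters) {
    bool loop_two_done = false;
    int iters = 0;
    while (!loop_two_done && iters < max_iters) {
        iters++;
        if (1 == work_id) {
            if (other_id == 0) {
                /* empty branch, as in the kernel */
            } else if (1 == other_id) {
                loop_two_done = true;  /* the only exit from the loop */
            }
        }
    }
    return iters;
}
```

With the constants used in the kernel (`work_id = 1`, `other_id = 1`) the inner loop ends after a single pass; with any other `other_id` it only stops at the cap.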