8 Replies Latest reply on Sep 20, 2012 9:23 PM by binying

    Strange Behavior in OpenCL

    registerme

      I have an OpenCL kernel that does something like this:

       

      __kernel void test(__read_only frame, ..., __global ulong *gOut1, __global uint *gOut2)
      {
           for (int i=0; i<5; i++)
           {
                //Begin TestCode
                if ((get_global_id(0) == 0) && (get_global_id(1) == 0) && (i==1))
                {
                     *gOut2 = 0; gOut2++;
                }
                //End TestCode

                gOut1[uniq_global_idx] = func(frame, i, get_global_id(0), get_global_id(1));
           }
      }

       

      It runs with a global work size equal to the frame size, i.e., the global range is frame.x by frame.y. Basically, the kernel does some calculation to generate the gOut1 result, which has size 5*frame.x*frame.y.

       

      Problems:

      - The AMD gDEBugger does not work with this kernel: it works for the sample code and for slightly modified versions of the sample code, but not once the code is changed any further.

      - The TestCode section has to be there for the code to run. When the TestCode is removed and I run the kernel, it hangs forever.

      - Somehow the 'gOut2++' statement is "crucial": even though it is not needed, because the 'if' condition is hit only once, it has to be there for the gOut1 result to be correct.

      - When the exact same function is duplicated under a different name, i.e., kernels test(..) and test1(..) are identical, calling test(..) hangs forever.

       

      Can anybody explain what is happening?
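      A note on what `gOut2++` does at the language level: in OpenCL C, as in plain C, a pointer argument is a private copy held by each work-item, so incrementing it changes neither the buffer contents nor what any other work-item sees. The plain C sketch below only illustrates that rule on the host. The names `write_then_advance` and `demo_pointer_copy` are hypothetical, and the sketch makes no claim about why the driver behaves as described above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the TestCode body: store 0 through the
 * pointer, then advance the local copy of the pointer. */
void write_then_advance(unsigned int *out)
{
    *out = 0;   /* visible to the caller: modifies the pointed-to memory */
    out++;      /* not visible: only this function's copy of the pointer moves */
    (void)out;  /* silence "set but not used" warnings */
}

/* Returns 1 if the caller's pointer is unchanged and only element 0
 * of the buffer was written, 0 otherwise. */
int demo_pointer_copy(void)
{
    unsigned int buf[2] = { 7u, 7u };
    unsigned int *p = buf;

    write_then_advance(p);

    return (p == buf) && (buf[0] == 0u) && (buf[1] == 7u);
}
```

      Since the increment has no effect that the language itself defines outside the work-item, any change in results when it is removed would have to come from how the kernel is compiled or executed, not from the statement's semantics.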

        • Re: Strange Behavior in OpenCL
          binying

          You write to gOut1[uniq_global_idx]. Where did you declare uniq_global_idx?

            • Re: Strange Behavior in OpenCL
              registerme

              __kernel void test(__read_only frame, ..., __global ulong *gOut1, __global uint *gOut2)
              {
                   // Base index: five consecutive output slots per work-item
                   int uniq_global_idx = (get_global_id(1) * frame.size_x + get_global_id(0)) * 5;

                   for (int i=0; i<5; i++)
                   {
                        //Begin TestCode
                        if ((get_global_id(0) == 0) && (get_global_id(1) == 0) && (i==1))
                        {
                             *gOut2 = 0; gOut2++;
                        }
                        //End TestCode

                        gOut1[uniq_global_idx + i] = func(frame, i, get_global_id(0), get_global_id(1));
                   }
              }

               

              Here func is a simple function that just reads a region of frame, does some calculation, and returns a value. It does not change any global memory.
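              The index layout used by this kernel can be checked on the host. The plain C sketch below assumes hypothetical frame dimensions passed as `size_x` and `size_y`, replays `(y * size_x + x) * 5 + i` for every work-item, and counts any slot that is out of range or not written exactly once. `base_index` and `count_index_errors` are illustrative helpers, not part of the original code.

```c
#include <stdlib.h>

#define VALUES_PER_ITEM 5  /* the kernel's loop runs i = 0..4 */

/* Same formula as the kernel: five consecutive slots per work-item. */
size_t base_index(size_t x, size_t y, size_t size_x)
{
    return (y * size_x + x) * VALUES_PER_ITEM;
}

/* Replays every (x, y, i) combination and returns the number of slots
 * that are out of range or not written exactly once.
 * Returns -1 if the bookkeeping array cannot be allocated. */
long count_index_errors(size_t size_x, size_t size_y)
{
    size_t total = size_x * size_y * VALUES_PER_ITEM;
    unsigned int *hits = calloc(total ? total : 1, sizeof *hits);
    long errors = 0;

    if (hits == NULL)
        return -1;

    for (size_t y = 0; y < size_y; y++) {
        for (size_t x = 0; x < size_x; x++) {
            for (size_t i = 0; i < VALUES_PER_ITEM; i++) {
                size_t idx = base_index(x, y, size_x) + i;
                if (idx >= total)
                    errors++;      /* write would land outside gOut1 */
                else
                    hits[idx]++;   /* record one write to this slot */
            }
        }
    }

    for (size_t k = 0; k < total; k++) {
        if (hits[k] != 1)
            errors++;              /* slot skipped or written twice */
    }

    free(hits);
    return errors;
}
```

              A result of 0 only says the formula itself covers a buffer of 5*size_x*size_y elements without gaps or overlaps; it says nothing about the reads inside func.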

                • Re: Strange Behavior in OpenCL
                  binying

                  What happens if you comment out "gOut1[uniq_global_idx + i] = func(frame, i, get_global_id(0), get_global_id(1));"?

                   

                  Is gOut2++ still crucial?

                    • Re: Strange Behavior in OpenCL
                      registerme

                      If I replace func() with some simple value, the gOut1 values returned to the CPU are good, whether or not the test code is there. But if I call func, the test code has to be there for the result of func to be correct. If gOut2++ is missing, the result of func is not correct.

                        • Re: Strange Behavior in OpenCL
                          binying

                          Somehow, I think there is something wrong with "gOut1[uniq_global_idx + i] = func(frame, i, get_global_id(0), get_global_id(1));".

                           

                          Is the index "uniq_global_idx + i" good for the array gOut1?

                            • Re: Strange Behavior in OpenCL
                              registerme

                              I double-checked uniq_global_idx + i by adding a statement that checks its lower and upper bounds; if the index is out of bounds, it writes some value to gOut2. I removed the TestCode and replaced it with this bounds check. The kernel runs without error and the values are all good. The memory in gOut2 is untouched, which means there is no out-of-bounds problem.

                               

                              So again, some kind of unrelated code has to be there for the gOut1 result to be correct.

                               

                              Also, the first time I ran the bounds-check code above, it took a very long time to compile and then hung when I ran it. I had to stop it and start it again, and then it ran normally. I suspect something is wrong with the driver setup, but I don't know a good way to check it.
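                              The bounds check described in this reply is not shown, but it might look roughly like the following, written as plain C that emulates a single work-item's store on the host. The function name `store_checked`, the sentinel value `0xDEADBEEF`, and the use of slot 0 of `gOut2` as an error flag are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define BOUNDS_ERROR_FLAG 0xDEADBEEFu  /* assumed sentinel value */

/* Stores value at gOut1[idx] when idx is in range. Otherwise writes the
 * sentinel to gOut2[0] and skips the store.
 * Returns 1 if the value was stored, 0 if idx was out of bounds.
 * idx is unsigned, so a negative index would wrap around and also fail
 * the upper-bound test. */
int store_checked(uint64_t *gOut1, size_t gOut1_len,
                  unsigned int *gOut2, size_t idx, uint64_t value)
{
    if (idx >= gOut1_len) {
        gOut2[0] = BOUNDS_ERROR_FLAG;  /* leave a trace for the host to read back */
        return 0;
    }
    gOut1[idx] = value;
    return 1;
}
```

                              With a check of this shape, reading back `gOut2` unchanged after the run is what supports the conclusion that no index left the buffer.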

                    • Re: Strange Behavior in OpenCL
                      registerme

                      Not sure if it's related, but sometimes I see a warning dialog box:

                      (attached image: catalystWarning.jpg)

                       

                      Here is the software information from Catalyst Control Center:

                       

                      Driver Packaging Version: 8.961-120405a-137813C-ATI
                      Provider: Advanced Micro Devices, Inc.
                      2D Driver Version: 8.01.01.1243
                      2D Driver File Path: /REGISTRY/MACHINE/SYSTEM/ControlSet001/Control/CLASS/{4D36E968-E325-11CE-BFC1-08002BE10318}/0006
                      Direct3D Version: 7.14.10.0903
                      OpenGL Version: 6.14.10.11631
                      Catalyst Control Center Version: 2012.0405.2205.37728
                      AMD Audio Driver Version: 7.12.0.7706

                       

                      The AMD driver version is 8.961.0.0