5 Replies Latest reply on Mar 10, 2015 3:17 PM by dorono

    AMD APP SDK v3.0 Beta Debugging

    omnidirectional

      We enjoy profiling OpenCL 1.2 code with CodeXL on our Ubuntu 14.04 system. CodeXL has helped us speed up one kernel 9x.

       


       

      While the GPU Profiling & Timeline functions work on our code, we cannot get debugging and breakpoints to work properly. We can set a breakpoint in a kernel, but execution doesn't stop at the proper line in our code, and we cannot look at the memory inside the kernel. However, the debugger does work on the Teapot example, which suggests that our SDK & drivers are OK. How can debugging work on the Teapot but not on our code? Do we need to set some compiler flags to enable debugging?


      Suggestions? 

        • Re: AMD APP SDK v3.0 Beta Debugging
          dorono

          Hi,

          Glad to hear the profiler has helped you.

          There are a number of possible reasons why kernel debugging fails:

          1. The kernel is using atomic operations

          2. The kernel is using printf calls

          3. The kernel is using OpenCL 2.0
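
          The three causes listed above can be screened for before opening the debugger. The following is a minimal sketch, not a CodeXL feature: the function names and regular expressions are assumptions made for illustration, and the check only looks for `atomic_*`/`atom_*` calls, `printf` calls, and a `-cl-std=CL2.0` build option in text you pass in.

```python
import re

# Illustrative patterns (assumptions, not CodeXL's own checks).
ATOMIC_CALL = re.compile(r"\b(?:atomic|atom)_\w+\s*\(")
PRINTF_CALL = re.compile(r"\bprintf\s*\(")
CL20_OPTION = re.compile(r"-cl-std=CL2\.0\b")


def strip_comments(source):
    """Remove /* ... */ and // comments so commented-out calls are ignored."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", source)


def find_debug_blockers(kernel_source, build_options=""):
    """Return a list naming which of the three suspected causes appear."""
    code = strip_comments(kernel_source)
    findings = []
    if ATOMIC_CALL.search(code):
        findings.append("atomic operations")
    if PRINTF_CALL.search(code):
        findings.append("printf calls")
    if CL20_OPTION.search(build_options):
        findings.append("OpenCL 2.0")
    return findings
```

          An empty result only means none of these three patterns was found; it does not show that the kernel is debuggable.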

          Can you share the CodeXL log files? They should be stored under /tmp/. Look for two files: CodeXL-your_login_name.log and CodeXLServers-your_login_name.log
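
          To locate those two files, a small helper along these lines may save some typing. It is a sketch that only assumes the /tmp location and the naming pattern given above; the function name `codexl_log_paths` is hypothetical.

```python
import getpass
import os


def codexl_log_paths(login=None, log_dir="/tmp"):
    """Build the two expected log paths for a login (default: current user)."""
    login = login or getpass.getuser()
    names = ("CodeXL-{}.log".format(login), "CodeXLServers-{}.log".format(login))
    return [os.path.join(log_dir, name) for name in names]
```

          Calling `codexl_log_paths()` with no arguments uses the current login; `os.path.isfile` on each returned path then shows whether the file actually exists.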

          Can you share your kernel source code? If uploading to a public forum is inconvenient then you can send it to gputools.support@amd.com


            • Re: Re: AMD APP SDK v3.0 Beta Debugging
              omnidirectional

              Dorono,

               

              Thanks for the response.  Unfortunately your 3 suggestions are not the problem.

               

              1) We are not using atomic operations

              2) We are not using printfs, but we commented out the pragma for printf & tested it anyway.

              3) We are explicitly compiling in OpenCL 1.2, so 2.0 shouldn't be a factor.

               

              BUT, I did find a LOT of errors in the LOG files that you asked for.  I've never been so happy to see errors <g>.  The files are attached.  I hope that they can help identify/solve the problem.

               

              tia

                • Re: Re: AMD APP SDK v3.0 Beta Debugging
                  urishomroni

                  Hi tia,

                   

                  Not all the error messages you see in the log are related to the cause of the problem; in fact, most of them seem to stem from it.

                   

                  It cannot be easily deduced what the issue is.

                   

                  1. Another clear-cut issue that might be happening is using two-step program building (clCompileProgram + clLinkProgram) or using pre-built binaries (clCreateProgramWithBinary). It doesn't seem so from the log, but is that the case?

                  1.1. If that is the case, switch to using only clCreateProgramWithSource and clBuildProgram.

                  2. The title of the thread is "AMD APP SDK v3.0 Beta debugging". Are you attempting to debug one of the SDK samples? If not, can you try debugging one of them (one that is not an OpenCL 2.0 sample, doesn't use two-step building, etc.) and see if it works for you?

                  3. Could you share the kernel source with us?

                  3.1. If not, could you try stripping away code from the kernel until you get a minimal sample that reproduces the issue, and send *that* to us?

                   

                  Thanks,

                    • Re: Re: AMD APP SDK v3.0 Beta Debugging
                      omnidirectional

                      Uri,

                       

                      Thanks for the response.  BTW, tia means "Thanks In Advance."   My name is Skip.

                       

                      Concerning your suggestions:

                       

                      1) We already use clCreateProgramWithSource and clBuildProgram.

                      2) We can debug the SDK's non-OpenCL 2.0 kernels, so that should prove that our driver & environment are correctly configured.  We are having problems using CodeXL to debug our code.

                      3) I'd rather not share my code.

                       

                      ***** Here is our current theory about the debugging problem.

                       

                      We think that the size of our kernel file may be causing problems with CodeXL's debugging.  We have 20 kernels that take up 2,284 lines of code.  While we cannot debug in CodeXL after loading the entire kernel file, we can debug it if we comment out a kernel or two.  What has us confused is that commenting out different combinations of kernels allows debugging, so we cannot blame any single kernel:  the problem only occurs when we try to load *all* of the kernels.

                       

                      The good news is that we can *finally* debug our code with CodeXL by breaking our kernel file into 2 parts.  We can now put breakpoints into all of our kernels, and that was our objective.  
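
                      For anyone who wants to try the same workaround, here is a rough sketch of splitting one kernel file into parts. It is not the script we used, and the function names are hypothetical. It assumes every kernel definition starts a line with `__kernel`, that braces are balanced and do not appear inside comments or string literals, and that everything outside the kernels (pragmas, typedefs, helper functions) can simply be copied into each part.

```python
import re

# Assumption: each kernel definition begins a line with the __kernel qualifier.
KERNEL_START = re.compile(r"^[ \t]*__kernel\b", re.MULTILINE)


def split_kernels(source):
    """Return (shared_text, [kernel_definition, ...]) for one .cl source string."""
    shared, kernels = [], []
    pos = 0
    while True:
        match = KERNEL_START.search(source, pos)
        if match is None:
            shared.append(source[pos:])
            break
        shared.append(source[pos:match.start()])
        # Walk from the kernel's opening brace to its matching closing brace.
        end = source.index("{", match.end())
        depth = 0
        while True:
            ch = source[end]
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            end += 1
            if depth == 0:
                break
        kernels.append(source[match.start():end])
        pos = end
    return "".join(shared), kernels


def build_parts(source, parts=2):
    """Split the kernels into contiguous groups, each prefixed by the shared text."""
    shared, kernels = split_kernels(source)
    if not kernels:
        return [source]
    size = -(-len(kernels) // parts)  # ceiling division
    groups = [kernels[i:i + size] for i in range(0, len(kernels), size)]
    return [shared.rstrip() + "\n\n" + "\n\n".join(group) + "\n" for group in groups]
```

                      Each returned string can be written to its own .cl file and built as a separate program with clCreateProgramWithSource and clBuildProgram. The part count of 2 matches what worked for us; it is not a documented limit.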

                       

                      Please let us know if you have any other explanation for this behavior.  Unless we hear otherwise, we will try to limit the size of our kernel files.  It would be nice to know if there is some file-size limit.  Ideally CodeXL would let users know if they are loading a collection of kernels that is too big for debugging.