6 Replies Latest reply on Dec 22, 2014 7:56 AM by jsnyder

    Can't Debug Kernel in Teapot Example

    jsnyder

      I have an HP Z820 configured with an Intel Xeon E5-2650, 32GB RAM, and an MSI Radeon R9 280X Gaming 6G card. I am running Windows 8.1 Pro, and have Visual Studio Premium 2013 Update 3, AMD Catalyst Version 14.9 (Driver 14.301.1001-140915a-176154C), AMD APP SDK 2.9-1, and CodeXL 1.5.6571 installed.

       

      I am trying to debug a kernel in the AMD Teapot example. I compile the AMDTTeaPot project in the Debug configuration. When I run without any breakpoints, the application runs at 127 FPS.

       

      When I put a breakpoint in the kernel applyBuoyancy() by clicking to the left of line 20 in tpApplyBuoyancy.cl, the application slows down to 23 FPS and I observe a lot of pairs of threads being created and then destroyed. The application does not break.

       

      If I remove that breakpoint, continue, and use the CodeXL menu to add a New CodeXL Breakpoint in the kernel function applyBuoyancy(), the application breaks just before the call to _clEnqueueReleaseGLObjects() on line 2303 of AMDTTeapotOCLSmokeSystem.cpp. Stepping does not move the cursor.

       

      If I remove the previous breakpoint, continue, and add a New CodeXL Breakpoint in clEnqueueNDRangeKernel(), the application breaks just before the call to _clEnqueueNDRangeKernel() on line 2112 of AMDTTeapotOCLSmokeSystem.cpp. Attempting to step brings up a dialog box, which says

       

      The Process was suspended before a kernel enqueued for debug has started executing.

      Disable all API function breakpoints and resume debugging(F5) to continue into the kernel.

       

      After deleting all breakpoints and hitting F5, a dialog box appears with the message,

       

      Could not debug kernel. Error during kernel debugging.

       

      A coworker with a similar setup, but a different card (R9 290X), has no trouble putting a breakpoint in applyBuoyancy(). His version of Visual Studio appears to load the same DLLs as mine does during program invocation. My "CodeXLServers-<userid>.log" file is the same as his up until the very end of his file. The last line in agreement between the two files looks like this:

       

      2014.11.26    09:49:48.828    #14418177828    #ERROR    #0    #3836    #gsSamplersMonitor::updateContextDataSnapshot    #src\gsSamplersMonitor.cpp    #291    #Assertion failure (m_glGetSamplerParameteriv != 0 && m_glGetSamplerParameterfv != 0)

       

      After that, my log file contains a large number of repetitions of this error:

       

      2014.11.26    09:49:48.287    #14418360287    #ERROR    #0    #4268    #csDWARFParser::getAddressScope    #src\csDWARFParser.cpp    #2926    #Assertion failure (pAddressScope != 0)
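
      One way to pin down where two CodeXLServers logs part ways, and which assertion dominates afterwards, is to compare them while ignoring the timestamp and thread columns. The sketch below is only an illustration: it assumes each entry sits on a single line with nine `#`-separated fields laid out like the two sample lines above, and the function names (`log_key`, `first_divergence`, `count_errors`) are made up for this example and are not part of CodeXL.

```python
from collections import Counter
from itertools import zip_longest


def log_key(line):
    """Return (function, file, line number, message) for one log entry.

    Assumption: an entry has nine '#'-separated fields, as in the sample
    lines quoted in this thread; anything else is compared as plain text.
    """
    fields = [field.strip() for field in line.split("#", 8)]
    if len(fields) < 9:
        return (line.strip(),)
    return tuple(fields[5:])


def first_divergence(lines_a, lines_b):
    """Index of the first entry that differs once timestamps are ignored."""
    for index, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        if a is None or b is None or log_key(a) != log_key(b):
            return index
    return None  # the logs agree entry for entry


def count_errors(lines):
    """Count how often each distinct #ERROR entry occurs."""
    return Counter(log_key(line) for line in lines if "#ERROR" in line)
```

      To use it, read each log into a list with `open(path).read().splitlines()` and pass both lists to `first_divergence`; `count_errors(lines).most_common(5)` then shows which assertions repeat most often in one log.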

       

      Can you help me figure out what is going on?

       

      Message was edited by: Jeffrey Snyder. I corrected my comment about the behavior when breaking on clEnqueueNDRangeKernel(). I had mistakenly said that the application breaks on _clEnqueueReleaseGLObjects(), but that was a copy/paste error from the previous case.

        • Re: Can't Debug Kernel in Teapot Example
          dorono

          Hi Jeffrey,

          Can you run clinfo on your machine and upload the results?

          Thanks

            • Re: Can't Debug Kernel in Teapot Example
              jsnyder

              Thanks for offering to help. This is what I get by running C:\Windows\System32\clinfo.exe:

               

               

              C:\Windows\System32>clinfo

              Number of platforms:                             2

                Platform Profile:                              FULL_PROFILE

                Platform Version:                              OpenCL 1.1 CUDA 6.0.1

                Platform Name:                                 NVIDIA CUDA

                Platform Vendor:                               NVIDIA Corporation

                Platform Extensions:                           cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll

                Platform Profile:                              FULL_PROFILE

                Platform Version:                              OpenCL 1.2 AMD-APP (1573.4)

                Platform Name:                                 AMD Accelerated Parallel Processing

                Platform Vendor:                               Advanced Micro Devices, Inc.

                Platform Extensions:                           cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices

               

               

                Platform Name:                                 NVIDIA CUDA

              Number of devices:                               1

                Device Type:                                   CL_DEVICE_TYPE_GPU

                Vendor ID:                                     10deh

                Max compute units:                             1

                Max work items dimensions:                     3

                  Max work items[0]:                           1024

                  Max work items[1]:                           1024

                  Max work items[2]:                           64

                Max work group size:                           1024

                Preferred vector width char:                   1

                Preferred vector width short:                  1

                Preferred vector width int:                    1

                Preferred vector width long:                   1

                Preferred vector width float:                  1

                Preferred vector width double:                 1

                Native vector width char:                      1

                Native vector width short:                     1

                Native vector width int:                       1

                Native vector width long:                      1

                Native vector width float:                     1

                Native vector width double:                    1

                Max clock frequency:                           875Mhz

                Address bits:                                  32

                Max memory allocation:                         268435456

                Image support:                                 Yes

                Max number of images read arguments:           256

                Max number of images write arguments:          16

                Max image 2D width:                            32768

                Max image 2D height:                           32768

                Max image 3D width:                            4096

                Max image 3D height:                           4096

                Max image 3D depth:                            4096

                Max samplers within kernel:                    32

                Max size of kernel argument:                   4352

                Alignment (bits) of base address:              4096

                Minimum alignment (bytes) for any datatype:    128

                Single precision floating point capability

                  Denorms:                                     Yes

                  Quiet NaNs:                                  Yes

                  Round to nearest even:                       Yes

                  Round to zero:                               Yes

                  Round to +ve and infinity:                   Yes

                  IEEE754-2008 fused multiply-add:             Yes

                Cache type:                                    Read/Write

                Cache line size:                               128

                Cache size:                                    16384

                Global memory size:                            1073741824

                Constant buffer size:                          65536

                Max number of constant args:                   9

                Local memory type:                             Scratchpad

                Local memory size:                             49152

                Kernel Preferred work group size multiple:     32

                Error correction support:                      0

                Unified memory for Host and Device:            0

                Profiling timer resolution:                    1000

                Device endianess:                              Little

                Available:                                     Yes

                Compiler available:                            Yes

                Execution capabilities:

                  Execute OpenCL kernels:                      Yes

                  Execute native function:                     No

                Queue properties:

                  Out-of-Order:                                Yes

                  Profiling :                                  Yes

                Platform ID:                                   000000CE6A7EEEA0

                Name:                                          Quadro K600

                Vendor:                                        NVIDIA Corporation

                Device OpenCL C version:                       OpenCL C 1.1

                Driver version:                                332.50

                Profile:                                       FULL_PROFILE

                Version:                                       OpenCL 1.1 CUDA

                Extensions:                                    cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64

               

               

                Platform Name:                                 AMD Accelerated Parallel Processing

              Number of devices:                               2

                Device Type:                                   CL_DEVICE_TYPE_GPU

                Vendor ID:                                     1002h

                Board name:                                    AMD Radeon R9 200 Series

                Device Topology:                               PCI[ B#5, D#0, F#0 ]

                Max compute units:                             32

                Max work items dimensions:                     3

                  Max work items[0]:                           256

                  Max work items[1]:                           256

                  Max work items[2]:                           256

                Max work group size:                           256

                Preferred vector width char:                   4

                Preferred vector width short:                  2

                Preferred vector width int:                    1

                Preferred vector width long:                   1

                Preferred vector width float:                  1

                Preferred vector width double:                 1

                Native vector width char:                      4

                Native vector width short:                     2

                Native vector width int:                       1

                Native vector width long:                      1

                Native vector width float:                     1

                Native vector width double:                    1

                Max clock frequency:                           1020Mhz

                Address bits:                                  64

                Max memory allocation:                         4244635648

                Image support:                                 Yes

                Max number of images read arguments:           128

                Max number of images write arguments:          8

                Max image 2D width:                            16384

                Max image 2D height:                           16384

                Max image 3D width:                            2048

                Max image 3D height:                           2048

                Max image 3D depth:                            2048

                Max samplers within kernel:                    16

                Max size of kernel argument:                   1024

                Alignment (bits) of base address:              2048

                Minimum alignment (bytes) for any datatype:    128

                Single precision floating point capability

                  Denorms:                                     No

                  Quiet NaNs:                                  Yes

                  Round to nearest even:                       Yes

                  Round to zero:                               Yes

                  Round to +ve and infinity:                   Yes

                  IEEE754-2008 fused multiply-add:             Yes

                Cache type:                                    Read/Write

                Cache line size:                               64

                Cache size:                                    16384

                Global memory size:                            6442450944

                Constant buffer size:                          65536

                Max number of constant args:                   8

                Local memory type:                             Scratchpad

                Local memory size:                             32768

                Kernel Preferred work group size multiple:     64

                Error correction support:                      0

                Unified memory for Host and Device:            0

                Profiling timer resolution:                    1

                Device endianess:                              Little

                Available:                                     Yes

                Compiler available:                            Yes

                Execution capabilities:

                  Execute OpenCL kernels:                      Yes

                  Execute native function:                     No

                Queue properties:

                  Out-of-Order:                                No

                  Profiling :                                  Yes

                Platform ID:                                   00007FF9D5D5BFB0

                Name:                                          Tahiti

                Vendor:                                        Advanced Micro Devices, Inc.

                Device OpenCL C version:                       OpenCL C 1.2

                Driver version:                                1573.4 (VM)

                Profile:                                       FULL_PROFILE

                Version:                                       OpenCL 1.2 AMD-APP (1573.4)

                Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

               

               

                Device Type:                                   CL_DEVICE_TYPE_CPU

                Vendor ID:                                     1002h

                Board name:

                Max compute units:                             16

                Max work items dimensions:                     3

                  Max work items[0]:                           1024

                  Max work items[1]:                           1024

                  Max work items[2]:                           1024

                Max work group size:                           1024

                Preferred vector width char:                   16

                Preferred vector width short:                  8

                Preferred vector width int:                    4

                Preferred vector width long:                   2

                Preferred vector width float:                  8

                Preferred vector width double:                 4

                Native vector width char:                      16

                Native vector width short:                     8

                Native vector width int:                       4

                Native vector width long:                      2

                Native vector width float:                     8

                Native vector width double:                    4

                Max clock frequency:                           2594Mhz

                Address bits:                                  64

                Max memory allocation:                         8571055104

                Image support:                                 Yes

                Max number of images read arguments:           128

                Max number of images write arguments:          8

                Max image 2D width:                            8192

                Max image 2D height:                           8192

                Max image 3D width:                            2048

                Max image 3D height:                           2048

                Max image 3D depth:                            2048

                Max samplers within kernel:                    16

                Max size of kernel argument:                   4096

                Alignment (bits) of base address:              1024

                Minimum alignment (bytes) for any datatype:    128

                Single precision floating point capability

                  Denorms:                                     Yes

                  Quiet NaNs:                                  Yes

                  Round to nearest even:                       Yes

                  Round to zero:                               Yes

                  Round to +ve and infinity:                   Yes

                  IEEE754-2008 fused multiply-add:             Yes

                Cache type:                                    Read/Write

                Cache line size:                               64

                Cache size:                                    32768

                Global memory size:                            34284220416

                Constant buffer size:                          65536

                Max number of constant args:                   8

                Local memory type:                             Global

                Local memory size:                             32768

                Kernel Preferred work group size multiple:     1

                Error correction support:                      0

                Unified memory for Host and Device:            1

                Profiling timer resolution:                    394

                Device endianess:                              Little

                Available:                                     Yes

                Compiler available:                            Yes

                Execution capabilities:

                  Execute OpenCL kernels:                      Yes

                  Execute native function:                     Yes

                Queue properties:

                  Out-of-Order:                                No

                  Profiling :                                  Yes

                Platform ID:                                   00007FF9D5D5BFB0

                Name:                                          Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz

                Vendor:                                        GenuineIntel

                Device OpenCL C version:                       OpenCL C 1.2

                Driver version:                                1573.4 (sse2,avx)

                Profile:                                       FULL_PROFILE

                Version:                                       OpenCL 1.2 AMD-APP (1573.4)

                Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_spir cl_khr_gl_event
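
              A dump this long is easier to compare between machines once it is reduced to one short record per device. The following is a rough sketch, not an AMD tool: it assumes the clinfo text has been saved to a string with wrapped lines rejoined, that each device section begins with a `Device Type:` line as in the output above, and that `parse_clinfo_devices` is simply a name chosen for this example.

```python
def parse_clinfo_devices(text):
    """Collect a few identifying fields for every device in clinfo output.

    Assumption: every device section starts with a 'Device Type:' line and
    each field occupies a single 'Key: value' line.
    """
    wanted = ("Name", "Vendor", "Max compute units")
    devices = []
    current = None
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or a heading without a value
        key, value = key.strip(), value.strip()
        if key == "Device Type":
            current = {"Device Type": value}
            devices.append(current)
        elif current is not None and key in wanted:
            current[key] = value
    return devices


sample = """
  Platform Name:                                 NVIDIA CUDA
Number of devices:                               1
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Max compute units:                             1
  Name:                                          Quadro K600
  Vendor:                                        NVIDIA Corporation
  Platform Name:                                 AMD Accelerated Parallel Processing
Number of devices:                               2
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Max compute units:                             32
  Name:                                          Tahiti
  Vendor:                                        Advanced Micro Devices, Inc.
  Device Type:                                   CL_DEVICE_TYPE_CPU
  Max compute units:                             16
  Name:                                          Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  Vendor:                                        GenuineIntel
"""

for device in parse_clinfo_devices(sample):
    print(device["Device Type"], "|", device["Name"], "|", device["Vendor"])
```

              Run on the output above, this makes it plain that the machine exposes three OpenCL devices across two platforms: the Quadro K600, the Tahiti GPU, and the Xeon CPU.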

                • Re: Can't Debug Kernel in Teapot Example
                  jsnyder

                  I would really appreciate it if someone could help with this problem. I have upgraded to CodeXL 1.6, but I am still having the previously mentioned problem. I have tried different GPUs with the same result. I am unable to stop in a kernel using either the Visual Studio 2013 interface or the CodeXL console interface. When running the CodeXL console, I can start and run the Teapot example. When I put a breakpoint in applyBuoyancy(), the application stops. Here is the content of the "Debugged Process events" tab:

                   

                  <<

                  Thread Created: 2848

                  DLL Loaded: C:\Windows\SysWOW64\ntdll.dll

                  DLL Loaded: C:\Windows\SysWOW64\kernel32.dll

                  DLL Loaded: C:\Windows\SysWOW64\KernelBase.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\spies\opengl32.dll

                  DLL Loaded: C:\Windows\SysWOW64\user32.dll

                  DLL Loaded: C:\Windows\SysWOW64\gdi32.dll

                  DLL Loaded: C:\Windows\SysWOW64\shell32.dll

                  DLL Loaded: C:\Windows\SysWOW64\msvcr120.dll

                  DLL Loaded: C:\Windows\SysWOW64\ddraw.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDTServerUtilities.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDTBaseTools.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDTOSWrappers.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDTApiClasses.dll

                  DLL Loaded: C:\Windows\SysWOW64\ole32.dll

                  DLL Loaded: C:\Windows\SysWOW64\msvcp120.dll

                  DLL Loaded: C:\Windows\SysWOW64\msvcrt.dll

                  DLL Loaded: C:\Windows\SysWOW64\combase.dll

                  DLL Loaded: C:\Windows\SysWOW64\shlwapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\dciman32.dll

                  DLL Loaded: C:\Windows\SysWOW64\ws2_32.dll

                  DLL Loaded: C:\Windows\SysWOW64\version.dll

                  DLL Loaded: C:\Windows\SysWOW64\dbghelp.dll

                  DLL Loaded: C:\Windows\SysWOW64\advapi32.dll

                  DLL Loaded: C:\Windows\SysWOW64\oleaut32.dll

                  DLL Loaded: C:\Windows\SysWOW64\propsys.dll

                  DLL Loaded: C:\Windows\SysWOW64\rpcrt4.dll

                  DLL Loaded: C:\Windows\SysWOW64\sechost.dll

                  DLL Loaded: C:\Windows\SysWOW64\nsi.dll

                  DLL Loaded: C:\Windows\SysWOW64\sspicli.dll

                  DLL Loaded: C:\Windows\SysWOW64\cryptbase.dll

                  DLL Loaded: C:\Windows\SysWOW64\bcryptprimitives.dll

                  DLL Loaded: C:\Windows\SysWOW64\imm32.dll

                  DLL Loaded: C:\Windows\SysWOW64\msctf.dll

                  API Connection Established: CodeXL Servers Manager

                  Thread Created: 5504

                  Process Run Started

                  Thread Created: 5556

                  Thread Terminated: 5556

                  DLL Loaded: C:\Windows\SysWOW64\uxtheme.dll

                  DLL Loaded: C:\Windows\SysWOW64\dwmapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\kernel.appcore.dll

                  DLL Loaded: C:\Windows\SysWOW64\SHCore.dll

                  DLL Loaded: C:\Windows\SysWOW64\opengl32.dll

                  DLL Loaded: C:\Windows\SysWOW64\glu32.dll

                  API Connection Established: CodeXL OpenGL Server

                  Debug String: CodeXL OpenGL Server was initialized

                  DLL Loaded: C:\Windows\SysWOW64\comctl32.dll

                  DLL Loaded: C:\Windows\SysWOW64\atiglpxx.dll

                  DLL Loaded: C:\Windows\SysWOW64\atioglxx.dll

                  DLL Loaded: C:\Windows\SysWOW64\setupapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\cfgmgr32.dll

                  DLL Loaded: C:\Windows\SysWOW64\winmm.dll

                  DLL Loaded: C:\Windows\SysWOW64\winmmbase.dll

                  DLL Loaded: C:\Windows\SysWOW64\devobj.dll

                  DLL Loaded: C:\Windows\SysWOW64\atiadlxy.dll

                  DLL Loaded: C:\Windows\SysWOW64\userenv.dll

                  DLL Loaded: C:\Windows\SysWOW64\wtsapi32.dll

                  DLL Loaded: C:\Windows\SysWOW64\psapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\IPHLPAPI.DLL

                  DLL Loaded: C:\Windows\SysWOW64\profapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\winnsi.dll

                  DLL Loaded: C:\Windows\SysWOW64\wintrust.dll

                  DLL Loaded: C:\Windows\SysWOW64\crypt32.dll

                  DLL Loaded: C:\Windows\SysWOW64\msasn1.dll

                  DLL Loaded: C:\Windows\SysWOW64\atigktxx.dll

                  DLL Unloaded: C:\Windows\SysWOW64\atigktxx.dll

                  DLL Loaded: C:\Windows\SysWOW64\atigktxx.dll

                  Thread Created: 3604

                  OpenGL Render Context 1 Created

                  DLL Loaded: C:\Windows\SysWOW64\OpenCL.dll

                  DLL Unloaded: C:\Windows\SysWOW64\OpenCL.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\spies\OpenCL.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDTHsaDebugging.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\spies\detoured.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDHwDbgFacilities.dll

                  DLL Loaded: C:\Windows\SysWOW64\OpenCL.dll

                  DLL Loaded: C:\Program Files (x86)\AMD\CodeXL\AMDOpenCLDebug.dll

                  DLL Loaded: C:\Windows\SysWOW64\aticalrt.dll

                  DLL Loaded: C:\Windows\SysWOW64\aticalcl.dll

                  DLL Loaded: C:\Windows\SysWOW64\aticaldd.dll

                  DLL Loaded: C:\Windows\SysWOW64\amdocl.dll

                  DLL Loaded: C:\Windows\SysWOW64\nvopencl.dll

                  DLL Loaded: C:\Windows\SysWOW64\nvapi.dll

                  DLL Loaded: C:\Windows\SysWOW64\atiumdva.dll

                  API Connection Established: CodeXL OpenCL Server

                  Debug String: CodeXL OpenCL Server was initialized

                  OpenCL Error on function: clGetGLContextInfoKHR

                  OpenCL Compute Context 1 Created

                  Thread Created: 3296

                  OpenCL Queue 1 (Context 1) Created

                  OpenCL Program 1 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 2 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 3 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 4 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 5 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 6 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 7 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 8 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 9 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 10 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 11 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 12 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 13 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 14 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 15 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 16 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  OpenCL Program 17 (Context 1) Created

                  Building OpenCL Program  (Context 1)...

                  Building OpenCL Program  (Context 1) Ended

                              Build Log:

                   

                   

                  Debug String: CodeXL warning: The debugged process asked for an extension function pointer (glIsBuffer) from one render context, but called this function pointer in another render context (context #1)

                  Debug String: CodeXL warning: The debugged process asked for an extension function pointer (glGetBufferParameteriv) from one render context, but called this function pointer in another render context (context #1)

                  Thread Created: 5452

                  Thread Created: 256

                  Thread Terminated: 256

                  Thread Terminated: 5452

                  Step: clEnqueueReleaseGLObjects

                  >>

                   

                  Here is the Call Stack:

                   

                  wcsicmp - AMDTTeaPot.exe

                  sys_errlist - AMDTTeaPot.exe

                  wctype - AMDTTeaPot.exe

                  wctype - AMDTTeaPot.exe

                  wctype - AMDTTeaPot.exe

                  wctype - AMDTTeaPot.exe

                  wctype - USER32.dll

                  wctype - USER32.dll

                  wctype - USER32.dll

                  wctype - USER32.dll

                  wctype - AMDTTeaPot.exe

                  get_dstbias - AMDTTeaPot.exe

                  islower - KERNEL32.DLL

                  modf - ntdll.dll

                   

                   

                  Does anyone see anything here that indicates a problem? I am wondering about the OpenGL warnings myself.

                   

                  Regards,

                   

                  Jeff

                    • Re: Can't Debug Kernel in Teapot Example
                      urishomroni

                      Hi Jeff,

                       

                      1. You might want to upgrade your AMD driver to the new Catalyst Omega; this may resolve the issue.

                       

                      2. According to your log snippets, the error apparently comes from the OpenCL program not having debug information in a supported format. There could be several causes. Is it possible that your machine has specific settings that affect OpenCL build flags (e.g. environment variables)?

                       

                      3. Could it be that the kernels are being executed on a non-AMD device or an AMD CPU? (Both of these scenarios are supposed to be handled with a specific message, but something might have gone wrong.) This can be verified by:

                      A. Having a look in the Teapot application's menu and seeing which device is selected for each type of calculation.

                      B. Setting a CodeXL breakpoint on clEnqueueNDRangeKernel, and checking the device list in the OpenCL context properties.
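
For point 2, here is a minimal sketch of one way to check for build-flag overrides from inside the process. It assumes the variables in question are AMD_OCL_BUILD_OPTIONS and AMD_OCL_BUILD_OPTIONS_APPEND, which is how I recall them from the AMD APP programming guide; please treat those names as an assumption and adjust the list if your setup uses others. The helper names setVariables() and reportBuildOptionOverrides() are made up for this example.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Returns "NAME=value" for every listed environment variable that is set.
// Variables that are not set are skipped.
std::vector<std::string> setVariables(const std::vector<std::string>& names) {
    std::vector<std::string> found;
    for (const std::string& name : names) {
        const char* value = std::getenv(name.c_str());
        if (value != nullptr) {
            found.push_back(name + "=" + value);
        }
    }
    return found;
}

// Prints any OpenCL build-option overrides visible to this process.
// The variable names below are an assumption (see the note above).
void reportBuildOptionOverrides() {
    const std::vector<std::string> names = {
        "AMD_OCL_BUILD_OPTIONS",
        "AMD_OCL_BUILD_OPTIONS_APPEND",
    };
    const std::vector<std::string> found = setVariables(names);
    if (found.empty()) {
        std::cout << "No build-option overrides found in the environment\n";
        return;
    }
    for (const std::string& entry : found) {
        std::cout << "Override in effect: " << entry << "\n";
    }
}
```

Calling reportBuildOptionOverrides() once at startup, before the first clBuildProgram call, should be enough. If anything is listed, it would be worth clearing it and retrying the kernel breakpoint.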

                       

                      Regards,

                        • Re: Can't Debug Kernel in Teapot Example
                          jsnyder

                          Thanks, Uri.

                           

                          I did not see your message this morning until I had already replied to myself. I actually was able to get CodeXL running last Friday with the reinstall described.

                           

                          Your message does spark further questions, though. When I go through the CodeXL menu in Visual Studio and set a breakpoint in clEnqueueNDRangeKernel, no source is visible after breaking. I don't know how I am supposed to look at the OpenCL context properties. I see nothing on the CodeXL menu that would allow me to do this.

                           

                          When I set the breakpoint through the Visual Studio Debug menu, I am not able to see context properties until I move up several levels in the call stack:

                           

                               OpenCL.dll!50501d40()    Unknown

                               [Frames below may be incorrect and/or missing, no symbols loaded for OpenCL.dll]   

                               AMDTTeaPot.exe!AMDTTeapotOCLSmokeSystem::compute(const AMDTTeapotRenderState & state, const Mat4 & modelTransformation, float deltaTimeSeconds, float deltaRot) Line 2121    C++

                          >    AMDTTeaPot.exe!AMDTTeapotOCLSmokeSystem::draw(const AMDTTeapotRenderState & state, const Mat4 & modelTransformation, float deltaRot) Line 1664    C++

                               AMDTTeaPot.exe!AMDTTeapotOGLCanvas::paintWindow() Line 413    C++

                               AMDTTeaPot.exe!AMDTTeapotOGLCanvas::onPaint() Line 154    C++

                               AMDTTeaPot.exe!MainWin::onPaint() Line 138    C++

                               AMDTTeaPot.exe!WndProc(HWND__ * hWnd, unsigned int message, unsigned int wParam, long lParam) Line 444    C++

                               [External Code]

                           

                          From the calling context of the draw() method (in this particular break), I am able to examine the global _programs variable to get the list of devices. There is only one, and it is "Tahiti". I am not able to examine _programs from any method above draw() in the call stack.

                           

                          Since both 3A and 3B indicate "Tahiti" (and did previously before my complete reinstall), I don't think that device selection was my problem.

                           

                           

                          Regards,

                           

                          Jeff

                        • Re: Can't Debug Kernel in Teapot Example
                          jsnyder

                          I asked my coworker about their installation. They indicated that they did not have AMD APP SDK installed. I then uninstalled all AMD software (CodeXL 1.6, AMD APP SDK 2.9.1, Catalyst). I then installed Catalyst from "amd-catalyst-omega-14.12-with-dotnet45-win8.1-64bit.exe" and CodeXL from "AMD_CodeXL_Win_1.6.7249.exe". I did NOT reinstall AMD APP SDK.

                           

                          I am now able to stop at breakpoints in kernels. CodeXL seems to be functional for me. I noticed that the date on "C:\Windows\SysWOW64\OpenCl.dll" and "C:\Windows\System32\OpenCl.dll" is now "11/20/2014 9:33 PM". Prior to my reinstall, the date was sometime in 9/2014.

                           

                          I still don't know what was causing my problem. My best guess is that going from "OpenCL 1.2 AMD-APP (1573.4)" to "OpenCL 1.2 AMD-APP (1642.5)" is what fixed it.
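
In case it helps anyone comparing two machines, here is a small sketch for pulling the runtime build number out of a version string like the two above. Strings of this form are what clGetPlatformInfo() returns for CL_PLATFORM_VERSION; the query itself is left out so the snippet builds without the OpenCL headers, and the helper name runtimeBuild() is made up for this example.

```cpp
#include <string>

// Extracts the text inside the last pair of parentheses, e.g. "1573.4" from
// "OpenCL 1.2 AMD-APP (1573.4)". Returns an empty string if the input has
// no well-formed "(...)" part.
std::string runtimeBuild(const std::string& platformVersion) {
    const std::size_t open = platformVersion.rfind('(');
    const std::size_t close = platformVersion.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open) {
        return "";
    }
    return platformVersion.substr(open + 1, close - open - 1);
}
```

Comparing runtimeBuild() results from two installs is a quick way to confirm whether they are really running the same OpenCL runtime, which the DLL dates alone only hint at.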