Similar issues to case (2) here. Same error message on the shell but no GUI error message.
I can definitely run the example problem and step into kernel code. However, CodeXL freezes when I do the same with my application. If I let it run, it unfreezes, but then it's not possible to stop the debugging process.
I am running Ubuntu 14.04 with the latest AMD drivers (Catalyst 14.4 rev2) and OpenCL SDK (1214.3) on an AMD Cedar GPU. Everything was installed from the tarballs.
I managed to at least hit breakpoints when linking against the libOpenCL provided by the drivers. As described in the release notes, "GPU Debugger breakpoints are not hit for Linux OpenCL applications that are built using DT_RPATH to locate the OpenCL runtime." It turns out that's exactly my case, since the output of readelf -d is the following:
Dynamic section at offset 0x77d98 contains 33 entries:
  Tag                Type            Name/Value
  0x0000000000000001 (NEEDED)        Shared library: [libopencv_core.so.2.4]
  0x0000000000000001 (NEEDED)        Shared library: [libopencv_highgui.so.2.4]
  0x0000000000000001 (NEEDED)        Shared library: [libopencv_imgproc.so.2.4]
  0x0000000000000001 (NEEDED)        Shared library: [libboost_system.so.1.53.0]
  0x0000000000000001 (NEEDED)        Shared library: [libboost_filesystem.so.1.53.0]
  0x0000000000000001 (NEEDED)        Shared library: [libOpenCL.so.1]
  0x0000000000000001 (NEEDED)        Shared library: [libstdc++.so.6]
  0x0000000000000001 (NEEDED)        Shared library: [libgcc_s.so.1]
  0x0000000000000001 (NEEDED)        Shared library: [libc.so.6]
  0x000000000000000f (RPATH)         Library rpath: [/usr/lib/fglrx:/home/daniele/ieiit/workspace/hand_pose_recognition/branches/new_rf_optimized/random_forests]
  0x000000000000000c (INIT)          0x441638
  0x000000000000000d (FINI)          0x464674
  0x0000000000000019 (INIT_ARRAY)    0x677d78
  0x000000000000001b (INIT_ARRAYSZ)  16 (bytes)
  0x000000000000001a (FINI_ARRAY)    0x677d88
  0x000000000000001c (FINI_ARRAYSZ)  8 (bytes)
  0x000000006ffffef5 (GNU_HASH)      0x400298
  0x0000000000000005 (STRTAB)        0x40fdc8
  0x0000000000000006 (SYMTAB)        0x403918
  0x000000000000000a (STRSZ)         194202 (bytes)
  0x000000000000000b (SYMENT)        24 (bytes)
  0x0000000000000015 (DEBUG)         0x0
  0x0000000000000003 (PLTGOT)        0x678000
  0x0000000000000002 (PLTRELSZ)      3888 (bytes)
  0x0000000000000014 (PLTREL)        RELA
  0x0000000000000017 (JMPREL)        0x440708
  0x0000000000000007 (RELA)          0x4405b8
  0x0000000000000008 (RELASZ)        336 (bytes)
  0x0000000000000009 (RELAENT)       24 (bytes)
  0x000000006ffffffe (VERNEED)       0x4404c8
  0x000000006fffffff (VERNEEDNUM)    5
  0x000000006ffffff0 (VERSYM)        0x43f462
  0x0000000000000000 (NULL)          0x0
Removing the RPATH entry with the chrpath tool allows the debugger to hit the OpenCL API breakpoints, but I'm still not able to step into kernel code (same error message as in the first post).
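For anyone following along, the check and the workaround described here can be sketched in shell. This is only an illustration: `rpath_kind` is a hypothetical helper name and `./my_app` is a placeholder for the real binary. The helper reads `readelf -d` output on stdin and reports which search-path tag it finds.

```shell
#!/bin/sh
# rpath_kind (hypothetical helper): read `readelf -d` output on stdin and
# report whether the binary carries DT_RPATH, DT_RUNPATH, or neither.
rpath_kind() {
    out=$(cat)
    if printf '%s\n' "$out" | grep -q '(RPATH)'; then
        echo "DT_RPATH"
    elif printf '%s\n' "$out" | grep -q '(RUNPATH)'; then
        echo "DT_RUNPATH"
    else
        echo "none"
    fi
}

# Typical use (./my_app is a placeholder):
#   readelf -d ./my_app | rpath_kind
#   chrpath -l ./my_app    # list the current RPATH/RUNPATH setting
#   chrpath -d ./my_app    # delete it in place
```

Note that `chrpath -d` edits the binary in place, so it may be worth keeping a copy of the original until the debugger behaviour has been confirmed.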
1. Beat me to the RPATH part. I'll just add that you can also avoid this issue entirely (without resorting to chrpath) by adding "--enable-new-dtags" to your own application's linker flags ("-Wl,--enable-new-dtags" when linking through gcc/g++).
This is actually recommended in general, since DT_RPATH is deprecated and DT_RUNPATH is the replacement.
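A quick way to see what the flag does is to link a throwaway program and look at the dynamic section. The sketch below is an assumption-laden illustration, not a recipe for any particular build: `show_dtag` is a hypothetical helper, the program is an empty `main`, and `/usr/lib/fglrx` is simply the driver path from the readelf output earlier in the thread. It assumes `cc` (GCC or Clang driving a GNU-style linker) and `readelf` are installed.

```shell
#!/bin/sh
# show_dtag (hypothetical helper): link an empty program with an rpath and
# --enable-new-dtags, then print the search-path entry the linker wrote.
show_dtag() {
    dir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$dir/main.c"
    # The rpath directory does not have to exist at link time.
    if cc "$dir/main.c" -o "$dir/app" \
          -Wl,--enable-new-dtags -Wl,-rpath,/usr/lib/fglrx; then
        readelf -d "$dir/app" | grep -E '\((RPATH|RUNPATH)\)'
    fi
    rm -rf "$dir"
}

# show_dtag    # should list a (RUNPATH) entry rather than (RPATH)
```

If the build goes through CMake, passing the same `-Wl,--enable-new-dtags` flag via `CMAKE_EXE_LINKER_FLAGS` should have the same effect.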
2. Regarding debugging the kernel itself, I could not find any telling hints in the logs.
2A. Are you able to debug the kernels in the SDK samples? The CodeXL teapot sample?
2B. If the other samples work, it might be that the issue is with the kernel itself not being debuggable (the message you've shown in your screenshot is the generic "something went wrong" message for that case, so it's hard to tell what happened). Common culprits are atomic operations and printf. Could you share the kernel sources?
urishomroni, thank you for the RPATH hint. Actually, CMake adds it by default each time I specify a link path for the OpenCL shared library ...
Regarding the kernel debugging, I'm able to step into kernels when running the Teapot sample. I've also tried another program developed in the past months (e.g. the classifier.cl attached to the message) and it can be debugged with no issues. The kernel code that is causing the problem is attached to the message. The feat_type.cl file is dynamically generated, saved in the /tmp directory and included using the -I compilation flag. There are no printfs or atomic operations. Note that all the CL sources can be debugged using the older CodeXL 1.3 release with the 13.101 driver packaged in the Ubuntu repositories. I hope this helps to track down the problem.
Thanks for sharing your sources.
We will try to reconstruct the issue in our labs.
In the meantime, try commenting out parts of the kernel to see if any specific area is the culprit. It might also clue you (and us) in to where the issue lies.