
pszilard
Adept I

sprofile segfault

The sprofile binary shipped with CodeXL 1.9 segfaults while generating profile data at the end of each run. I do get a csv file (possibly truncated) and an atp file that is certainly truncated.

Command line:

sprofile --apitrace --tracesummary --occupancy --perfcounter -w $PWD -o prof $PATH_TO_BINARY

Backtrace:

Program received signal SIGSEGV, Segmentation fault.

0x0000000000524a57 in std::__detail::_List_node_base::_M_unhook() ()

(gdb) bt

#0  0x0000000000524a57 in std::__detail::_List_node_base::_M_unhook() ()

#1  0x000000000047e888 in CLAtpFilePart::MergeTimestamp(std::string const&, std::map<std::string, std::list<GPUTimestamp, std::allocator<GPUTimestamp> >, std::less<std::string>, std::allocator<std::pair<std::string const, std::list<GPUTimestamp, std::allocator<GPUTimestamp> > > > >&, std::vector<CLAPIInfo*, std::allocator<CLAPIInfo*> >&) ()

#2  0x0000000000480097 in CLAtpFilePart::UpdateTmpTimestampFiles(std::string const&, std::string const&) ()

#3  0x0000000000480ae6 in CLAtpFilePart::WriteContentSection(std::basic_ofstream<char, std::char_traits<char> >&, std::string const&, std::string const&) ()

#4  0x0000000000497061 in AtpFileWriter::SaveToAtpFile() ()

#5  0x0000000000416b7a in MergeTraceFile(int) [clone .isra.160] ()

#6  0x0000000000417359 in MergeFragFiles(int) ()

#7  0x000000000041a99f in main ()

Is there a better place (e.g. a proper bug tracker) to report bugs?

chesik
Staff

Do you see this with any application (like perhaps one of the APP SDK samples) or is it specific to your particular application?

Does this only happen when you include both --apitrace and --perfcounter on the sprofile command line? If you omit one of those switches, does it make a difference?
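To narrow down which switch is responsible, one option is to generate a command line for every combination of the switches from the original report and run them one at a time. The following is only a sketch: the binary name `./my_app` and the helper names are placeholders, the switches are the ones quoted in the first post, and some combinations may not be meaningful to sprofile.

```python
from itertools import combinations

# Switches taken from the command line in the original report.
FLAGS = ["--apitrace", "--tracesummary", "--occupancy", "--perfcounter"]


def flag_subsets(flags):
    """Yield every non-empty subset of flags, smallest subsets first."""
    for size in range(1, len(flags) + 1):
        for subset in combinations(flags, size):
            yield list(subset)


def command_lines(binary, workdir, out_prefix):
    """Yield one sprofile argument list per subset, each with its own output name."""
    for i, subset in enumerate(flag_subsets(FLAGS)):
        yield ["sprofile", *subset, "-w", workdir, "-o", f"{out_prefix}{i}", binary]


if __name__ == "__main__":
    # "./my_app" is a hypothetical binary used only for illustration.
    for cmd in command_lines("./my_app", ".", "prof"):
        print(" ".join(cmd))
```

Running the printed commands in order and noting the first one that crashes should show whether a single switch or a particular pair is involved.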

Chris


> Do you see this with any application (like perhaps one of the APP SDK samples) or is it specific to your particular application?

Not sure because I started to get a strange error:

sprofile --perfcounter -w /nethome/pszilard/tools/amd-appsdk_2.9_samples/opencl/cl/BufferBandwidth/bin/x86_64/Release -o prof /nethome/pszilard/tools/amd-appsdk_2.9_samples/opencl/cl/BufferBandwidth/bin/x86_64/Release/BufferBandwidth

AMD CodeXL GPU Profiler V3.1.10132 is Enabled

*** Error in `/nethome/pszilard/tools/amd-appsdk_2.9_samples/opencl/cl/BufferBandwidth/bin/x86_64/Release/BufferBandwidth': free(): invalid pointer: 0x00007fe83fe777b8 ***

> Does this only happen when you include both --apitrace and --perfcounter on the sprofile command line? If you omit one of those switches, does it make a difference?

I think so; I was able to profile with only "--occupancy --perfcounter" earlier today.


Actually, it looks like whichever program I try to profile and whichever combination of sprofile command line arguments I use, as soon as I pass "-p/--perfcounter" I now always get the error above, e.g. with the PrefixSum SDK example:

$ sprofile -p ./PrefixSum

AMD CodeXL GPU Profiler V3.1.10132 is Enabled

*** Error in `/nethome/pszilard/tools/amd-appsdk_2.9_samples/opencl/cl/PrefixSum/bin/x86_64/Release/./PrefixSum': free(): invalid pointer: 0x00007f4cae8cf7b8 ***

Failed to generate profile result /nethome/pszilard/Session1.csv.

The pointers it complains about seem to be quite close to each other, e.g. with another application I got 0x00007fb407f487b8.
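To make "close to each other" concrete, here is a quick check of the three addresses quoted so far. It is only a sketch and it assumes 4 KiB pages (the usual x86_64 Linux default); the variable and function names are mine.

```python
# Addresses copied from the free() error messages quoted in this thread.
reported = [
    0x00007FE83FE777B8,  # BufferBandwidth
    0x00007F4CAE8CF7B8,  # PrefixSum
    0x00007FB407F487B8,  # another application
]

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_offset(addr: int) -> int:
    """Return the offset of addr within its page."""
    return addr % PAGE_SIZE


offsets = {hex(page_offset(a)) for a in reported}
print(offsets)  # prints {'0x7b8'}
```

All three addresses share the page offset 0x7b8 and differ only in the upper bits. That would be consistent with the same object in a shared library being loaded at a different base address on each run, although the error messages alone do not prove it.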

This is really weird, as I was able to generate profile data on the same machine, with the same driver and the same hardware, about a week ago.


I figured this out and it is not pretty. When I connect to the remote headless Linux server (Ubuntu 14.04, no X running) without X forwarding, sprofile always crashes with -p; when I connect with X forwarding, it seems to run fine.
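As a stopgap, a small pre-flight check can warn before a -p run is started from a session without X forwarding. This is a minimal sketch: it assumes that an unset DISPLAY variable is a good enough proxy for "no X forwarding", and `perfcounter_preflight` is a hypothetical helper, not part of CodeXL.

```python
import os


def perfcounter_preflight(env=None):
    """Return a warning string if a -p run looks likely to crash, else None.

    Assumption: an empty or missing DISPLAY means the session has no X forwarding.
    """
    env = os.environ if env is None else env
    if not env.get("DISPLAY"):
        return (
            "DISPLAY is not set; sprofile -p crashed in this situation. "
            "Try reconnecting with X forwarding (ssh -X)."
        )
    return None


if __name__ == "__main__":
    warning = perfcounter_preflight()
    if warning:
        print(warning)
```

The check only inspects the environment and does not verify that the X connection actually works, so it can miss a DISPLAY that is set but stale.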

IMO bugs like this are preventable and should be caught by beta testing; it is very concerning that they are not. Is there a beta program for CodeXL?

Here's the backtrace, just in case...

(gdb) r

Starting program: /tmp/PrefixSum

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

AMD CodeXL GPU Profiler V3.1.10132 is Enabled

*** Error in `/tmp/PrefixSum': free(): invalid pointer: 0x00007ffff73ac7b8 ***

Program received signal SIGABRT, Aborted.

0x00007ffff7024cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

(gdb) bt

#0  0x00007ffff7024cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56

#1  0x00007ffff70280d8 in __GI_abort () at abort.c:89

#2  0x00007ffff7061394 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7ffff716fb28 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175

#3  0x00007ffff706d66e in malloc_printerr (ptr=<optimized out>, str=0x7ffff716bc19 "free(): invalid pointer", action=1) at malloc.c:4996

#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840

#5  0x00007ffff7e3880d in ?? () from /usr/lib/libatiadlxx.so

#6  0x00007ffff7e48605 in ADL2_Main_Control_Destroy () from /usr/lib/libatiadlxx.so

#7  0x00007fffedefe1c7 in AMDTADLUtils::Unload() () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#8  0x00007fffedefe568 in AMDTADLUtils::LoadAndInit() () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#9  0x00007fffedefd4d7 in AMDTADLUtils::GetAsicInfoList(std::vector<ADLUtil_ASICInfo, std::allocator<ADLUtil_ASICInfo> >&) () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#10 0x00007fffedeabce5 in CLGPAProfiler::Init(Parameters const&, std::string&) () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#11 0x00007fffedea1ef7 in InitProfiler() () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#12 0x00007fffedea1455 in clAgent_OnLoad () from /opt/tcbsys/amd/codexl/1.9.10132/x86_64/libCLProfileAgent.so

#13 0x00007ffff32c819d in ?? () from /usr/lib/libamdocl64.so

#14 0x00007ffff32c8f77 in ?? () from /usr/lib/libamdocl64.so

#15 0x00007ffff32d8305 in ?? () from /usr/lib/libamdocl64.so

#16 0x00007ffff32a8e53 in clIcdGetPlatformIDsKHR () from /usr/lib/libamdocl64.so

#17 0x00007ffff7bd55b2 in ?? () from /opt/tcbsys/amd/appsdk/2.9/lib/x86_64/libOpenCL.so.1

#18 0x00007ffff7bd7986 in ?? () from /opt/tcbsys/amd/appsdk/2.9/lib/x86_64/libOpenCL.so.1

#19 0x00007ffff6ddda90 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103

#20 0x00007ffff7bd7747 in ?? () from /opt/tcbsys/amd/appsdk/2.9/lib/x86_64/libOpenCL.so.1

#21 0x00007ffff7bd57a5 in ?? () from /opt/tcbsys/amd/appsdk/2.9/lib/x86_64/libOpenCL.so.1

#22 0x00007ffff7bd6f20 in clGetPlatformIDs () from /opt/tcbsys/amd/appsdk/2.9/lib/x86_64/libOpenCL.so.1

#23 0x0000000000412622 in appsdk::CLCommandArgs::validatePlatformAndDeviceOptions (this=0x622010) at /opt/tcbsys/amd/appsdk/2.9/include/SDKUtil/CLUtil.hpp:1121

#24 0x000000000041235e in appsdk::CLCommandArgs::parseCommandLine (this=0x622010, argc=1, argv=0x7fffffffe4a8) at /opt/tcbsys/amd/appsdk/2.9/include/SDKUtil/CLUtil.hpp:1103

#25 0x000000000040f059 in main (argc=1, argv=0x7fffffffe4a8) at /nethome/pszilard/tools/amd-appsdk_2.9_samples/opencl/cl/PrefixSum/PrefixSum.cpp:621
