elmar
Journeyman III

libamdocl32.so seems to prevent debugging of SSE exceptions

Hi,

To find the locations in my SSE/AVX code where something goes wrong, I enable various exceptions in the MXCSR register using the ldmxcsr instruction.

When my code then triggers the exception, my program segfaults as expected.
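
For readers unfamiliar with the technique, unmasking SSE exceptions might look roughly like the sketch below. It uses the `_mm_getcsr`/`_mm_setcsr` intrinsics from `<xmmintrin.h>` instead of a raw ldmxcsr, and the helper name `unmask_sse_exceptions` is hypothetical, not taken from the original code. It assumes an x86 target with SSE.

```c
#include <xmmintrin.h>

/* Hypothetical helper: unmask (enable) the SSE exceptions selected by
 * mask_bits, e.g. _MM_MASK_DIV_ZERO | _MM_MASK_INVALID.
 * Returns the previous MXCSR value so the caller can restore it later. */
static unsigned int unmask_sse_exceptions(unsigned int mask_bits)
{
    unsigned int old = _mm_getcsr();

    /* Clear the selected mask bits (a cleared mask bit means the exception
     * is delivered) and also clear the sticky exception flags. */
    _mm_setcsr((old & ~mask_bits) & ~_MM_EXCEPT_MASK);
    return old;
}
```

Calling `unmask_sse_exceptions(_MM_MASK_DIV_ZERO | _MM_MASK_INVALID)` early in the program should make the next SSE division by zero or invalid operation raise a signal at the faulting instruction; passing the returned value to `_mm_setcsr` restores the previous state.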

Unfortunately it is then not possible to find the location of the crash with GDB, because libamdocl32.so installs its own exception handler and seems to corrupt the stack, so that GDB can't show me which part of my code triggered the exception:

(gdb) where
#0  0xf778b430 in __kernel_vsyscall ()
#1  0x007c6b11 in raise () from /lib/libc.so.6
#2  0x007c83ea in abort () from /lib/libc.so.6
#3  0xc916697e in amd::divisionErrorHandler(int, siginfo*, void*) () from /usr/lib/libamdocl32.so
#4  0xcb1e45e0 in ?? () from /usr/lib/libamdocl32.so

Backtrace stopped: previous frame inner to this frame (corrupt stack?)

My current workaround is to hunt down the exceptions on a system with an NVIDIA card, because the NVIDIA driver doesn't cause this problem.

Any other ideas?

Thanks for your help,

Elmar

5 Replies
elmar
Journeyman III

And another sign of corruption:

When my program segfaults, the following message appears in the terminal:

Unhandled signal Unhanin divdledisionE sirrorgnHandal iler()                                               
nn divisionErrorHandler()

;-))

himanshu_gautam
Grandmaster

Can you please explain in detail how to reproduce this issue? Can you attach a small test case as a zip file?

Anyway, I will ask someone more knowledgeable about this.

Thanks for your reply. Unfortunately I don't have the time to come up with a minimal test case, but I am confident that the description is clear enough that you only need to send it to the person in charge of amd::divisionErrorHandler in libamdocl32.so, who will know what it's about.

BTW, the OS is CentOS 6.4 64-bit.

uname -a

Linux yasara1 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed Jun 12 03:34:52 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

And the AMD driver version:

glxinfo | grep OpenGL

OpenGL vendor string: Advanced Micro Devices, Inc.

OpenGL renderer string: AMD Radeon HD 6700 Series

OpenGL version string: 4.2.12217 Compatibility Profile Context 12.104

OpenGL shading language version string: 4.20

I have forwarded this to the driver team. Thanks for your help.

Hi elmar,

The issue does not seem to be reproducible on our end. It has probably already been fixed in the internal drivers, although this cannot be verified without a test case from you. I am therefore treating this thread as assumed answered.
