
ACML and CentOS 5

Question asked by aciani1@uic.edu on Jan 10, 2013
Latest reply on Jan 11, 2013 by aciani1@uic.edu

I am getting unexpected results and segmentation faults with ACML 4.4.0 (gfortran64) on a CentOS 5.8 system, compiling with either gcc 4.1.2 or gcc 4.4.6.

 

I have done some debugging, and it appears that the stack has been overwritten by the time the call into ACML returns.

 

For example, here is the stack immediately before and after a call to dspsv:

Breakpoint 2, main (argc=17, argv=0x7fffffffe738) at sw_fit.c:209
209             dspsv(UPLO, Njac, 1, JTWJ, &i, JTWDY, Njac, &j);
(gdb) x /20xg $rsp
0x7fffffffe5a0: 0x00002aaaacc68000      0x0000003028e0cf65
0x7fffffffe5b0: 0x00007fffffffe738      0x0000001100000000
0x7fffffffe5c0: 0x0000000000000003      0x0000000000609010
0x7fffffffe5d0: 0x0000000000608ce0      0x0000001000000003
0x7fffffffe5e0: 0x00007fff00000007      0x0000002a00000010
0x7fffffffe5f0: 0x7fffffffffffffff      0x550000302901cbc0
0x7fffffffe600: 0x0000000000609030      0x000000000060afc0
0x7fffffffe610: 0x000000000062b430      0x000000000062c940
0x7fffffffe620: 0x000000000060f140      0x0000000000608c00
0x7fffffffe630: 0x0000000000616820      0x00000000006265b0
(gdb) cont
Continuing.

 

Breakpoint 3, main (argc=17, argv=0x7fffffffe738) at sw_fit.c:211
211             res = 0.0;
(gdb) x /20xg $rsp
0x7fffffffe5a0: 0x00002aaa0000002a      0x00007fffffffe5d8
0x7fffffffe5b0: 0x00007fffffffe738      0x0000001100000000
0x7fffffffe5c0: 0x0000000000000003      0x0000000000609010
0x7fffffffe5d0: 0x0000000000608ce0      0x000000010000002a
0x7fffffffe5e0: 0x0000000300000002      0x0000000500000004
0x7fffffffe5f0: 0x0000000700000006      0x0000000900000008
0x7fffffffe600: 0x0000000b0000000a      0x0000000d0000000c
0x7fffffffe610: 0x0000000f0000000e      0x0000001100000010
0x7fffffffe620: 0x0000001300000012      0x0000001500000014
0x7fffffffe630: 0x0000001700000016      0x0000001900000018

 

The stack appears to have been overwritten with a run of consecutive 32-bit integers (1, 2, 3, ...). I have also seen other strange behavior, such as the values of numerical constants changing. For example, LDA might be 4 before a call to ACML and 2 afterward.
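
To make the pattern easier to see, the second dump can be decoded as 32-bit integers. The following is only a minimal sketch and is not part of sw_fit.c: the names post_call, split_qwords and counts_up_from_one are hypothetical, and it assumes the little-endian layout of x86-64, where the low half of each quadword sits at the lower address.

```c
#include <stddef.h>
#include <stdint.h>

/* Quadwords copied from the second dump, 0x7fffffffe5d8 through 0x7fffffffe638. */
static const uint64_t post_call[] = {
    0x000000010000002aULL,
    0x0000000300000002ULL, 0x0000000500000004ULL,
    0x0000000700000006ULL, 0x0000000900000008ULL,
    0x0000000b0000000aULL, 0x0000000d0000000cULL,
    0x0000000f0000000eULL, 0x0000001100000010ULL,
    0x0000001300000012ULL, 0x0000001500000014ULL,
    0x0000001700000016ULL, 0x0000001900000018ULL
};
enum { N_QWORDS = sizeof post_call / sizeof post_call[0] };

/* Split n quadwords into 2*n 32-bit ints, low half first
 * (assumption: little-endian, as on x86-64). */
static void split_qwords(const uint64_t *q, size_t n, int32_t *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[2 * i]     = (int32_t)(q[i] & 0xffffffffu);
        out[2 * i + 1] = (int32_t)(q[i] >> 32);
    }
}

/* Return 1 if w[0..n-1] is exactly 1, 2, 3, ..., n; otherwise 0. */
static int counts_up_from_one(const int32_t *w, size_t n)
{
    size_t k;
    for (k = 0; k < n; k++) {
        if (w[k] != (int32_t)(k + 1))
            return 0;
    }
    return 1;
}
```

Decoded this way, the word at 0x7fffffffe5d8 is 42, and the words from 0x7fffffffe5dc onward are 1, 2, 3, and so on, reaching 25 at the end of the dump.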

 

The problem occurs in multiple programs, from simple command-line tools that solve small systems of linear equations to density functional theory codes.

 

The same behavior occurs with ACML 3.6.0, and with both the static and shared libraries.
