I'm trying the FFT sample on Ubuntu 11.04 with an FX-8150 and four FirePro 3D V5800 cards.
My hardware:
Motherboard: MSI FX990-GD80
CPU: AMD FX 8150 3.6 GHz
RAM: 32 GB
GPUs: 4x FirePro 3D V5800
OS: Ubuntu 11.04
Drivers and SDK:
FirePro_8.911.3.1_Linux_X32X64_132092.zip
AMD-APP-SDK-v2.6-lnx64.tgz
The FFT sample runs fine on the GPUs:
valerio@hotshot:~$ /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device gpu
Platform 0 : Advanced Micro Devices, Inc.
Original Input Real
15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168
Original Input Img
0.0600514 0.788318 0.203068 0.348563 0.361609 0.134639 0.375968 0.259322 0.0443163 0.879562
Platform found : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : Juniper Device ID is 0x20fd5e0
Device 1 : Juniper Device ID is 0x28a5e30
Device 2 : Juniper Device ID is 0x2aced60
Device 3 : Juniper Device ID is 0x2a0a670
Executing kernel for 1 iterations
-------------------------------------------
Output real
131643 -1085.95 -997.15 -1791.52 532.118 1659.74 -166.271 969.692 1189.76 -862.707
Output img
514.23 2289.84 936.489 -603.839 699.7 1018.18 1900.06 795.439 -1328.03 -293.334
But it crashes on the cpu:
valerio@hotshot:~$ /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu
Platform 0 : Advanced Micro Devices, Inc.
Original Input Real
15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168
Original Input Img
0.0600514 0.788318 0.203068 0.348563 0.361609 0.134639 0.375968 0.259322 0.0443163 0.879562
Platform found : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : AMD FX(tm)-8150 Eight-Core Processor Device ID is 0x1c0c650
<inline asm>:1:2: error: invalid instruction mnemonic 'vfmaddsd'
vfmaddsd %xmm2, %xmm0, %xmm1, %xmm0
^
LLVM ERROR: Error parsing inline asm
It seems that the LLVM compiler, when run on a Bulldozer CPU, generates a valid instruction that the assembler does not recognize.
Is this a known problem? Am I doing something wrong?
Thanks,
Valerio
find /opt/AMDAPP/samples/opencl/bin/x86_64 -name as -print
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.10
DISTRIB_CODENAME=maverick
DISTRIB_DESCRIPTION="Ubuntu 10.10"
$ uname -a
Linux gcc0 2.6.35-22-generic #33-Ubuntu SMP Sun Sep 19 20:32:27 UTC 2010 x86_64 GNU/Linux
$ echo '.text
_start:
vfmaddsd %xmm0, %xmm1, %xmm2, %xmm3
' > tst.s
$ /usr/bin/as --version
GNU assembler (GNU Binutils for Ubuntu) 2.20.51-system.20100908
Copyright 2010 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-linux-gnu'.
$ /usr/bin/as -o tst.o tst.s
$ /usr/bin/objdump -d tst.o
tst.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
0: c4 e3 e9 6b d8 10 vfmaddsd %xmm0,%xmm1,%xmm2,%xmm3
Hi @valerioa,
I see that the AMD-APP-SDK-v2.6-lnx64.tar doesn't come with an assembler, so you'll have to determine what assembler is being called.
Perhaps your PATH is picking up an older assembler or you are missing a development package?
$ which as
/usr/bin/as
$ /usr/bin/as --version
GNU assembler (GNU Binutils for Ubuntu) 2.20.51-system.20100908
Copyright 2010 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-linux-gnu'.
$ dpkg-query --search /usr/bin/as
binutils: /usr/bin/as
$
$ dpkg-query --list binutils
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-===========================-===========================-======================================================================
ii binutils 2.20.51.20100908-0ubuntu2 The GNU assembler, linker and binary utilities
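A small sketch for the "older assembler on PATH" question: the function below prints every executable with a given name on PATH, in search order, so a shadowing copy is easy to spot. The name `list_in_path` is made up for this example and is not part of the SDK or binutils.

```shell
# Hypothetical helper: print every executable called "$1" found on PATH,
# in the order the directories are searched.
list_in_path() (
    name=$1
    IFS=:
    set -f    # do not glob-expand directory names while splitting PATH
    for dir in $PATH; do
        # An empty PATH entry stands for the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)
```

Running `list_in_path as` should list the same candidates that `which -a as` reports on systems where that option is available.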
Hi qneill,
Thank you for your replies. I believe that with the AMD-APP-SDK-v2.6-lnx64.tar version of the SDK, the assembler is embedded in one of the OpenCL libraries. You can check it with:
strace -f -e trace=execve -v /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu
execve("/opt/AMDAPP/samples/opencl/bin/x86_64/FFT", ["/opt/AMDAPP/samples/opencl/bin/x"..., "--device", "cpu"], ["TERM=xterm", "SHELL=/bin/bash", "XDG_SESSION_COOKIE=aba4fc779fbe0"..., "SSH_CLIENT=10.22.0.100 62778 22", "SSH_TTY=/dev/pts/0", "USER=valerio", "LS_COLORS=rs=0:di=01;34:ln=01;36"..., "LD_LIBRARY_PATH=:/opt/AMDAPP/lib"..., "MAIL=/var/mail/valerio", "PATH=/usr/local/sbin:/usr/local/"..., "AMDAPPSDKROOT=/opt/AMDAPP", "PWD=/home/valerio/log", "LANG=en_US.UTF-8", "SHLVL=1", "HOME=/home/valerio", "LOGNAME=valerio", "SSH_CONNECTION=10.22.0.100 62778"..., "LESSOPEN=| /usr/bin/lesspipe %s", "LESSCLOSE=/usr/bin/lesspipe %s %"..., "_=/usr/bin/strace", "OLDPWD=/home/valerio"]) = 0
Platform 0 : Advanced Micro Devices, Inc.
Original Input Real
15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168
Original Input Img
0.0600514 0.788318 0.203068 0.348563 0.361609 0.134639 0.375968 0.259322 0.0443163 0.879562
Platform found : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : AMD FX(tm)-8150 Eight-Core Processor Device ID is 0x3878f00
Process 5015 attached
Process 5016 attached
Process 5017 attached
Process 5018 attached
Process 5019 attached (waiting for parent)
Process 5019 resumed (parent 5015 ready)
Process 5020 attached
Process 5021 attached
Process 5022 attached (waiting for parent)
Process 5022 resumed (parent 5015 ready)
Process 5023 attached (waiting for parent)
Process 5023 resumed (parent 5015 ready)
<inline asm>:1:2: error: invalid instruction mnemonic 'vfmaddsd'
vfmaddsd %xmm2, %xmm0, %xmm1, %xmm0
^
LLVM ERROR: Error parsing inline asm
) = ? <unavailable>
No process is invoked with execve, so I suppose the assembler is embedded in the library. I was hoping the AMD APP SDK development team would chime in.
You can also check it with:
strace -o log -ff /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu
The GNU as on Ubuntu 11.04 supports that instruction, as you showed on 10.10.
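To make the execve check less error-prone than reading the trace by eye, here is a rough sketch that lists only the programs that were actually started. It assumes the usual strace line layout, where a successful call ends in `= 0`; the function name `execed_programs` is invented for this example.

```shell
# Hypothetical helper: print the path of every successful execve()
# recorded in the given strace log file(s).
execed_programs() {
    awk '
        # Successful calls end in " = 0"; failed PATH probes end in
        # " = -1 ENOENT (...)" and are skipped.
        /execve\("/ && / = 0$/ {
            # The program path is the first double-quoted string on the line.
            split($0, parts, "\"")
            print parts[2]
        }
    ' "$@"
}
```

With `strace -o log -ff ...` the output is split into one file per process, so the call would be `execed_programs log.*`. If `as` never shows up in the result, no external assembler was run.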
Hmm, when I do that, I see execve searching around in my PATH for 'as' like this...
$ strace -f -e trace=execve -v /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu 2>&1 | cut -c1-80
execve("/opt/AMDAPP/samples/opencl/bin/x86_64/FFT", ["/opt/AMDAPP/samples/opencl
Platform 0 : Advanced Micro Devices, Inc.
Original Input Real
15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168
Original Input Img
0.0600514 0.788318 0.203068 0.348563 0.361609 0.134639 0.375968 0.259322 0.04431
Platform found : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : AMD Engineering Sample Device ID is 0x1056250
Process 22988 attached (waiting for parent)
Process 22988 resumed (parent 22987 ready)
Process 22989 attached
Process 22990 attached (waiting for parent)
Process 22990 resumed (parent 22988 ready)
Process 22991 attached (waiting for parent)
Process 22991 resumed (parent 22988 ready)
Process 22992 attached (waiting for parent)
Process 22992 resumed (parent 22988 ready)
Process 22993 attached (waiting for parent)
Process 22993 resumed (parent 22987 ready)
Process 22987 suspended
[pid 22993] execve("/home/qneill/bin/as", ["as", "--64", "/tmp/OCL0C9tsr.s", "-o
[pid 22993] execve("/usr/local/sbin/as", ["as", "--64", "/tmp/OCL0C9tsr.s", "-o"
[pid 22993] execve("/usr/bin/as", ["as", "--64", "/tmp/OCL0C9tsr.s", "-o", "/tmp
Process 22987 resumed
Process 22993 detached
[pid 22987] --- SIGCHLD (Child exited) @ 0 (0) ---
Process 22994 attached (waiting for parent)
Process 22994 resumed (parent 22987 ready)
Process 22987 suspended
[pid 22994] execve("/home/qneill/bin/ld", ["ld", "-m", "elf_x86_64", "-shared",
[pid 22994] execve("/usr/local/sbin/ld", ["ld", "-m", "elf_x86_64", "-shared", "
[pid 22994] execve("/usr/bin/ld", ["ld", "-m", "elf_x86_64", "-shared", "/tmp/OC
Process 22987 resumed
Process 22994 detached
[pid 22987] --- SIGCHLD (Child exited) @ 0 (0) ---
Executing kernel for 1 iterations
-------------------------------------------
Process 22989 detached
Process 22990 detached
Process 22991 detached
Process 22992 detached
Process 22988 detached
Output real
131643 -1085.95 -997.15 -1791.52 532.119 1659.74 -166.27 969.692 1189.76 -862.70
Output img
514.23 2289.84 936.49 -603.839 699.7 1018.18 1900.06 795.439 -1328.03 -293.334
qneill,
Which CPU do you run this on? Is "AMD Engineering Sample Device ID is 0x1056250" an Athlon?
When I run it on an Intel CPU or a K8-family AMD, I can see the execve calls happening as well. But when I run it on an FX 8150, I don't see them anymore. I suspect that somewhere in clBuildProgram() there must be something like:
if (cpu == "Intel") {
    execve(...)
} else if (cpu == "AMD-k8") {
    execve(...)
} else if (cpu == "AMD-k10") {
    execve(...)
} else if (cpu == "AMD-15h") {
    run_embedded_llvm_compiler_and_as(...)
}
I ran it on an FX-8120 and saw the same behavior as before (execve of /usr/bin/as after trying other directories in PATH):
# head /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD FX(tm)-8120 Eight-Core Processor
I reached out to the OpenCL team inside AMD, and to @MicahVillmow on the forums. We should hear from them soon.
Message was edited by: Quentin Neill - added @MicahVillmow to the thread.
valerioa,
I got a response from an engineer internally. The CPU runtime uses the system assembler and the one on your system is not new enough to support FMA4 instructions. So, there are two options.
1) Update the system assembler to a newer version.
2) Pass the compiler option "-disable-avx" to clBuildProgram to disable AVX code generation.
Micah,
Thank you for your help. Would you kindly put me in touch with that engineer? The reason I'm asking is that the as I have actually does assemble AVX instructions:
valerio@hotshot:~$ /usr/bin/as --version
GNU assembler (GNU Binutils for Ubuntu) 2.21.0.20110327
Copyright 2011 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-linux-gnu'.
valerio@hotshot:~$ cat test.s
_start:
vfmaddsd %xmm0, %xmm1, %xmm2, %xmm3
valerio@hotshot:~$ /usr/bin/as -o test.o test.s
valerio@hotshot:~$ objdump -D test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
0: c4 e3 e9 6b d8 10 vfmaddsd %xmm0,%xmm1,%xmm2,%xmm3
And when I run one of the samples under ltrace or strace, I see no assembler being execve'd during the process. It almost appears as if there is an llvm-as embedded in the library?
Hi @valerioa,
Can you try "strace -f" without the -e and see if an assembler is being called via some system call other than execve()? Perhaps FFT is using another variant like plain exec().
Alternately, can you copy /usr/bin/as to the current directory and add PATH=.:$PATH and see if the problem goes away?
Just some thoughts.
Hello qneill,
and thank you very much for your willingness to help. I appreciate it very much. Some extra information:
1) I've installed the most recent binutils package from the GNU FTP site (binutils-2.22). The newest as is in the PATH:
valerio@hotshot:~$ which as
/usr/local/pkg/binutils-2.22/bin/as
but the net result is the same:
valerio@hotshot:~$ /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu
Platform 0 : Advanced Micro Devices, Inc.
Original Input Real
15.3732 201.81 51.9855 89.2322 92.572 34.4675 96.2478 66.3863 11.345 225.168
Original Input Img
0.0600514 0.788318 0.203068 0.348563 0.361609 0.134639 0.375968 0.259322 0.0443163 0.879562
Platform found : Advanced Micro Devices, Inc.
Selected Platform Vendor : Advanced Micro Devices, Inc.
Device 0 : AMD FX(tm)-8150 Eight-Core Processor Device ID is 0x2069580
<inline asm>:1:2: error: invalid instruction mnemonic 'vfmaddsd'
vfmaddsd %xmm2, %xmm0, %xmm1, %xmm0
^
LLVM ERROR: Error parsing inline asm
2) The error string "invalid instruction mnemonic '%s'" does not come from any as. If you put an invalid mnemonic in your test.s file, such as
valerio@hotshot:~$ cat test.s
_start:
vfmaddsdX %xmm0, %xmm1, %xmm2, %xmm3
the error coming from as would look very different:
test.s: Assembler messages:
test.s:2: Error: no such instruction: `vfmaddsdx %xmm0,%xmm1,%xmm2,%xmm3'
3) The string 'invalid instruction mnemonic' can only be found in /usr/lib/fglrx/libamdocl64.so (or its 32-bit counterpart), so I suspect that whatever attempts to assemble that code is in libamdocl64.so.
4) I did a full strace. I see no other forks or exec to run as.
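The search in point 3 can be repeated with a short sketch like the one below, which prints every file under a directory that contains a fixed string, binaries included. The function name `files_containing` is made up for this example, and the directory passed in is whatever you want to search.

```shell
# Hypothetical helper: print each regular file under directory "$1"
# whose contents include the fixed string "$2".
files_containing() {
    # -l prints only file names, -F treats the pattern as a literal string.
    find "$1" -type f -exec grep -l -F -- "$2" {} +
}
```

For example, `files_containing /usr/lib 'invalid instruction mnemonic'` would be one way to see which library carries the LLVM error message.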
Maybe I have the wrong libamdocl64.so?
Mine is:
valerio@hotshot:~$ ls -l /usr/lib/fglrx/libamdocl64.so
-rw-r--r-- 1 root root 22287280 2012-03-06 14:54 /usr/lib/fglrx/libamdocl64.so
valerio@hotshot:~$ md5sum /usr/lib/fglrx/libamdocl64.so
8ab626fcdd8fa44bdeec995f4034ecd7 /usr/lib/fglrx/libamdocl64.so
qneill, would you mind checking which library is loaded on your FX 8120? You can do so by running
strace -o log /opt/AMDAPP/samples/opencl/bin/x86_64/FFT --device cpu
and finding in log which libamdocl64.so library is loaded.
There should be a line with
open("/xxx/yyy/zzzz/libamdocl64.so", O_RDONLY) = <a positive integer number>
That's the library that is actually loaded.
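Picking that line out of a long log can be automated with a rough sketch like this one. It assumes the strace layout described above, where a successful open ends in a non-negative file descriptor; the function name `opened_library` is invented for this example.

```shell
# Hypothetical helper: from strace log "$1", print the full path of every
# successfully opened file whose base name is "$2".
opened_library() {
    awk -v lib="$2" '
        # Keep open()/openat() calls that returned a file descriptor;
        # failed attempts end in " = -1 ENOENT (...)" and are skipped.
        /open(at)?\(/ && / = [0-9]+$/ {
            # The path is the first double-quoted string on the line.
            n = split($0, parts, "\"")
            path = parts[2]
            base = path
            sub(/.*\//, "", base)
            if (n >= 3 && base == lib) print path
        }
    ' "$1"
}
```

Usage would be `opened_library log libamdocl64.so`; any path printed is a copy that the loader actually opened.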
Thanks,
Yep, that was it. I restored the library libamdocl64.so from AMD-APP-SDK-v2.6-RC3-lnx64 to
/opt/AMDAPP/lib/x86_64/
and the problem went away.
There must be an old version of libamdocl64.so that comes with the Ubuntu open source package fglrx that uses llvm/clang to compile the code. That library was getting in the way.
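For anyone hitting the same conflict, a minimal sketch for spotting differing copies of the library: it prints an MD5 checksum for every copy found under the directories you name. The function name `library_copies` is hypothetical, and the directories in the usage line are simply the two locations mentioned in this thread.

```shell
# Hypothetical helper: print the md5sum of every file named "$1"
# found under each of the remaining directory arguments.
library_copies() (
    name=$1
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        if [ -d "$dir" ]; then
            find "$dir" -type f -name "$name" -exec md5sum {} +
        fi
    done
)
```

For example: `library_copies libamdocl64.so /opt/AMDAPP/lib /usr/lib/fglrx`. Different checksums mean the copies differ, and the one the loader picks depends on the library search order.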
qneill, thank you very much for your help and ideas,
Valerio
No problem.
An interesting interaction that the OpenCL team should probably be tracking anyway.
Cheers,
--
Quentin
qneill,
You are right. As a matter of fact, I was wrong in thinking that the library /usr/lib/fglrx/libamdocl64.so came from the open source fglrx package. It actually comes from the Debian package generated from the fglrx-8.911.3.1 driver from the AMD website.
In the past, there must have been a version of libamdocl64.so from AMD that used the LLVM compiler for Bulldozer CPUs.
Details:
root@hotshot:~/ati# md5sum /usr/lib/fglrx/libamdocl64.so
8ab626fcdd8fa44bdeec995f4034ecd7 /usr/lib/fglrx/libamdocl64.so
root@hotshot:~/ati# dpkg -S /usr/lib/fglrx/libamdocl64.so
fglrx: /usr/lib/fglrx/libamdocl64.so
root@hotshot:~/ati# dpkg -l |grep fglrx
ii fglrx 2:8.911-0ubuntu1 Video driver for the AMD graphics accelerators
ii fglrx-amdcccle 2:8.911-0ubuntu1 Catalyst Control Center for the AMD graphics accelerators
ii fglrx-dev 2:8.911-0ubuntu1 Video driver for the AMD graphics accelerators (devel files)
Nice work. This will help others who might see similar issues, thanks for posting.