Hi,
I was quite disappointed to find that the AMD Firestream drivers require that an X server be running. It seems pretty arrogant for AMD to market a card for HPCC purposes, then create such a hurdle for HPCC users who usually don't have Xorg installed in the first place.
To make matters worse, it seems that this X server dependency is causing problems that make our four AMD Firestream 9270s completely useless in our production environment.
Observe the following:
20:16 astram@sailfish:~$ let i=1; while [[ $rc -eq 0 ]]; do echo "Iteration #$i"; /opt/firestream/bin/x86_64/fglrxinfo ; rc=$?; let i++; done &> out
20:23 astram@sailfish:~$ head out
Iteration #1
display: :0.0 screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD FireStream 9270
OpenGL version string: 3.3.11631 Compatibility Profile Context FireGL
display: :0.0 screen: 1
OpenGL vendor string: Advanced Micro Devices, Inc.
20:23 astram@sailfish:~$ tail -n15 out
Iteration #234
display: :0.0 screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD FireStream 9270
OpenGL version string: 3.3.11631 Compatibility Profile Context FireGL
display: :0.0 screen: 1
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD FireStream 9270
OpenGL version string: 3.3.11631 Compatibility Profile Context FireGL
Iteration #235
Maximum number of clients reachedMaximum number of clients reachedMaximum number of clients reachedMaximum number of clients reachedError: unable to open display (null)
If these were instances of an OpenCL program instead of fglrxinfo, I would no longer be able to compute on the AMD GPUs until Xorg was restarted.
It seems that with each instance of fglrxinfo (or anything that queries a GPU, for that matter), Xorg is prompted to open "/dev/ati/card0", but won't release it even when the program is done executing. Thus the "maximum number of clients" limit is hit pretty quickly.
Hence, after 235 iterations, I have something like this:
-su-4.1# lsof -p `pidof Xorg` | tail
Xorg 22099 root 246u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 247u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 248u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 249u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 250u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 251u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 252u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 253u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 254u     CHR 251,0 0t0   8270 /dev/ati/card0
Xorg 22099 root 255u netlink       0t0 171989 KOBJECT_UEVENT
-su-4.1#
-su-4.1# /tmp/lsof -p `pidof Xorg` | grep '/dev/ati/' | wc -l
245
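For anyone without lsof installed, the same count can be approximated from /proc. This is only a sketch: it assumes a Linux /proc filesystem and bash, and the function name count_fds_for_path is made up for illustration. It counts the symlinks in a directory (such as /proc/<pid>/fd) that point at a given path.

```shell
#!/usr/bin/env bash
# count_fds_for_path DIR TARGET  (hypothetical helper, not part of any driver package)
# Counts the symlinks in DIR that point at TARGET. With DIR set to
# /proc/<pid>/fd this is the number of descriptors the process holds on TARGET.
count_fds_for_path() {
    local fd_dir=$1 target=$2 count=0 link
    for link in "$fd_dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [[ -L $link ]] || continue
        if [[ $(readlink "$link") == "$target" ]]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Run as root, something like count_fds_for_path "/proc/$(pidof Xorg)/fd" /dev/ati/card0 should report roughly the same number as the lsof pipeline above.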
This is with the latest fglrx drivers, and X.Org X Server 1.9.3.
Any ideas are greatly appreciated.
--Alex
Does this show up with clinfo, or any other OpenCL application? Or only fglrxinfo?
Yes, it occurs with clinfo, as well as any OpenCL application that I write.
I repeated the same experiment above, starting with a fresh instance of Xorg, and using clinfo instead of fglrxinfo. It is not long before the GPU device disappears (CPU device remains), with "Maximum number of clients reached" being printed to stderr.
And of course,
-su-4.1# lsof -p `pidof Xorg` | grep '/dev/ati/' | wc -l
245
-su-4.1#
Also, there seem to be a few other people who have had this problem: http://devgurus.amd.com/thread/149964
This can be fixed by starting X with the -noreset option. One of our developers starts X with the following command to fix the issue:
nohup /usr/bin/X11/X -ac -noreset &
This is a solution only if X is used solely for running OpenCL applications, not if you also use X for other purposes.
Great, I got through 800+ iterations and:
-su-4.1# lsof -p `pidof Xorg` | grep '/dev/ati/' | wc -l
11
So it appears that this solved it. Thanks.
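To repeat this kind of check with a fixed upper bound rather than an open-ended loop, a small bash helper along these lines could be used. It is a sketch only: the name run_until_failure is hypothetical, and it assumes the command being tested exits non-zero once the display can no longer be opened, as fglrxinfo did in the first post.

```shell
#!/usr/bin/env bash
# run_until_failure MAX CMD [ARGS...]  (hypothetical helper)
# Runs CMD up to MAX times, discarding its output, and prints the number of
# runs that succeeded before the first failure.
# Returns 0 if all MAX runs succeeded, 1 if a run failed.
run_until_failure() {
    local max=$1 i
    shift
    for ((i = 1; i <= max; i++)); do
        if ! "$@" >/dev/null 2>&1; then
            echo "$((i - 1))"
            return 1
        fi
    done
    echo "$max"
    return 0
}
```

For example, run_until_failure 800 /opt/firestream/bin/x86_64/fglrxinfo prints 800 if every run succeeds, and the descriptor count can then be checked with the lsof command above.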
Hi!
Could someone help me make Ubuntu start X with the options above by default? I cannot find where exactly the default init chain invokes X, nor can I find ServerFlags in xorg.conf that correspond to these command-line switches. I intend to use the machine specified here for remote OpenCL computations (with multiple users) alongside OpenGL (via VirtualGL). I get rather lost in the syntax of the X startup scripts and all the flavours of login managers that invoke (or disregard) custom scripts.