yurtesen
Miniboss

Support for headless GPU operation???

The SDK 2.5 release notes mention headless GPU operation. I have installed SDK 2.6 and Catalyst 11.11 revision 12.1 on Fedora 16, and clinfo shows the GPU only if I log into X.

If X is running (at the login screen) but I log in remotely via ssh only, then clinfo shows only CPU devices.

I searched the forums and it appears some people could get headless operation working and some couldn't. What am I doing wrong here? Any hints?

0 Likes
1 Solution
dmeiser
Elite

For remote login there are also permission issues to deal with. AMD has the following knowledge base article for setting up headless operation:

http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=19

There are also these instructions:

http://code.compeng.uni-frankfurt.de/projects/caldgemm/wiki/Headless_system

Hope that helps.

Dominic


14 Replies
nou
Exemplar

You need to export the COMPUTE=:0 variable and login user.

0 Likes

Setting the variable after logging in does not help. I am not sure what you meant by "login user"?

0 Likes
I already checked, and the permissions are probably OK:

crw-rw-rw-  1 root root 248, 0 Mar  7 22:54 card0

I checked the knowledge base article and there doesn't seem to be anything interesting in it. I even tried running the program as root to avoid any permission issues, and I set both DISPLAY=:0 and COMPUTE=:0.

I wonder why they couldn't make this easier. I wonder if AMD's developers even know how difficult it is to use AMD devices for OpenCL.

0 Likes

I feel your pain. I'm currently trying to set up a multi-GPU machine under CentOS 6 and I'm not having any luck either. At some point I was able to use two of the three GPUs when logged in locally. Then I tried to set it up for remote operation, and since then I can't use any of the GPUs, not even when logged in locally. Not sure how I broke it. I'll report back when I find out anything.

Cheers,

Dominic

0 Likes

You were right! I was missing

xhost +local:

in /etc/gdm/Init/Default (the link you gave says xhost +, but +local: seems more secure and works).

I also had to set COMPUTE=:0 in a script in the /etc/profile.d/ folder.
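For anyone following along, the two changes described in this post might look roughly like the sketch below. The file name compute.sh and the PREFIX staging variable are assumptions made for illustration; the gdm path and both settings are the ones given in the post. The files are staged in a scratch directory so they can be inspected before being copied into place as root.

```shell
#!/bin/sh
# Sketch only: PREFIX is a hypothetical staging directory, not a real
# system path. Nothing under the real /etc is touched.
PREFIX="${PREFIX:-$(mktemp -d)}"

mkdir -p "$PREFIX/etc/profile.d" "$PREFIX/etc/gdm/Init"

# 1) Make COMPUTE=:0 part of the login environment.
#    The file name compute.sh is an assumption.
cat > "$PREFIX/etc/profile.d/compute.sh" <<'EOF'
export COMPUTE=:0
EOF

# 2) Allow local (non-network) clients to connect to the X server
#    that gdm starts. Add the line only if it is not already present.
init_file="$PREFIX/etc/gdm/Init/Default"
touch "$init_file"
grep -qxF 'xhost +local:' "$init_file" || echo 'xhost +local:' >> "$init_file"

echo "staged files under $PREFIX"
```

If the distribution's existing Init/Default script ends with an exit statement, the xhost line has to go before it rather than at the very end of the file.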

Now I have the same problem on Ubuntu, but I am not sure which file to edit for lightdm. Any ideas?

0 Likes

To set up a multi-GPU configuration, do the following:

  • Verify that the AMD kernel module was successfully compiled and loaded: lsmod | grep fglrx
  • Run 'aticonfig --lsa' to verify that the driver sees all the devices.
  • With root privileges, run 'aticonfig --initial --adapter=all' and restart the X server.
  • Set the COMPUTE environment variable to ':0'.

The X server must be running for the runtime to enumerate the GPUs, whether you are logged in locally or remotely. We are working to separate OpenCL from X.
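The checklist above can be sketched as a small script. This is only an illustration: the APPLY switch and the helper names have() and run() are made up for this sketch, and the command that rewrites xorg.conf is only printed unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch of the multi-GPU checklist. APPLY, have() and run() are
# hypothetical helpers for this example, not part of any AMD tool.

have() { command -v "$1" >/dev/null 2>&1; }

run() {
    # Always print the command; execute it only when APPLY=1 and the
    # program is actually installed.
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ] && have "$1"; then "$@"; fi
}

# Step 1: is the fglrx kernel module loaded?
if have lsmod && lsmod | grep -q '^fglrx'; then
    echo "fglrx module is loaded"
else
    echo "fglrx module not found (or lsmod unavailable)"
fi

# Step 2: list the adapters the driver sees (read-only).
if have aticonfig; then aticonfig --lsa; fi

# Step 3 (needs root): generate an xorg.conf covering all adapters.
# The X server has to be restarted afterwards.
run aticonfig --initial --adapter=all

# Step 4: point the OpenCL runtime at the local X server.
export COMPUTE=:0
echo "COMPUTE=$COMPUTE"
```

Run it once as-is to see what would be executed, then again as root with APPLY=1. The X restart is left out on purpose because it depends on the distribution and display manager.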

0 Likes

# lsmod | grep fglrx

fglrx                3123620  51

# aticonfig --lsa

* 0. 80:00.0 ATI Radeon HD 5800 Series

* - Default adapter

# aticonfig --initial --adapter=all

Found fglrx primary device section

Using /etc/X11/xorg.conf

Saving back-up to /etc/X11/xorg.conf.fglrx-0

# ps ax |grep X

6880 tty7     Ss+    0:01 /usr/bin/X vt7 -background none -nolisten tcp -auth /var/run/kdm/A:0-QuNRtc

6958 pts/0    S+     0:00 grep --color=auto X

# systemctl stop prefdm.service

# ps ax |grep X

6967 pts/0    S+     0:00 grep --color=auto X

# systemctl start prefdm.service

# ps ax |grep X

6974 tty7     Ss+    0:01 /usr/bin/X vt7 -background none -nolisten tcp -auth /var/run/kdm/A:0-5fT3Ca

6993 pts/0    S+     0:00 grep --color=auto X

# export COMPUTE=:0

# clinfo |grep GPU

# export DISPLAY=:0

# clinfo |grep GPU

#

I get results only if:

# systemctl stop prefdm.service

# X &

... x startup texts...

# clinfo |grep GPU

  Device Type:                               CL_DEVICE_TYPE_GPU

#

As you can see, your instructions do not really work that well. The box is running Fedora 16 with the 12.1 drivers:

amd-driver-installer-12-1-x86.x86_64.run
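The manual checks in the transcript above could be wrapped in a small helper like the following. It is a sketch that assumes AMD's clinfo, which prints CL_DEVICE_TYPE_GPU for each GPU as in the output above; the function names count_gpus and probe are made up for this example.

```shell
#!/bin/sh
# count_gpus: read clinfo output on stdin and print the number of lines
# that report a GPU device. grep -c prints 0 but exits non-zero when
# nothing matches, hence the "|| true".
count_gpus() {
    grep -c 'CL_DEVICE_TYPE_GPU' || true
}

# probe: run clinfo with the given COMPUTE value and report the count.
probe() {
    if command -v clinfo >/dev/null 2>&1; then
        n=$(COMPUTE="$1" clinfo 2>/dev/null | count_gpus)
    else
        n="n/a (clinfo not installed)"
    fi
    echo "COMPUTE=$1 -> GPU devices: $n"
}

# Is an X server process running at all? (Process names X and Xorg
# are checked; other setups may use a different name.)
if pgrep -x X >/dev/null 2>&1 || pgrep -x Xorg >/dev/null 2>&1; then
    echo "X server process found"
else
    echo "no X server process found"
fi

probe ":0"
```

Running it once from the console session and once over ssh makes it easy to compare what the runtime sees in each case.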

I think you should speed up separating X and OpenCL. I imagine it would also be useful to be able to set a kernel run timeout for recovering from a crash; I am tired of rebooting the machine whenever there is a problem in the code.

0 Likes

Please run: lspci

It lists all the PCI devices recognized by the Linux kernel.

Sincerely,

Tzachi

0 Likes

$ lspci

00:00.0 Host bridge: Intel Corporation 5400 Chipset Memory Controller Hub (rev 20)

00:01.0 PCI bridge: Intel Corporation 5400 Chipset PCI Express Port 1 (rev 20)

00:05.0 PCI bridge: Intel Corporation 5400 Chipset PCI Express Port 5 (rev 20)

00:09.0 PCI bridge: Intel Corporation 5400 Chipset PCI Express Port 9 (rev 20)

00:10.0 Host bridge: Intel Corporation 5400 Chipset FSB Registers (rev 20)

00:10.1 Host bridge: Intel Corporation 5400 Chipset FSB Registers (rev 20)

00:10.2 Host bridge: Intel Corporation 5400 Chipset FSB Registers (rev 20)

00:10.3 Host bridge: Intel Corporation 5400 Chipset FSB Registers (rev 20)

00:10.4 Host bridge: Intel Corporation 5400 Chipset FSB Registers (rev 20)

00:11.0 Host bridge: Intel Corporation 5400 Chipset CE/SF Registers (rev 20)

00:15.0 Host bridge: Intel Corporation 5400 Chipset FBD Registers (rev 20)

00:15.1 Host bridge: Intel Corporation 5400 Chipset FBD Registers (rev 20)

00:16.0 Host bridge: Intel Corporation 5400 Chipset FBD Registers (rev 20)

00:16.1 Host bridge: Intel Corporation 5400 Chipset FBD Registers (rev 20)

00:1b.0 Audio device: Intel Corporation 631xESB/632xESB High Definition Audio Controller (rev 09)

00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)

00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)

00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)

00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)

00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #4 (rev 09)

00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)

00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)

00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)

00:1f.2 RAID bus controller: Intel Corporation 631xESB/632xESB SATA RAID Controller (rev 09)

01:09.0 FireWire (IEEE 1394): Agere Systems FW322/323 (rev 61)

0e:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755 Gigabit Ethernet PCI Express (rev 02)

10:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)

10:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)

1e:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)

1e:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)

20:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

80:00.0 VGA compatible controller: ATI Technologies Inc Radeon HD 5870 (Cypress)

80:00.1 Audio device: ATI Technologies Inc Cypress HDMI Audio [Radeon HD 5800 Series]

I was logged in to X at the console, but the output above is from a remote SSH session (the output is the same in either case).

I am able to use the card for running programs only when I start X from the SSH session while not logged in at the console, as can be seen from my previous post. I am not sure if this is the intended behaviour, but it seems strange from my point of view.

0 Likes

I have been piggybacking on this and other related threads started by yurtesen, because I share some of the same problems. I have both a 6670 and a 6870 card in an 8120 machine running OpenSuSE 11.4 with the AMD APP 2.6 SDK installed. The 6670 is the default graphics card, and I wish to use the 6870 as a GPU. Initially neither fglrxinfo nor clinfo recognized the 6870, but the instructions by tzachi.cohen to create the xorg.conf file using "aticonfig --initial --adapter=all" have succeeded in making it visible to fglrxinfo. However, it remains hidden from clinfo and, I assume, therefore from OpenCL.

What is the purpose of the COMPUTE environment variable, and how should it be set in my situation? Where is this documented, for that matter? I note that there are no man pages for aticonfig, fglrxinfo, or clinfo on my system, although they appear to have been installed from the fglrx package. Should there have been man pages for these commands? If not, where can I find them?

I have the impression from another thread that accessing the 6870 requires defining an entirely new X session (as is created when another user logs in) and linking it to that card before OpenCL will successfully communicate with it. Is this true? The current xorg.conf file sets the 6670 as display 0, screen 0, and the 6870 as display 0, screen 1. Shouldn't the 6870 have a separate display to itself?

0 Likes

By default, clinfo (or any other CL application) will see only the device associated with the current X display. The current X display is determined by the 'DISPLAY' environment variable. If DISPLAY is set to ':0', all devices will be exposed. The COMPUTE environment variable is just an override for the DISPLAY environment variable.

In a console session, an environment variable can be set with:

export COMPUTE=:0 

This is documented in the SDK release notes:

http://developer.amd.com/sdks/AMDAPPSDK/assets/AMD_APP_SDK_Release_Notes_Developer.pdf
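The override rule described above can be written down as a tiny shell function. This only models the description in this post and is not the runtime's actual logic; the function name effective_display is made up, and treating an empty COMPUTE the same as an unset one is an assumption.

```shell
#!/bin/sh
# effective_display: print the display value that would be used under
# the rule "COMPUTE, when set, overrides DISPLAY".
#   $1 = value of COMPUTE (may be empty)
#   $2 = value of DISPLAY (may be empty)
# Assumption: an empty COMPUTE is treated like an unset one.
effective_display() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Show what the current environment resolves to.
echo "effective display: $(effective_display "${COMPUTE:-}" "${DISPLAY:-}")"
```

For example, with COMPUTE=:0 exported, the result is :0 regardless of what DISPLAY contains.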

As for the second issue, how to expose the GPU in a remote SSH session, bear two things in mind:

1.) The remote session has to be configured to use the local X server, i.e. do not pass '-X' on the ssh command line.

2.) The machine has to be configured to allow remote sessions to access the local X server. How to set this up varies from one Linux distribution to another.
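A rough way to check the first point from inside the SSH session is sketched below. It assumes that a forwarded session shows up as a host-qualified DISPLAY such as localhost:10.0 (what ssh -X typically sets), while the machine's own X server is addressed as :0; the function name is_local_display is made up for this example.

```shell
#!/bin/sh
# is_local_display: succeed if the value names a display on this
# machine's own X server (":0", ":0.1", ...); fail for empty or
# host-qualified values such as "localhost:10.0".
is_local_display() {
    case "$1" in
        :[0-9]*) return 0 ;;
        *)       return 1 ;;
    esac
}

if ! is_local_display "${DISPLAY:-}"; then
    echo "DISPLAY='${DISPLAY:-}' does not point at the local X server; using :0"
    export DISPLAY=:0
fi

# COMPUTE overrides DISPLAY for the OpenCL runtime, so set it as well.
export COMPUTE=:0
echo "DISPLAY=$DISPLAY COMPUTE=$COMPUTE"
```

This covers only the environment side. The second point, permission to connect to the local X server, still has to be arranged on the machine itself.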

0 Likes

My thanks to Tzachi.Cohen for the instructions on the use of COMPUTE. clinfo now sees the CPU and both GPUs, and I have been able to run the MatrixMultiplication sample on all three devices. I see from another thread, just started, that I can restrict GPU computation to only one of the GPUs by setting COMPUTE=:0.0 or COMPUTE=:0.1. Thanks again.

Laurence Keefe

0 Likes

You can get usage information for aticonfig using

aticonfig --help

Cheers,

Dominic

0 Likes