
gbilotta
Adept III

Adventures in headless OpenCL support in Linux

Hello all, this is more of a bug report than a question.

I've noticed that since 14.4 (and maybe some driver versions before it), headless OpenCL finally works under some specific circumstances.

Basically:

  1. if X is not running but DISPLAY is set (e.g. because the user logged into the machine via ssh -X), clinfo will not show the GPU as a device, because it tries to reach the GPU through the display named by DISPLAY
  2. if X is running and COMPUTE is set to the display where X is running, but the user is not the one logged in to that X session, clinfo will simply segfault somewhere in the AMD ICD stack
  3. in my experience, the only way to reliably access the GPU for compute, especially on remote machines, is to shut off everything X-related (display managers, sessions, and even ssh X forwarding). In this case it seems to work for any user, or at least for those who have access to the /dev/ati/card* devices.
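
To see which of the three cases above a given login falls into, something like the following could be run before clinfo. It is only a rough sketch: the helper names count_accessible and report_headless_env are made up for illustration, and treating read/write access to /dev/ati/card* as the relevant permission is an assumption on my part, not something the driver documents.

```c
#include <glob.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: count paths matching `pattern` that the
 * current user can both read and write. Returns 0 if nothing matches. */
static int count_accessible(const char *pattern) {
    glob_t g;
    int n = 0;
    if (glob(pattern, 0, NULL, &g) == 0) {
        for (size_t i = 0; i < g.gl_pathc; i++) {
            if (access(g.gl_pathv[i], R_OK | W_OK) == 0)
                n++;
        }
        globfree(&g);
    }
    return n;
}

/* Hypothetical helper: print the environment details that the
 * three cases above depend on. */
static void report_headless_env(FILE *out) {
    const char *display = getenv("DISPLAY");
    const char *compute = getenv("COMPUTE");
    fprintf(out, "DISPLAY = %s\n", display ? display : "(unset)");
    fprintf(out, "COMPUTE = %s\n", compute ? compute : "(unset)");
    /* Assumption: rw access to these nodes is what the ICD needs. */
    fprintf(out, "accessible /dev/ati/card* nodes: %d\n",
            count_accessible("/dev/ati/card*"));
}
```

Calling report_headless_env(stdout) from a small main is enough; a set DISPLAY pointing at an ssh forward, or zero accessible device nodes, would match cases 1 and 3 respectively.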

Conclusions: the situation is much improved from one or two years ago, but a few things should still be fixed. In particular, an application segfaulting because it lacks device access is an absolute no-no, and ideally, when DISPLAY is set to a local forward, other access methods should be tried.

Keep up the good work.

sudarshan
Staff

Re: Adventures in headless OpenCL support in Linux

Hi,

Thanks for the suggestion. I will pass this on to the relevant team.

ajk
Journeyman III

Re: Adventures in headless OpenCL support in Linux

Hello. The problem is still there. For example, the following code segfaults when run over ssh with X forwarding:

#include <dlfcn.h>

int main() {
  /* Loading the library is the only step; no OpenCL function is called. */
  void *lib = dlopen("libOpenCL.so", RTLD_LAZY);
  (void)lib;
  return 0;
}

=> segfault at ecf ip 00007f781fef038b sp 00007fffcca11cb0 error 4 in libamdocl64.so[7f781f9fe000+223f000]

The workaround for now is to unset the DISPLAY variable, create the OpenCL context, and then set it back.
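
Roughly, that workaround could look like the sketch below. The names with_display_unset, display_free_fn and record_display_visible are hypothetical, made up for this example. The callback stands in for whatever needs to run without DISPLAY, such as the dlopen call or the context creation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: any work that must run without DISPLAY set. */
typedef void (*display_free_fn)(void *arg);

/* Run fn(arg) with DISPLAY removed from the environment, then restore it. */
static void with_display_unset(display_free_fn fn, void *arg) {
    assert(fn != NULL);
    const char *cur = getenv("DISPLAY");
    char *saved = NULL;
    if (cur != NULL) {
        /* Copy the value: unsetenv may invalidate the pointer from getenv. */
        saved = malloc(strlen(cur) + 1);
        if (saved != NULL)
            strcpy(saved, cur);
    }
    unsetenv("DISPLAY");
    fn(arg);
    if (saved != NULL) {
        /* Put the original value back for the rest of the program. */
        setenv("DISPLAY", saved, 1);
        free(saved);
    }
}

/* Example callback: records whether DISPLAY is visible (1) or not (0). */
static void record_display_visible(void *arg) {
    *(int *)arg = getenv("DISPLAY") != NULL;
}
```

In a real program the callback would do the dlopen or create the context instead of just probing the variable. Note that the environment is process-wide, so this is only safe if no other thread reads or changes DISPLAY at the same time.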

Please resolve this bug and, if possible, remove the dependency on X altogether (aticonfig doesn't show temperature and other adapter info without X).
