mosix0

OpenCL library steals signals

Discussion created by mosix0 on Jan 6, 2010
Latest reply on Feb 18, 2010 by mosix0

I have a program that uses OpenCL, but also does other things.  At times the program sleeps waiting for external signals (using "sigpause()"), and I discovered that it sometimes fails to wake up when an expected signal arrives.

I searched and found the reason: the OpenCL library creates full threads, and the clone flags it uses include CLONE_SIGHAND and CLONE_THREAD. As a result the library threads share signals with the main program, so at random an OpenCL library thread picks up a signal that was intended to wake the main program. That thread runs the signal handler and returns to whatever it was doing before, while the main program is never awakened and remains stuck.

I believe that although the library shares memory and perhaps also some file-descriptors with the main program, it has no need to share signals as well.

At the very least, if CLONE_SIGHAND cannot be avoided, the library should block, in its own threads, all signals that it does not use. The Linux kernel would then deliver those signals to the main program rather than to a library thread.

(Note that if for any reason the library continues to share signals with the main program and only blocks them, there is still a race between the time a library thread is created and the time it blocks the unused signals. To prevent this race, the library should block all signals before calling clone(), then unblock them in the parent/main thread, but not in the new library thread(s).)

Hope this is not too complicated, but it is really a bummer to have the OpenCL library, which is supposed to be a "black box", affect unrelated aspects of the calling program.
