JohnRandall

Bug Report for Stream SDK v2.0

Discussion created by JohnRandall on Mar 15, 2010
Latest reply on Mar 16, 2010 by nou
linux kernel oops with SDK v2.0

I have encountered a linux issue that occurs with:
 - streams support V2.01.
 - both FirePro V8750 and FireStream 9270 cards
 - user-space environment RHEL 5.4
 - running 2.6.31.12 and 2.6.26.8 kernels
 - the ati fglrx driver is ati catalyst 10-2
   (which is kernel module fglrx 8.70.3 [Feb  2 2010])

( This issue was posted as bugzilla #1781 in the Unofficial ATI
Linux Driver Bugzilla database. )

The linux kernel will oops when trying to wake up a task that has
exited the system.  The fglrx KCL_WAIT_Wakeup() routine attempts to wake up
a task that is still in a KCL_WAIT_ObjectHandle/wait_queue_head_t list
even though that task has long since exited the system.

This issue occurs when running a Streams sdk sample test such as NBody, and
then using <ctrl>c to terminate the test.  (Terminating other tests such
as MonteCarloAsian with a <ctrl>c also causes the same issue to occur.)

At this point, the fglrx driver fails to call KCL_WAIT_Remove() to remove
this task from the wait queue when this terminated task exits the kernel.

When another Streams test is executed, this same KCL_WAIT_ObjectHandle
structure is used to wake up all waiting tasks and we oops in the
kernel attempting to wake up a non-existent task.

Adding code to detect tasks that no longer exist in the
KCL_WAIT_ObjectHandle wait queue, and then removing their KCL_WAIT_Handle
entries from the wait queue before making the wake_up_interruptible()
call, gets around this oops issue.  But later on, when exiting the X session,
the X server ends up looping forever in the fglrx driver (during exit
close processing), apparently waiting for these terminated task(s)
to terminate.
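
For anyone trying to picture the workaround, here is a rough user-space
sketch of the "prune stale waiters before waking" idea.  It is NOT the
fglrx or kernel code: struct waiter, struct wait_list, the wait_list_*
functions and the demo_* task table are all hypothetical names made up for
illustration, and the liveness check is just an array lookup standing in
for whatever test the kernel side would really need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the fglrx or kernel data structures. */
struct waiter {
    int pid;              /* id of the simulated task */
    bool woken;           /* set once the waiter has been woken */
    struct waiter *next;  /* next entry in the wait list */
};

struct wait_list {
    struct waiter *head;
};

/* Assumed liveness check, supplied by the caller. */
typedef bool (*task_alive_fn)(int pid);

/* Simulated task table: demo_alive[pid] becomes false when a task exits. */
static bool demo_alive[4] = { true, true, true, true };

static bool demo_task_alive(int pid)
{
    return pid >= 0 && pid < 4 && demo_alive[pid];
}

/* Put a waiter at the front of the list. */
static void wait_list_add(struct wait_list *q, struct waiter *w)
{
    assert(q != NULL && w != NULL);
    w->woken = false;
    w->next = q->head;
    q->head = w;
}

/* Unlink every entry whose task no longer exists; return how many. */
static int wait_list_prune(struct wait_list *q, task_alive_fn alive)
{
    int removed = 0;
    struct waiter **link = &q->head;

    while (*link != NULL) {
        struct waiter *w = *link;
        if (!alive(w->pid)) {
            *link = w->next;   /* drop the stale entry */
            w->next = NULL;
            removed++;
        } else {
            link = &w->next;
        }
    }
    return removed;
}

/* Prune stale entries first, then wake what is left; return count woken. */
static int wait_list_wake_all(struct wait_list *q, task_alive_fn alive)
{
    int woken = 0;

    wait_list_prune(q, alive);
    for (struct waiter *w = q->head; w != NULL; w = w->next) {
        w->woken = true;
        woken++;
    }
    return woken;
}
```

To try it, add a few waiters, clear one task's demo_alive[] slot to
simulate a <ctrl>c exit, and call wait_list_wake_all(): the stale entry is
unlinked and never touched.  As noted above, this only hides the oops.
The entry really should be removed when the task exits, which is what the
missing KCL_WAIT_Remove() call would do.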
