Linux kernel oops with SDK v2.0
I have encountered a Linux issue that occurs with:
- streams support V2.01.
- both FirePro V8750 and FireStream 9270 cards
- user-space environment RHEL 5.4
- running 2.6.31.12 and 2.6.26.8 kernels
- the ATI fglrx driver is ATI Catalyst 10-2
(which is kernel module fglrx 8.70.3 [Feb 2 2010])
( This issue was posted as bugzilla #1781 in the Unofficial ATI
Linux Driver Bugzilla database. )
The Linux kernel will oops when trying to wake up a task that has
exited the system. The fglrx KCL_WAIT_Wakeup() routine attempts to wake up
a task that is still in a KCL_WAIT_ObjectHandle/wait_queue_head_t list
even though that task has long since exited the system.
This issue occurs when running a Streams SDK sample test such as NBody, and
then using <ctrl>c to terminate the test. (Terminating other tests such
as MonteCarloAsian with a <ctrl>c also causes the same issue to occur.)
At this point, the fglrx driver fails to call KCL_WAIT_Remove() to remove
this task from the wait queue when this terminated task exits the kernel.
When another Streams test is executed, this same KCL_WAIT_ObjectHandle
structure is used to wake up all waiting tasks and we oops in the
kernel attempting to wake up a non-existent task.
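To make the failure sequence concrete, here is a minimal user-space sketch of it in plain C. It is not fglrx or kernel code: every name in it (model_task, model_wait_queue, model_task_exit and so on) is hypothetical, and an "alive" flag stands in for a task that has really been freed. It only models the pattern described above, where a waiter exits without being removed from the queue and a later wake-all pass still finds its entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Hypothetical stand-in for a kernel task. `alive` models whether the
 * task still exists; in the real kernel the memory would be freed. */
struct model_task {
    int  pid;
    bool alive;
    bool woken;
};

/* Hypothetical stand-in for a wait queue: a fixed array of task pointers. */
struct model_wait_queue {
    struct model_task *waiters[MAX_WAITERS];
    size_t count;
};

/* Register a task as a waiter on the queue. */
static void model_wait_add(struct model_wait_queue *q, struct model_task *t)
{
    assert(q->count < MAX_WAITERS);
    q->waiters[q->count++] = t;
}

/* Unregister a task; this models the removal step that is being skipped. */
static void model_wait_remove(struct model_wait_queue *q, struct model_task *t)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->waiters[i] == t) {
            q->waiters[i] = q->waiters[--q->count];
            return;
        }
    }
}

/* Task exit. `cleanup` selects whether the exit path removes the entry. */
static void model_task_exit(struct model_wait_queue *q,
                            struct model_task *t, bool cleanup)
{
    if (cleanup)
        model_wait_remove(q, t);
    t->alive = false;
}

/* Wake every waiter and return how many stale entries were encountered.
 * Each stale entry here would be a dangling pointer in the real kernel. */
static size_t model_wake_all(struct model_wait_queue *q)
{
    size_t stale = 0;
    for (size_t i = 0; i < q->count; i++) {
        if (q->waiters[i]->alive)
            q->waiters[i]->woken = true;
        else
            stale++;
    }
    return stale;
}
```

Calling model_task_exit() with cleanup set to false and then model_wake_all() reports one stale entry per exited waiter, which corresponds to the point where the real driver oopses. With cleanup set to true the queue is empty again before the next wake-up.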
Adding code to detect tasks that no longer exist in the
KCL_WAIT_ObjectHandle wait queue, and then removing these KCL_WAIT_Handle
entries from the wait queue before making the wake_up_interruptible()
call gets around this oops issue, but later on when exiting the X session,
the X server ends up looping forever in the fglrx driver (during exit
close processing), apparently waiting for these terminated task(s)
to terminate.
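The shape of that workaround can be sketched in user-space C as follows. This is an illustration only, not the actual patch: model_entry, model_queue, model_prune_stale and model_prune_and_wake are hypothetical names, and the owner_alive flag is an assumption standing in for whatever check the real code uses to decide that a task is gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical wait-queue entry. `owner_alive` models whether the task
 * that queued the entry still exists. */
struct model_entry {
    int  pid;
    bool owner_alive;
};

/* Hypothetical wait queue holding its entries in a fixed array. */
struct model_queue {
    struct model_entry entries[MAX_ENTRIES];
    size_t count;
};

/* Drop entries whose owner is gone, keeping the rest in order.
 * Returns the number of entries removed. */
static size_t model_prune_stale(struct model_queue *q)
{
    assert(q->count <= MAX_ENTRIES);
    size_t kept = 0;
    for (size_t i = 0; i < q->count; i++) {
        if (q->entries[i].owner_alive)
            q->entries[kept++] = q->entries[i];
    }
    size_t removed = q->count - kept;
    q->count = kept;
    return removed;
}

/* Prune first, then "wake" what is left. Returns the number of tasks
 * woken; after pruning, every remaining entry belongs to a live task. */
static size_t model_prune_and_wake(struct model_queue *q)
{
    model_prune_stale(q);
    return q->count;
}
```

The sketch covers only the wake-up path. It does not model whatever other driver state still refers to the terminated tasks, which is presumably where the X server exit hang comes from.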