What do you want to achieve?
1. To kill OpenCL processes/tasks/threads that are not responding.
Achievement: the system and other apps continue normally.
2. To release resources (RAM, queues, stack...) reserved by a dead, unresponsive, or faulty OpenCL system.
Achievement: the necessary resources become usable by the OS and other apps.
3. To continue by restarting just the OpenCL system, not the OS and other apps.
Achievement: no side effects and no decrease in quality of service for other apps.
4. To avoid power off/shutdown.
Achievement: the power stays on and the system stays online.
5. With several OpenCL devices, to restart just the dead device.
Achievement: the other devices keep offering their services.
I would imagine that you wouldn't be able to do most of that programmatically. The device is owned by the operating system, after all. I don't know if the driver has any calls to reset it but I doubt there would ever be a wish to put such a thing into the OpenCL standard.
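What an application can still attempt inside its own process is to release the OpenCL objects it owns and build them again when a call starts failing. Here is a minimal sketch of that loop in C. The names `DeviceSession`, `looks_like_device_loss`, `run_with_recovery` and the `fake_*` helpers are made up for illustration. The choice of status codes treated as "device lost" is an assumption, since what a driver returns after a reset is not fixed by the standard, and whether a fresh context can actually be created afterwards depends on the driver. The status values are copied from `CL/cl.h` so the sketch builds without the SDK.

```c
#include <stddef.h>

/* Status values as defined in <CL/cl.h>; repeated here so this builds without the SDK. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                 0
#define CL_DEVICE_NOT_AVAILABLE   -2
#define CL_OUT_OF_RESOURCES       -5
#define CL_INVALID_CONTEXT       -34
#define CL_INVALID_COMMAND_QUEUE -36
#endif

/* Hypothetical hooks supplied by the application; each returns an OpenCL status code. */
typedef int (*step_fn)(void *app);

typedef struct {
    step_fn create;   /* build context, queue, buffers, kernels */
    step_fn run;      /* enqueue the work and wait for it, e.g. ending in clFinish */
    step_fn destroy;  /* release every OpenCL object the application owns */
} DeviceSession;

/* ASSUMPTION: these codes are treated as "the device went away".
 * Drivers differ, so adjust the list to what is observed in practice. */
static int looks_like_device_loss(int status)
{
    return status == CL_DEVICE_NOT_AVAILABLE ||
           status == CL_OUT_OF_RESOURCES ||
           status == CL_INVALID_CONTEXT ||
           status == CL_INVALID_COMMAND_QUEUE;
}

/* Run the work; on a suspected device loss, tear down and rebuild,
 * at most max_restarts times. Returns the last status seen. */
static int run_with_recovery(const DeviceSession *s, void *app, int max_restarts)
{
    int status = s->create(app);
    if (status != CL_SUCCESS)
        return status;

    for (int restarts = 0; ; ++restarts) {
        status = s->run(app);
        if (status == CL_SUCCESS)
            break;
        if (!looks_like_device_loss(status) || restarts >= max_restarts)
            break;
        s->destroy(app);
        status = s->create(app);
        if (status != CL_SUCCESS)
            return status;   /* assumes a failed create leaves nothing to release */
    }
    s->destroy(app);
    return status;
}

/* Simulated application, only to exercise the loop without a GPU:
 * the first fail_runs calls to run report CL_OUT_OF_RESOURCES. */
typedef struct { int creates, runs, destroys, fail_runs; } FakeApp;

static int fake_create(void *p)  { ((FakeApp *)p)->creates++;  return CL_SUCCESS; }
static int fake_destroy(void *p) { ((FakeApp *)p)->destroys++; return CL_SUCCESS; }
static int fake_run(void *p)
{
    FakeApp *a = (FakeApp *)p;
    a->runs++;
    return a->runs <= a->fail_runs ? CL_OUT_OF_RESOURCES : CL_SUCCESS;
}
```

This only covers the application's own objects. It does not reset the device and it does nothing for other processes, so it addresses goal 3 at best, and only if the driver recovers on its own.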
Thanks for your answer, Lee!
Referring to the topic titled "Detecting device resets",
where TDR = Timeout Detection and Recovery of the GPU:
* was using Windows 7, with TDR enabled
* tried registering a callback when creating the context, but that callback did not seem to be called!
Raistmer: "There is no correct way for program to be informed of driver failure/restart. And this is really bad, especially when app should run unattended".
~ the callback is defined in the OpenCL spec and should be called when errors occur in the context
~ I agree with Raistmer (and so do our clients): apps should be able to run unattended
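For reference, the callback in question is the `pfn_notify` argument of `clCreateContext`. Here is a rough sketch of how it might be set up in C. `ContextErrorState` and `on_context_error` are names made up for illustration, and `CL_CALLBACK` is defined empty only so the snippet builds without the SDK headers. Nothing here guarantees that a driver invokes the callback on a TDR reset.

```c
#include <stdio.h>

/* CL_CALLBACK normally comes from <CL/cl.h>; defined empty here so this builds without the SDK. */
#ifndef CL_CALLBACK
#define CL_CALLBACK
#endif

/* Hypothetical application state handed to the callback through user_data. */
typedef struct {
    volatile int error_seen;   /* set to 1 once the runtime reports an error */
    char last_message[256];    /* copy of the most recent errinfo string */
} ContextErrorState;

/* Same parameter list as the pfn_notify argument of clCreateContext.
 * The spec allows the runtime to call it asynchronously, so a real
 * application should protect this state with atomics or a lock. */
static void CL_CALLBACK on_context_error(const char *errinfo,
                                         const void *private_info,
                                         size_t cb,
                                         void *user_data)
{
    ContextErrorState *state = (ContextErrorState *)user_data;
    (void)private_info;   /* implementation-specific binary data, unused here */
    (void)cb;             /* size of private_info in bytes, unused here */
    if (state == NULL)
        return;
    snprintf(state->last_message, sizeof state->last_message, "%s",
             errinfo ? errinfo : "(no message)");
    state->error_seen = 1;
}

/* Registration, with <CL/cl.h> included and a valid device list:
 *
 *   ContextErrorState state = {0};
 *   cl_int err;
 *   cl_context ctx = clCreateContext(NULL, num_devices, devices,
 *                                    on_context_error, &state, &err);
 *
 * The host loop can then poll state.error_seen between enqueues.
 */
```

Polling a flag keeps the callback itself trivial, which matters because the runtime may call it from another thread.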
My question: is the user-defined callback function called in the latest AMD SDK version when the GPU is reset?