Q1. How do I restart an OpenCL-compatible device provided by AMD?
a. GPU?
- warm start instead of switching the power off and on?
b. CPU?
- restart just the OpenCL runtime and OpenCL tasks, i.e. not other apps?
Q2. How do I reset a context?
a. just by setting the reference count to zero?
b. by killing the resources (tasks/processes) that use the context?
What do you want to achieve?
Hi Lee,
1. to kill OpenCL processes/tasks/threads that are not responding
Achievement: the system and other apps continue normally
2. to release resources (RAM, queues, stack...) reserved by a dead, unresponsive, or faulty OpenCL system
Achievement: the necessary resources become usable again by the OS and other apps
3. to continue by restarting just the OpenCL system, not the OS and other apps
Achievement: no side effects on other apps and no decrease in their quality of service
4. to avoid a power-off/shutdown
Achievement: power stays on and the system stays online
5. where there are several OpenCL devices, to restart only the dead device
Achievement: the other devices keep offering their services
I would imagine that you wouldn't be able to do most of that programmatically. The device is owned by the operating system, after all. I don't know if the driver has any calls to reset it but I doubt there would ever be a wish to put such a thing into the OpenCL standard.
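A partial workaround for goals 1 to 3 that does not need a device reset is to run the OpenCL work in a separate worker process and let a watchdog in the parent kill it if it stops responding. The sketch below is only an illustration under stated assumptions: it assumes a POSIX system, the names run_with_watchdog, healthy_worker and hung_worker are hypothetical, and the workers are stand-ins that make no OpenCL calls. Killing the worker lets the OS reclaim that process's host-side resources; it does not reset the GPU, and whether the driver recovers the device afterwards is outside the program's control.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical worker type: stands in for code that would create the
   OpenCL context, enqueue kernels and wait for the results. */
typedef int (*worker_fn)(void);

enum { WORKER_TIMED_OUT = -1, WORKER_ERROR = -2 };

/* Runs `work` in a child process and polls it for up to `timeout_ms`.
   Returns the child's exit code (0..255), WORKER_TIMED_OUT if the child
   had to be killed, or WORKER_ERROR on fork/wait failure or if the child
   terminated abnormally. */
int run_with_watchdog(worker_fn work, int timeout_ms)
{
    assert(work != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return WORKER_ERROR;
    if (pid == 0)
        _exit(work());              /* child: do the work, then exit */

    const int poll_ms = 10;
    const struct timespec poll_interval = { 0, poll_ms * 1000L * 1000L };
    int waited_ms = 0;
    int status = 0;

    for (;;) {
        pid_t done = waitpid(pid, &status, WNOHANG);
        if (done == pid)
            break;                  /* child finished on its own */
        if (done < 0)
            return WORKER_ERROR;
        if (waited_ms >= timeout_ms) {
            kill(pid, SIGKILL);     /* not responding: kill the worker */
            waitpid(pid, &status, 0); /* reap it so no zombie remains */
            return WORKER_TIMED_OUT;
        }
        nanosleep(&poll_interval, NULL);
        waited_ms += poll_ms;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : WORKER_ERROR;
}

/* Stand-in for a worker that completes normally. */
int healthy_worker(void) { return 0; }

/* Stand-in for a worker that hangs, e.g. blocked on a call that never returns. */
int hung_worker(void) { for (;;) pause(); }
```

A caller would pass its own worker function and a timeout, for example run_with_watchdog(hung_worker, 100), and start a fresh worker when WORKER_TIMED_OUT comes back. The parent and all other applications keep running while the worker is killed and replaced.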
Thanks for your answer, Lee!
Referring to the topic titled "Detecting device resets":
http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=147247&highlight_key=y&keyword1=Detec...
where TDR = Timeout Detection and Recovery of the GPU
Otterz:
* was using Windows 7, with TDR enabled
* tried registering a callback when creating the context, but that callback did not seem to be called!
Raistmer: "There is no correct way for program to be informed of driver failure/restart. And this is really bad, especially when app should run unattended".
My feedback:
~ the callback is defined in the OpenCL specification and should be called when an error occurs in the context
~ I agree with Raistmer (and so do our clients): apps should be able to run unattended
My question: in the latest AMD SDK version, is the user-defined callback function called when the GPU is reset?