bubu
Adept II

Aborting long-time kernel, how?

Imagine I execute a very time-consuming kernel using clEnqueueNDRangeKernel().

How can I stop it if the user presses a button?

 

I'm currently trying to abort it like this:

cl_event l_evt;

clEnqueueNDRangeKernel( .... &l_evt );

void OnCancelButtonClick()
{
    if ( l_evt != NULL )
    {
        clReleaseEvent( l_evt );
        clFinish( queue );  /* queue: the command queue the kernel was enqueued on */
    }
}

The question is: will that clReleaseEvent(l_evt) + clFinish() really abort the kernel execution in a reasonable time?

PS: Before you suggest it... I CANNOT make the kernel simpler or split it into multiple kernel calls. Assume one BIG, time-consuming kernel, please. That's the whole point of the thread.

Thanks.

8 Replies
MicahVillmow
Staff

Only the OS has the ability to interrupt the GPU. The GPU is not a pre-emptible device, so a user app cannot interrupt an execution. The OS can do this by resetting the device.
bubu
Adept II


Originally posted by: MicahVillmow Only the OS has the ability to interrupt the GPU. The GPU is not a pre-emptible device, so a user app cannot interrupt an execution. The OS can do this by resetting the device.


Wouldn't it be possible to add some kind of GPU task manager (like the Windows one) to Catalyst's CCC, then? That would be fantastic for killing unresponsive GPGPU programs or changing priorities.

And, for this specific case... what would happen if I release the event but the kernel has not finished? A crash? Just out of curiosity.

And one thing that worries me... What if I manually disable the watchdog via the registry's TdrLevel? A hacker could then completely hang your computer with an infinite loop inside a kernel...

 

laobrasuca
Journeyman III


This reminds me of another point: what about getting rid of the system freeze when an application crashes on the GPU? Man, a system reboot is a pain! Sometimes the system manages to restart the driver, saving me from rebooting, but a lot of the time it doesn't. However, when using the CPU as the device, the system never freezes (the application crashes, but there's no freeze whatsoever). Maybe Catalyst could auto-restart when things go wrong?

MicahVillmow
Staff

laobrasuca,
Most likely your whole system didn't actually hang. What happens is that you hang the GPU and it no longer responds to the reset command. Because your GPU runs the GUI, your system GUI hangs and it seems like your system is hung. You should still be able to SSH into your machine at this point.

bubu,
Your kernel would still finish and then the event would get released.

The GPU is not a CPU, so you cannot treat it like one. There is no pre-emption, interruption or graceful error recovery for a lot of bad programs. If you run an infinite loop on the GPU, you most likely have to reboot your system.
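
Since the kernel runs to completion regardless, one host-side option is to treat "cancel" as "discard the result" rather than "stop the GPU". The following is only a minimal sketch under that assumption. The names wait_for_kernel, kernel_done_fn and countdown_done are hypothetical and not part of OpenCL. In a real host program the callback could wrap clGetEventInfo() with CL_EVENT_COMMAND_EXECUTION_STATUS and compare the status against CL_COMPLETE, and the event would be released only after the loop returns.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the kernel has completed.
   Real code could query the event status with clGetEventInfo(). */
typedef bool (*kernel_done_fn)(void *ctx);

typedef enum { RUN_COMPLETED, RUN_DISCARDED } run_result;

/* Polls until the kernel finishes. A cancel request cannot stop the
   kernel; it only decides whether the results are kept or discarded. */
run_result wait_for_kernel(kernel_done_fn done, void *ctx,
                           atomic_bool *cancel_requested,
                           unsigned *polls_out)
{
    assert(done != NULL && cancel_requested != NULL);
    bool cancelled = false;
    unsigned polls = 0;
    for (;;) {
        ++polls;
        if (atomic_load(cancel_requested))
            cancelled = true;      /* remember the request, keep waiting */
        if (done(ctx))
            break;                 /* kernel finished on its own */
        /* real code would sleep briefly or pump UI messages here */
    }
    if (polls_out != NULL)
        *polls_out = polls;
    return cancelled ? RUN_DISCARDED : RUN_COMPLETED;
}

/* Stand-in for the device, for illustration only: reports "not done"
   until the counter pointed to by ctx reaches zero. */
bool countdown_done(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        --*remaining;
        return false;
    }
    return true;
}
```

The cancel button handler would then only set the atomic flag. The UI stays responsive, but the GPU remains busy until the kernel ends.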
laobrasuca
Journeyman III


Micah, you're certainly right that the system itself doesn't freeze, although you're quite tied up if you can't restart the GUI, the driver or whatever.

I can understand that the GPU does not support features like pre-emption, interruption or graceful error recovery, but with the advent of GPGPU the paradigm changes. When you make it possible to program the GPU, you've got to provide a "Ctrl+C" possibility too for when things go wrong. Unless it's technically impossible or it somehow kills GPU performance (?), it would be great if it were supported, just like printf (cl_amd_printf) or advanced debugging features (in the 2.4 SDK maybe).

MicahVillmow
Staff

laobrasuca,
It is a hardware issue, nothing in software will fix it.
jeff_golds
Staff

If you use Windows 7, then TDR (Timeout Detection and Recovery) can kick in and reset the GPU in many cases.  Of course, this same behavior can kill a perfectly working app too if your kernel takes too long!

The TDR info link has details about the TDR settings in case you find you need to modify them. For example, if your kernel takes 5s to run, then you can increase the TDR timeout accordingly.
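
As a rough illustration of "increase the TDR timeout accordingly", the sketch below computes a timeout in whole seconds from a measured kernel time plus a safety margin. The function name and the margin parameter are hypothetical. It assumes the timeout is the TdrDelay value (a REG_DWORD in seconds, default 2) under HKLM\System\CurrentControlSet\Control\GraphicsDrivers, and it only computes a number; it does not touch the registry.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Windows default for TdrDelay, in seconds. */
#define TDR_DEFAULT_DELAY_S 2u

/* Returns a suggested TdrDelay in whole seconds for a kernel that takes
   kernel_ms milliseconds, padded by margin_percent and rounded up.
   Never suggests less than the assumed default. */
uint32_t suggested_tdr_delay_s(uint32_t kernel_ms, uint32_t margin_percent)
{
    assert(margin_percent <= 1000u);   /* keep the arithmetic well in range */
    uint64_t padded_ms = (uint64_t)kernel_ms * (100u + margin_percent) / 100u;
    uint64_t seconds = (padded_ms + 999u) / 1000u;   /* round up to seconds */
    if (seconds < TDR_DEFAULT_DELAY_S)
        seconds = TDR_DEFAULT_DELAY_S;
    return (uint32_t)seconds;
}
```

For the 5s kernel mentioned above and a 50% margin, suggested_tdr_delay_s(5000, 50) gives 8. A longer timeout also means a genuinely hung kernel freezes the display for that much longer before the reset.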

Jeff

 

 

laobrasuca
Journeyman III


@micah: yes, for sure. I only hope the HD 7000 series has been designed with those features in mind.

@jeff: thanks, I'll have a look at it
