Does anyone have any hints or tips on how to partition work to avoid triggering the watchdog timer on Windows?
I know it can be disabled / adjusted with various registry settings but I'm obviously reluctant to do that on customer PCs.
I have no control over which GPU is in the customer's PC, so I can't use a compile-time-fixed global work size per kernel invocation.
As an added complication, the amount of work performed by each kernel invocation depends on the data supplied by the user: each invocation of the kernel contains a for loop over the user's data. [removing this loop is not an option]
My thoughts so far are to do some kind of one-time calibration of the customer's GPU and then try to extrapolate from this a global work size per kernel invocation. I can't say I find this solution very satisfying, though, and it still leaves me open to inconveniently distributed user data pushing me over the watchdog timer limit.
I'm hoping someone has got a better idea?
It seems to me you have no choice but to benchmark the customer's PC in terms of work-items per second and then adjust the global work size to produce NDRanges that result in no more than about 2 seconds per kernel invocation.
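For what it's worth, here is a rough sketch of the sizing arithmetic in plain C, not a tested implementation. All function names and the halve/double policy are my own assumptions; the calibration timing would come from whatever you measure around a small test launch. The second helper re-tunes the chunk after every real launch, which goes some way towards coping with unevenly distributed user data, although it cannot guarantee that any single launch stays under the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: work-items per launch that should fit within
   budget_seconds, extrapolated from one timed calibration launch. */
size_t chunk_for_budget(size_t calib_items, double calib_seconds,
                        double budget_seconds, size_t local_size)
{
    assert(local_size > 0 && calib_items > 0 && calib_seconds > 0.0);
    double items_per_second = (double)calib_items / calib_seconds;
    size_t chunk = (size_t)(items_per_second * budget_seconds);
    chunk -= chunk % local_size;  /* keep a multiple of the work-group size */
    return chunk < local_size ? local_size : chunk;
}

/* Hypothetical helper: re-tune after each real launch. Halve the chunk if
   the launch overran the budget, double it if it used under a quarter of
   the budget, otherwise leave it alone. The thresholds are arbitrary. */
size_t adapt_chunk(size_t chunk, double measured_seconds,
                   double budget_seconds, size_t local_size)
{
    assert(local_size > 0);
    if (measured_seconds > budget_seconds)
        chunk /= 2;
    else if (measured_seconds < budget_seconds / 4.0)
        chunk *= 2;
    chunk -= chunk % local_size;
    return chunk < local_size ? local_size : chunk;
}

/* Number of launches needed to cover total_items at the given chunk size. */
size_t launches_needed(size_t total_items, size_t chunk)
{
    assert(chunk > 0);
    return (total_items + chunk - 1) / chunk;
}
```

In the host loop, each launch would then pass the running offset as the `global_work_offset` argument of `clEnqueueNDRangeKernel` (OpenCL 1.1 or later) and the chunk as the global work size, wait for it to finish, time it, and feed the result to `adapt_chunk`. The last chunk may need padding up to a multiple of the local size, with a bounds check in the kernel. I would also pick a budget comfortably below the 2 seconds mentioned above, so that a slow outlier has some headroom.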