For the last few days I was chasing a bug that caused all kinds of strange and random driver crashes.
I finally tracked it down to me giving invalid arguments to clEnqueueNDRangeKernel(), namely the global and local work size.
In some cases I accidentally passed a global work size which was not evenly divisible by the local work size. Of course, that is an error on my end, but according to the standard the function should return an error if something like that is attempted. As far as I can see, it never actually does that in the not-evenly-divisible case.
The documentation says:
local_work_size is specified and number of work-items specified by
global_work_size is not evenly divisable by size of work-group given by
With all the different limitations on the global and local work size, I feel like this must be one of the easiest errors to check for. So I wonder why the OpenCL implementation happily accepts it and returns CL_SUCCESS. Can someone clarify whether this is actually a bug in the OpenCL implementation?
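Whatever the runtime ends up returning, the condition is cheap to check on the host before enqueueing. The following is only a rough sketch of such a guard: the helper names work_sizes_are_divisible and round_up_to_multiple are made up for illustration and are not part of the OpenCL API. It assumes the same arrays that would be passed as global_work_size and local_work_size to clEnqueueNDRangeKernel().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function): returns 1 if every
 * global size is a non-zero multiple of the matching local size,
 * and 0 otherwise. */
int work_sizes_are_divisible(unsigned work_dim,
                             const size_t *global_work_size,
                             const size_t *local_work_size)
{
    for (unsigned i = 0; i < work_dim; ++i) {
        /* A zero size in either array is treated as invalid here. */
        if (global_work_size[i] == 0 || local_work_size[i] == 0)
            return 0;
        if (global_work_size[i] % local_work_size[i] != 0)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: rounds a global size up to the next multiple
 * of the local size, leaving it unchanged if it already is one. */
size_t round_up_to_multiple(size_t global, size_t local)
{
    assert(local > 0);
    size_t remainder = global % local;
    return remainder == 0 ? global : global + (local - remainder);
}
```

Calling the first helper right before clEnqueueNDRangeKernel() would turn the silent misbehavior into an explicit failure in my own code. If the global size is padded with the second helper instead, the kernel presumably needs its own bounds check so that the extra work-items do not touch memory outside the real data.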
My system specs:
Windows 8.1 x64
Catalyst 15.7 (OpenCL 2.0 AMD-APP (1800.5))
Radeon HD 7870
Intel Core i5-2500K