AMD-only, GPU-only buffer overrun in OpenCL driver
Using Microsoft's Application Verifier, I've detected two problems that occur only when using AMD devices in an OpenCL application (they do not occur when using NVidia or Intel OpenCL devices). Furthermore, one of the issues only occurs when using a GPU device, as opposed to a CPU device.
Here is info on the configuration I'm testing with (this is a machine with multiple GPUs: one from NVidia driving display #2, and the other from AMD (Radeon HD 6900) driving display #1 and being used as the compute device).
This occurs only when using an AMD device, and only with a device of type CL_DEVICE_TYPE_GPU. If I create a CL_DEVICE_TYPE_CPU device, or use another vendor's device, it does not occur.
When Application Verifier is enabled with Heap testing, it puts each allocation in its own page with guards before and after the allocation, and aligns the allocation to the end of the page. In the attached sample application, when the kernel finishes executing, Application Verifier breaks into the debugger with an access violation, usually before clFinish() returns and occasionally right after. It outputs the following information to the output window:
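For anyone unfamiliar with why this heap mode catches overruns right away, here is a minimal sketch of the layout described above. It only models the arithmetic and is not Application Verifier's actual implementation: the struct and function names are hypothetical, the guard before the allocation is left out, and any alignment padding the real tool adds is ignored. Because the block ends exactly where the guard page begins, the first byte written past the end faults.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an end-of-page heap layout: the block is pushed to
   the end of its data pages and the next page is an inaccessible guard. */
typedef struct {
    size_t data_pages;   /* accessible pages that hold the allocation */
    size_t user_offset;  /* offset of the user pointer from the region start */
    size_t guard_offset; /* offset of the trailing guard page */
} guarded_layout;

/* Plan where a block of `size` bytes would sit for a given page size. */
guarded_layout plan_guarded_alloc(size_t size, size_t page_size)
{
    guarded_layout l;
    assert(page_size > 0);
    /* Round up to whole pages; a zero-byte request still gets one page. */
    l.data_pages = (size == 0) ? 1 : (size + page_size - 1) / page_size;
    l.guard_offset = l.data_pages * page_size;
    /* The end of the block coincides with the start of the guard page. */
    l.user_offset = l.guard_offset - size;
    return l;
}

/* Returns 1 if accessing `len` bytes starting at byte `index` of the block
   would touch the trailing guard page, 0 otherwise. */
int access_hits_guard(guarded_layout l, size_t index, size_t len)
{
    return l.user_offset + index + len > l.guard_offset;
}
```

With a 4096-byte page, a 100-byte block would start at offset 3996, so access_hits_guard(plan_guarded_alloc(100, 4096), 100, 1) returns 1 for a write just one byte past the end. That is the kind of overrun a normal heap would silently absorb.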
This also does not occur when using an OpenCL driver from another vendor.
This issue is identified when TLS tests are enabled for the application in Application Verifier. Upon launch of the application, during most OpenCL calls, the following exception is thrown in the debugger by the verifier:
VERIFIER STOP 00000301: pid 0x7C8: Invalid TLS index used for current stack trace.