2 Replies Latest reply on Dec 1, 2015 9:07 AM by nibal

    Cards supporting OpenCL 2.0 execute kernels out of order even though the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE flag is not specified.

    neko

      Hello,

      We found a strange bug where kernels are executed in the wrong order even when the command queue was not created with the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE flag. This affects only cards that support OpenCL 2.0; see the attached minimal example for a demonstration. The expected output is:

       

      Advanced Micro Devices, Inc., OpenCL 2.0 ...

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

      OpenCL version 1.2, OK

       

      while we are getting:

      Advanced Micro Devices, Inc., OpenCL 2.0 ...

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

      OpenCL version 1.2, FAIL (value[2] is 222 but it should be 333)

       

      which clearly shows that the 1st kernel is executed after the 2nd one.

       

      NOTE:

      - the executed kernels share the same program and name and differ only in parameter values

      - changing the build option at line 63 of the attached example bug5.cpp from "-cl-std=CL1.2" to "-cl-std=CL2.0" seems to fix the problem; however, we use SPIR for obfuscation, so this is not really an option, since SPIR 2.0 is only at the provisional stage and still does not work properly

      - moving the local array zero[4] from line 85 to outside the for loop fixes all kernel executions except the first 2

      - removing const from line 1 of kernel.cl (__global const int * const buffer => __global int * const buffer) also seems to help; sadly, this works only for this simple example

      - possibly another bug: if the -cl-std= option is followed by junk other than whitespace (e.g. -cl-std=abc), it is evaluated as CL2.0 instead of reporting an error.
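
      As a stopgap for the last point, the option string could be checked on the host before it is passed to clBuildProgram. This is only a minimal sketch: it assumes the accepted values are the ones listed in the OpenCL 2.0 specification (CL1.1, CL1.2, CL2.0), and findBadClStd is a hypothetical helper name, not part of any OpenCL API or of the attached example.

```cpp
#include <sstream>
#include <string>

// Assumption: values accepted for -cl-std= per the OpenCL 2.0 specification.
static const char* const kKnownClStd[] = {"CL1.1", "CL1.2", "CL2.0"};

// Hypothetical helper: scans a whitespace-separated build option string and
// returns the first "-cl-std=..." token whose value is not in kKnownClStd.
// Returns an empty string if every -cl-std= token is recognised (or if the
// option is absent).
std::string findBadClStd(const std::string& options) {
    const std::string prefix = "-cl-std=";
    std::istringstream in(options);
    std::string token;
    while (in >> token) {
        // Skip tokens that are not a -cl-std= option.
        if (token.compare(0, prefix.size(), prefix) != 0) continue;
        const std::string value = token.substr(prefix.size());
        bool known = false;
        for (const char* v : kKnownClStd) {
            if (value == v) known = true;
        }
        if (!known) return token;
    }
    return std::string();
}
```

      If findBadClStd returns a non-empty string, the host code can report that token and skip the build, rather than relying on the compiler to reject it.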

       

      Confirmation and/or workaround suggestions, or even better a fix, would be greatly appreciated.