I finally have a minimal reproducing example of a bug in the OpenCL compiler for Tahiti in the Adrenalin Win10 x64 drivers (tested on two workstations with the 19.12.2, 20.1.1 and 20.5.1 drivers, with both -O0 and -O5). The kernel is attached (it is part of my implementation of an improved iterative Gauss-Seidel WDK method for finding complex polynomial roots). As is, it gives a wrong result for poly and its parts poly1 and poly2:
l=0, poly=0.0768369+0.00147968i, prod=13+6.62408e-17i, tau=-0.999259-0.0385005i
Changing 1 to 2 in the loop upper limit miraculously gives the right result:
l=0, poly=-1.11022e-16+0i, prod=13+6.62408e-17i, tau=-0.999259-0.0385005i
Other small changes in the code switch between right and wrong results in a seemingly random way. For example, commenting out the third line of output in the printf here always gives right results, but when the following line from the initial code
z[l] -= cdiv(poly, prod);
is added before the printf, the result becomes consistently wrong, even without the printf.
On my old laptop with the 15.200.1065.0 drivers this bug does not appear.
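
For anyone who wants to check the GPU output against a CPU, here is a minimal sketch in plain C of the kind of update that the line z[l] -= cdiv(poly, prod) performs. It is not the attached kernel. It assumes that WDK means Weierstrass-Durand-Kerner, that the polynomial is monic, and that prod is the plain product of differences z[l] - z[j]; the names eval_monic and wdk_sweep are made up for this illustration.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper (not from the attached kernel): evaluate the monic
   polynomial x^n + c[n-1]*x^(n-1) + ... + c[0] at x with Horner's scheme. */
static double complex eval_monic(const double complex *c, size_t n,
                                 double complex x)
{
    double complex p = 1.0;              /* implied leading coefficient */
    for (size_t k = n; k-- > 0; )
        p = p * x + c[k];
    return p;
}

/* Hypothetical helper: one Gauss-Seidel sweep of the WDK iteration.
   Each z[l] is overwritten in place, so the roots updated later in the
   sweep already use the new values of the earlier ones. */
static void wdk_sweep(double complex *z, const double complex *c, size_t n)
{
    for (size_t l = 0; l < n; ++l) {
        double complex poly = eval_monic(c, n, z[l]);
        double complex prod = 1.0;       /* assumed: no extra scaling factor */
        for (size_t j = 0; j < n; ++j)
            if (j != l)
                prod *= z[l] - z[j];
        z[l] -= poly / prod;             /* same shape as z[l] -= cdiv(poly, prod) */
    }
}
```

Calling wdk_sweep repeatedly on an array of distinct starting guesses and printing poly for each l gives reference values on the CPU. When z[l] is already a root, poly should come out at rounding-error level, like the -1.11022e-16 in the right result above.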