While testing my assembler (CLRadeonExtender), I found a bug in the legacy AMD OpenCL 1.2 compiler. The compiler emits strange instructions that should not be in the code. They appear when the compiler builds a kernel with tens of arguments and the code is compiled for GCN 1.1 GPUs (Bonaire, Hawaii, Spectre, ...). I reproduced this bug with Windows driver 16.12.2 (OpenCL version 2236.10) and with my old Linux driver 15.12 (1912.5). The generated code can behave in an unpredictable manner.
I found this bug while I was trying to assemble the compiled code. The assembler reported errors like this:
<stdin>:1305:23: Error: More than one SGPR to read in instruction
The generated code contains an illegal form of the v_cndmask_b32 instruction:
v_cndmask_b32 v0, s32, v1, vcc
In this case, the instruction reads two scalar registers: S32 and the 64-bit VCC. This code, of course, works incorrectly on the GPU and gives unexpected results.
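To make the rule behind that assembler error concrete, here is a rough Python sketch of such a check. It is not the actual CLRadeonExtender code. The function names, the operand pattern, and the assumption that the first operand is always the destination are simplifications made up for this illustration.

```python
import re

# Assumption: operands spelled like this are scalar sources.
# VCC is a 64-bit pair, but it is counted as one scalar source here.
SCALAR_RE = re.compile(r"^(s\d+|s\[\d+:\d+\]|vcc|vcc_lo|vcc_hi|exec|m0)$")


def scalar_sources(instruction):
    """Return the distinct scalar source operands of one disassembly line."""
    _mnemonic, _, operand_text = instruction.strip().partition(" ")
    operands = [op.strip().lower() for op in operand_text.split(",") if op.strip()]
    sources = operands[1:]  # assumption: the first operand is the destination
    seen = []
    for op in sources:
        if SCALAR_RE.match(op) and op not in seen:
            seen.append(op)
    return seen


def reads_more_than_one_sgpr(instruction):
    """True if the line would trigger a 'More than one SGPR to read' style error."""
    return len(scalar_sources(instruction)) > 1


if __name__ == "__main__":
    line = "v_cndmask_b32 v0, s32, v1, vcc"
    print(line, "->", scalar_sources(line), reads_more_than_one_sgpr(line))
```

For the instruction quoted above, the sketch finds both s32 and vcc as scalar sources, which is the situation the assembler rejects. Replacing s32 with a vector register leaves only vcc, and the check passes.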
The OpenCL code and the disassembled code are attached.
AFAIK, GCN 1.1 is not supported by the legacy compiler. Only SI cards are supported by the legacy compiler.
BTW, I do not understand the comment: “I found this bug, while I was trying to assemble compiled code”.
The legacy compiler can be invoked with the '-legacy' option, even for new Ellesmere devices.
I simply compiled the attached OpenCL source code, disassembled the output binary with my disassembler, and finally tried to assemble that disassembler output with my assembler.
I found similar weird code in binaries compiled by the current compiler (OpenCL 2.0) for another of my OpenCL programs. The bug exists in both the old compiler and the current (OpenCL 2.0) compiler. Later, I will try to provide failing code for the new compiler.
Thanks. I'll check with the compiler team and get back to you shortly.
Our compiler team doesn't see a single "cndmask" instruction in the code compiled from the above source, with either the latest HSAIL compiler or the latest legacy AMDIL compiler. If there is a bug in the compiler, they need a test case to reproduce it, not custom disassembler output showing an incorrect instruction. That is, they need a full dump collected with the "-save-temps-all" option to see what our own compiler disassembles from this.
Also, as I've come to know, "-legacy" is not supported, even if one can use it. The AMDIL legacy compiler is not supported on anything but Tahiti.