Hi, I have started investigating the GCN instruction set, and I found that S_SUB_I32 does not work correctly. The instruction returns the correct subtraction result, but it also returns an incorrect value in the SCC register (this register should hold the signed overflow, like the overflow flag (OF) in the x86 flags register). I wrote a simple test (it currently runs under Linux, but is portable) that shows simple cases in which S_SUB_I32 works incorrectly. I have checked the results only on my Radeon HD 7850 (Pitcairn) graphics card.
The program checks the following cases (this is the output of that program):
sub_i32 #0: 10213 - 1256: value=8957, scc=0
Expected value=8957, 64bit: 8957, expected scc=0
SUCCESS
sub_i32 #1: 13234 - 42221: value=-28987, scc=1
Expected value=-28987, 64bit: -28987, expected scc=0
FAILED!! for sub_i32 scc #1: expected=0, result=1
sub_i32 #2: 6321 - -5343: value=11664, scc=0
Expected value=11664, 64bit: 11664, expected scc=0
SUCCESS
sub_i32 #3: 2114067115 - -63823599: value=-2117076582, scc=0
Expected value=-2117076582, 64bit: 2177890714, expected scc=1
FAILED!! for sub_i32 scc #3: expected=1, result=0
sub_i32 #4: -5343 - 6321: value=-11664, scc=0
Expected value=-11664, 64bit: -11664, expected scc=0
SUCCESS
sub_i32 #5: -63823599 - 2114067115: value=2117076582, scc=0
Expected value=2117076582, 64bit: -2177890714, expected scc=1
FAILED!! for sub_i32 scc #5: expected=1, result=0
sub_i32 #6: -10213 - -1256: value=-8957, scc=0
Expected value=-8957, 64bit: -8957, expected scc=0
SUCCESS
sub_i32 #7: -13234 - -42221: value=28987, scc=1
Expected value=28987, 64bit: 28987, expected scc=0
FAILED!! for sub_i32 scc #7: expected=0, result=1
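For reference, the "expected" columns above can be reproduced on the host with a sketch like the one below. This is not the code from the attachment; the function names (wide_sub, expected_scc, expected_value, print_expected, self_check) are made up for illustration. It assumes that SCC should be 1 exactly when the exact difference does not fit in a signed 32-bit integer, which is how the expected values in the output are stated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Exact difference, computed in 64 bits so that it cannot overflow. */
static int64_t wide_sub(int32_t a, int32_t b)
{
    return (int64_t)a - (int64_t)b;
}

/* Expected SCC (assumption: signed overflow flag): 1 when the exact
 * difference does not fit in a signed 32-bit integer. */
static int expected_scc(int32_t a, int32_t b)
{
    int64_t wide = wide_sub(a, b);
    return wide < INT32_MIN || wide > INT32_MAX;
}

/* Expected 32-bit result: the difference wrapped modulo 2^32.
 * The unsigned subtraction is well defined; the conversion back to
 * int32_t is implementation-defined in C, but wraps on the usual
 * two's-complement compilers. */
static int32_t expected_value(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Print one case in a format similar to the "Expected" lines above. */
static void print_expected(int32_t a, int32_t b)
{
    printf("Expected value=%d, 64bit: %lld, expected scc=%d\n",
           (int)expected_value(a, b), (long long)wide_sub(a, b),
           expected_scc(a, b));
}

/* Sanity check against case #3 from the output above. */
static void self_check(void)
{
    assert(expected_scc(2114067115, -63823599) == 1);
    assert(expected_value(2114067115, -63823599) == -2117076582);
}
```

Calling self_check() and then print_expected(a, b) for each pair of operands from a small main() gives the reference lines to compare against what the device reports.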
The attached archive contains the program (including the necessary OpenCL program binaries) and its source code. The source code for the OpenCL program is in the bugcheck.gcn file, which contains code for all devices (I tested only Pitcairn).
Can anyone reproduce this bug on other GCN device types?
EDIT: For CapeVerde users: please rename all '*-CapeVerde-*' files to '*-Capeverde-*'. I have uploaded a fixed package (radeonbug0_1.zip).
EDIT2: Added executables for Windows (Win7).
EDIT3: Fixed the Windows executable (wrong device detection).
EDIT4: After the tests, the program now waits for user action (for Windows users).