Hi, I have started investigating the GCN instruction set and found that S_SUB_I32 behaves incorrectly. The instruction returns the correct subtraction result, but it sets the SCC register incorrectly (SCC should hold the signed overflow flag, like the overflow flag (OF) in the x86 flags register). I wrote a simple test (it currently runs under Linux, but is portable) that shows simple cases in which S_SUB_I32 works incorrectly. I checked the results only on my Radeon HD 7850 (Pitcairn) graphics card.
The program checks the following cases (this is its output):
sub_i32 #0: 10213 - 1256: value=8957, scc=0
Expected value=8957, 64bit: 8957, expected scc=0
sub_i32 #1: 13234 - 42221: value=-28987, scc=1
Expected value=-28987, 64bit: -28987, expected scc=0
FAILED!! for sub_i32 scc #1: expected=0, result=1
sub_i32 #2: 6321 - -5343: value=11664, scc=0
Expected value=11664, 64bit: 11664, expected scc=0
sub_i32 #3: 2114067115 - -63823599: value=-2117076582, scc=0
Expected value=-2117076582, 64bit: 2177890714, expected scc=1
FAILED!! for sub_i32 scc #3: expected=1, result=0
sub_i32 #4: -5343 - 6321: value=-11664, scc=0
Expected value=-11664, 64bit: -11664, expected scc=0
sub_i32 #5: -63823599 - 2114067115: value=2117076582, scc=0
Expected value=2117076582, 64bit: -2177890714, expected scc=1
FAILED!! for sub_i32 scc #5: expected=1, result=0
sub_i32 #6: -10213 - -1256: value=-8957, scc=0
Expected value=-8957, 64bit: -8957, expected scc=0
sub_i32 #7: -13234 - -42221: value=28987, scc=1
Expected value=28987, 64bit: 28987, expected scc=0
FAILED!! for sub_i32 scc #7: expected=0, result=1
The attachment is an archive with the program (including the necessary OpenCL program binaries) and its source code. The source code for the OpenCL program is in the bugcheck.gcn file, which contains code for all devices (I tested only Pitcairn).
Can anyone reproduce this bug (on other GCN device types)?
EDIT: For CapeVerde users: please rename all '*-CapeVerde-*' files to '*-Capeverde-*'. I uploaded a fixed package (radeonbug0_1.zip).
EDIT2: Added executables for Windows (Win7).
EDIT3: Fixed the Windows executable (wrong device detection).
EDIT4: The program now waits for user action after the tests (for Windows users).
Is anybody interested in this issue, or has anyone reproduced/verified it on a Radeon GPU? I would appreciate any effort to verify this bug on other devices (GCN 1.1/1.2). I would like to know whether the bug is present in other Radeon GPUs.
I only have GCN 1.0 too, so I can't test.
Looks like it reports the same unsigned overflow as *_u32 and doesn't check the 0x7fffffff <-> 0x80000000 transition.
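To make the distinction between the two flags concrete, here is a small sketch of how they are usually defined for a 32-bit subtraction. It only illustrates the definitions and is not a model of what the hardware actually computes; the function names `signed_sub_overflow` and `unsigned_sub_borrow` are made up for this example.

```c
#include <stdint.h>

/* Signed overflow: the result crosses the 0x7fffffff <-> 0x80000000
   boundary. This happens exactly when the operands have different sign
   bits and the sign of the result differs from the sign of a. */
static int signed_sub_overflow(uint32_t a, uint32_t b)
{
    uint32_t d = a - b; /* wraps modulo 2^32 */
    return (int)(((a ^ b) & (a ^ d)) >> 31);
}

/* Unsigned borrow: the result crosses the 0x00000000 <-> 0xffffffff
   boundary, i.e. b is larger than a when both are read as unsigned. */
static int unsigned_sub_borrow(uint32_t a, uint32_t b)
{
    return a < b;
}
```

For example, 0 - 1 gives borrow 1 but signed overflow 0, while 0x80000000 - 1 gives borrow 0 but signed overflow 1, so the two flags cannot be substituted for each other.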