
Are AMD GPUs getting an equivalent for NVidia's LOP3.LUT?

Question asked by optimiz3 on Feb 22, 2016
Latest reply on Jun 23, 2016 by optimiz3

With its Maxwell architecture, NVidia introduced two instructions that are hugely important for cryptographic and integer compute:

 

- LOP3.LUT, which lets applications execute FPGA-style lookup tables: an 8-bit truth table applied bitwise across three 32-bit inputs.  This lets you execute any 3-input logic operation, such as SHA-256's Ch (choose/bitselect), SHA-256's Maj (majority), SHA-3's Chi, bitslice S-boxes, etc., in a single op.  AMD does have the bitselect operation, which can be used to compose the general LUT operation, but the general case then needs more than one instruction, so this is a vastly inferior option.

- IADD3, which lets applications add three 32-bit integers in a single instruction.  Simple, but widely applicable, because functions like SHA-256 have long 32-bit adder chains.
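
To make the comparison concrete, here is a rough software model in plain C of what the two instructions compute. It is only a sketch, not vendor code: the helper names (lop3_emulated, bfi, add3) are made up for illustration, the truth-table constants follow the usual convention of evaluating the function on 0xF0, 0xCC and 0xAA, and bfi is assumed to behave like a bitselect of the form (mask & x) | (~mask & y).

```c
#include <stdint.h>

/* Model of a 3-input bitwise LUT: for every bit position, the three
 * input bits form an index 0..7 into the 8-bit truth table `lut`. */
static uint32_t lop3_emulated(uint32_t a, uint32_t b, uint32_t c, uint8_t lut)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; ++i) {
        uint32_t idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     |  ((c >> i) & 1u);
        r |= (uint32_t)((lut >> idx) & 1u) << i;
    }
    return r;
}

/* Assumed bitselect semantics: take x where mask is 1, y where it is 0. */
static uint32_t bfi(uint32_t mask, uint32_t x, uint32_t y)
{
    return (mask & x) | (~mask & y);
}

/* Truth tables obtained by evaluating each function on 0xF0, 0xCC, 0xAA. */
enum {
    LUT_CH  = 0xCA, /* (x & y) ^ (~x & z)          */
    LUT_MAJ = 0xE8, /* (x & y) ^ (x & z) ^ (y & z) */
    LUT_CHI = 0xD2  /* x ^ (~y & z)                */
};

/* Ch is a single bitselect; Maj needs an extra XOR to build the mask. */
static uint32_t ch_bfi(uint32_t x, uint32_t y, uint32_t z)  { return bfi(x, y, z); }
static uint32_t maj_bfi(uint32_t x, uint32_t y, uint32_t z) { return bfi(x ^ y, z, y); }

/* Three-input add, wrapping modulo 2^32 like a 32-bit integer adder. */
static uint32_t add3(uint32_t a, uint32_t b, uint32_t c)
{
    return a + b + c;
}
```

With a LUT instruction, Ch, Maj and Chi each cost one op and differ only in the 8-bit constant. With bitselect alone, Ch is one op, but Maj already needs two, and a SHA-256 round sum written as a + b + c takes two plain adds where add3 would take one.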

 

Are we getting anything like this with Arctic Islands?  We'd like to get a head start on optimizing assembly if possible.
