Archives Discussions

Bdot · 10-11-2012

Hi,

The GPUs, as well as AMD IL, have an instruction for mul24_hi (the unsigned variants are listed here as an example; signed variants exist as well):

Evergreen/Cayman: MULHI_UINT24

GCN: V_MUL_HI_U32_U24

IL: UMUL24_high ( IL_OP_U_MUL24_HIGH )

However, OpenCL still does not expose this. If it is so difficult to get this into the OpenCL standard, could you not add an AMD-specific extension for this important performance feature?
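To make the requested semantics concrete, here is a minimal reference sketch in plain C. It assumes the instruction returns bits 47:32 of the 48-bit product of the low 24 bits of each operand, which is my reading of the GCN description of V_MUL_HI_U32_U24. The name `mul24_hi_u` is hypothetical and not an existing OpenCL built-in.

```c
#include <stdint.h>

/* Hypothetical reference for an unsigned mul24_hi.
 * Assumption: the result is bits 47:32 of the 48-bit product of the
 * low 24 bits of a and b; the upper 8 bits of each input are ignored. */
static inline uint32_t mul24_hi_u(uint32_t a, uint32_t b)
{
    uint64_t a24 = a & 0xFFFFFFu;   /* keep only the low 24 bits */
    uint64_t b24 = b & 0xFFFFFFu;
    uint64_t product = a24 * b24;   /* fits in 48 bits */
    return (uint32_t)(product >> 32);
}
```

In OpenCL C the same value should be obtainable with the standard built-in `mul_hi(a & 0xFFFFFF, b & 0xFFFFFF)`, but that goes through the full 32-bit high multiply, and whether the compiler lowers it to the 24-bit instruction is not guaranteed.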

I found this old thread about it, but the issue still seems to be unresolved:

http://devgurus.amd.com/thread/149862

Is there a feasible way to tweak the IL or ISA code so that it uses this instruction? Has anyone done that?

mul24_hi in OpenCL