0 Replies Latest reply on Oct 11, 2012 5:36 AM by Bdot

    mul24_hi in OpenCL

    Bdot

      Hi,

      AMD GPUs, as well as AMD IL, have an instruction for mul24_hi (the unsigned variants are listed below as examples; signed variants exist as well):

      Evergreen/Cayman: MULHI_UINT24

      GCN: V_MUL_HI_U32_U24

      IL: UMUL24_high ( IL_OP_U_MUL24_HIGH )
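
      For reference, here is a minimal C sketch of what I understand these instructions to compute, assuming the semantics (a[23:0] * b[23:0]) >> 32, i.e. the upper 16 bits of the 48-bit product. The helper names mul24_lo_u32 and mul24_hi_u32 are hypothetical and only illustrate the arithmetic; they are not OpenCL built-ins or AMD APIs.

```c
#include <stdint.h>

/* Hypothetical reference helpers; names are illustrative only. */

/* Low 32 bits of the 24x24-bit product (what OpenCL's mul24 returns
 * when both operands fit in 24 bits). */
static uint32_t mul24_lo_u32(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)(a & 0xFFFFFFu) * (uint64_t)(b & 0xFFFFFFu);
    return (uint32_t)product;
}

/* Assumed mul24_hi semantics: only the low 24 bits of each operand are
 * used, and the 48-bit product is shifted right by 32, which leaves its
 * upper 16 bits. */
static uint32_t mul24_hi_u32(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)(a & 0xFFFFFFu) * (uint64_t)(b & 0xFFFFFFu);
    return (uint32_t)(product >> 32);
}
```

      With both helpers, the full 48-bit product is ((uint64_t)hi << 32) | lo. In OpenCL C today, mul_hi(a & 0xFFFFFF, b & 0xFFFFFF) yields the same value as the high part, but it is specified as a full 32-bit multiply, which is exactly the cost a 24-bit instruction would avoid.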

       

      However, this is still missing from OpenCL. If getting it into the OpenCL standard is so difficult, could you add an AMD-specific extension that exposes this important performance feature?

      I found this old thread about it, but it still seems to be unresolved:

      http://devgurus.amd.com/thread/149862

       

      Is there a feasible way to tweak the IL or ISA code so that it uses this instruction? Has anyone done that?