2 Replies Latest reply on Jun 25, 2017 8:56 AM by hyln9

    Optimized half-precision GEMM assembly kernels on AMD Fiji for deep learning

    hyln9

      These are optimized half-precision GEMM assembly kernels for AMD Fiji. They are written in native GCN assembly and achieve much better performance than clBLAS.
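
      For readers who want to check the output of a half-precision GEMM kernel, here is a minimal reference sketch in plain Python. It is not taken from the kernels above and says nothing about their interface; the names `to_half` and `gemm_half_reference` are hypothetical, and accumulating in double precision is an assumption (a GPU kernel may accumulate differently, so compare with a tolerance).

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The struct format code "e" is a 16-bit binary16 float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def gemm_half_reference(alpha, a, b, beta, c):
    """Reference C = alpha * A * B + beta * C for half-precision data.

    a is M x K, b is K x N, c is M x N, each given as a list of rows.
    Inputs are rounded to half precision, products are accumulated in
    double precision (an assumption), and each result is rounded back
    to half precision.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += to_half(a[i][p]) * to_half(b[p][j])
            row.append(to_half(alpha * acc + beta * to_half(c[i][j])))
        out.append(row)
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm_half_reference(1.0, a, b, 1.0, c))
```

      A triple loop like this is far too slow for real matrix sizes; it is only meant as a ground truth for small test cases.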

      Link: GitHub - hyln9/GCNGEMM: Optimized half precision gemm assembly kernels on AMD Fiji