1 Reply Latest reply on Jun 20, 2017 12:34 PM by volumetricsteve

    Optimized half precision gemm assembly kernels on AMD Fiji for deep learning

    hyln9

      These are optimized half-precision GEMM assembly kernels for AMD Fiji. They are written in native GCN assembly to achieve much better performance than clBLAS.
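
      For anyone unfamiliar with the operation, here is a minimal NumPy sketch of what a half-precision GEMM computes: alpha * A * B + beta * C with float16 inputs and output. The function name `hgemm_reference` and the choice to accumulate in float32 are my own assumptions for illustration. The post does not say how the GCNGEMM kernels accumulate, and this is not their API.

      ```python
      import numpy as np

      def hgemm_reference(alpha, a, b, beta, c):
          """Reference GEMM: alpha * (a @ b) + beta * c, float16 in and out."""
          # Assumption: accumulate in float32 to limit rounding error,
          # then round the final result back to float16.
          acc = a.astype(np.float32) @ b.astype(np.float32)
          out = alpha * acc + beta * c.astype(np.float32)
          return out.astype(np.float16)

      a = np.array([[1, 2], [3, 4]], dtype=np.float16)
      b = np.array([[5, 6], [7, 8]], dtype=np.float16)
      c = np.ones((2, 2), dtype=np.float16)

      result = hgemm_reference(1.0, a, b, 0.5, c)
      print(result)
      ```

      A slow reference like this is mainly useful for checking the output of a fast kernel on small matrices, within a float16 tolerance.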

      Link: GitHub - hyln9/GCNGEMM: Optimized half precision gemm assembly kernels on AMD Fiji