Optimized half precision gemm assembly kernels on AMD Fiji for deep learning

Discussion created by hyln9 on Jun 16, 2017
Latest reply on Jun 25, 2017 by hyln9

These are optimized half-precision GEMM assembly kernels for AMD Fiji GPUs, written in native GCN assembly to achieve much better performance than clBLAS.

Link: GitHub - hyln9/GCNGEMM: Optimized half precision gemm assembly kernels on AMD Fiji
