hyln9
Journeyman III

Optimized half precision gemm assembly kernels on AMD Fiji for deep learning

This is a set of optimized half-precision GEMM assembly kernels for AMD Fiji. They are written in native GCN assembly and achieve much better performance than clBLAS.

Link: GitHub - hyln9/GCNGEMM: Optimized half precision gemm assembly kernels on AMD Fiji

2 Replies

I would love to test this. Could you say more about your environment? How have you been running it?

Hi! I'm using Ubuntu 16.04, and all the other requirements are listed in the README.
