Does the current BLIS library (AOCL 2.2) have any performance limitations when using more than 16 threads?
Does AMD distribute an optimized version of HPCG in binary form?
Found the HPCG authors' website: https://www.hpcg-benchmark.org/links/index.html
From the above link: GitHub - ROCmSoftwarePlatform/rocHPCG: HPCG benchmark based on ROCm platform
As for the BLAS question, see AMD's developer website: https://developer.amd.com/amd-aocl/blas-library/
I would post your question on the AMD Developer's Forum and see if anyone can answer it. Start from here: Newcomers Start Here
NOTE: You are most likely already aware of all of the above links and information, so it is probably best just to post your question on the AMD Developer's Forum from the link above.
Thanks! I was wondering if AMD has produced an optimized binary for their Zen2 processors. We do not have the ROCm platform available.
Ask AMD Support directly from here: https://www.amd.com/en/support/contact-email-form
You could also post your question on the AMD Developer's Forum for first-time posters from here: Newcomers Start Here.
Maybe someone there might have an answer.