Can someone explain, in as much detail as possible and from a hardware-architecture perspective, how INT32/INT64 computation is processed in an MI300 XCD?
Unlike NVIDIA's Ampere architecture, which has dedicated INT32 cores, MI300 appears to prioritize ML-friendly TF32/BF16 and INT8 operations while deprioritizing classical INT32/INT64 operations, at least according to the white paper.
I want to understand how efficient my kernels, which consist only of INT32/INT64 computations (a non-ML workload), would be on MI300.
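For context, here is a minimal sketch of the kind of kernel I mean: purely integer arithmetic (32-bit operands widened into a 64-bit result), no floating point and no matrix/ML instructions. All names and sizes are just illustrative, not my actual workload.

```cpp
#include <hip/hip_runtime.h>
#include <cstdint>
#include <cstdio>
#include <vector>

// Purely integer kernel: 32-bit multiply widened into a 64-bit result.
__global__ void int_mad(const int32_t* a, const int32_t* b, int64_t* out, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        // 32-bit loads, 64-bit multiply-accumulate; no FP or MFMA involved.
        int64_t prod = static_cast<int64_t>(a[idx]) * static_cast<int64_t>(b[idx]);
        out[idx] = prod + static_cast<int64_t>(idx);
    }
}

int main()
{
    const int n = 1 << 20;
    std::vector<int32_t> ha(n, 3), hb(n, 7);
    std::vector<int64_t> hout(n, 0);

    int32_t *da, *db;
    int64_t *dout;
    hipMalloc(&da, n * sizeof(int32_t));
    hipMalloc(&db, n * sizeof(int32_t));
    hipMalloc(&dout, n * sizeof(int64_t));
    hipMemcpy(da, ha.data(), n * sizeof(int32_t), hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), n * sizeof(int32_t), hipMemcpyHostToDevice);

    int_mad<<<(n + 255) / 256, 256>>>(da, db, dout, n);
    hipMemcpy(hout.data(), dout, n * sizeof(int64_t), hipMemcpyDeviceToHost);

    printf("out[0] = %lld\n", (long long)hout[0]);

    hipFree(da);
    hipFree(db);
    hipFree(dout);
    return 0;
}
```

What I would like to know is how the VALU in an MI300 XCD handles this kind of code: whether INT32 ops run at the same rate as FP32, and how INT64 operations are decomposed or slowed down relative to INT32.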