With MI200 series accelerators, initially available in a new Open Accelerator Module (OAM) form factor seen in Figure 1 below, scientists can tackle their most pressing challenges—from climate change to vaccine research—using exascale-class supercomputers like the HPE Cray EX Supercomputer or off-the-shelf servers from our partners like ATOS, Gigabyte, Penguin Computing, Supermicro and others.
Figure 1: AMD Instinct™ MI250 accelerator (OAM Module)
When it comes to performance, MI200 series accelerators provide customers with the industry's fastest accelerator, the MI250X, delivering up to 47.9 TFLOPS peak theoretical double-precision (FP64) HPC performance and up to 383 TFLOPS peak theoretical half-precision (FP16) AI performance, as seen in Graph 1 below [1]. In real-world benchmarks that represent the work the Frontier supercomputer is expected to do, the AMD Instinct MI200 series accelerators deliver up to a 3x speedup over competitive data center GPUs available today [2]. Visit the AMD Instinct benchmark page to learn more about real-world application performance of MI200 GPUs.
Graph 1: AMD Instinct™ MI250X accelerator delivers performance for HPC [2].
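As a rough sanity check, the peak theoretical figures can be reproduced from the MI250X unit counts given later in this post (220 Compute Units, 14,080 stream cores). The sketch below is illustrative only. The 1.7 GHz peak engine clock and the per-clock rates are assumptions that this post does not state, and the variable and function names are hypothetical.

```python
# Back-of-the-envelope check of the peak theoretical figures.
# From this post: 220 Compute Units, 14,080 stream cores.
# ASSUMPTIONS (not stated in this post): the peak engine clock and the
# per-clock FLOP rates below.

PEAK_CLOCK_HZ = 1.7e9                 # assumed peak engine clock
COMPUTE_UNITS = 220                   # from this post
STREAM_CORES = 14_080                 # from this post
FP64_FLOPS_PER_CORE_PER_CLOCK = 2     # assumed: one fused multiply-add = 2 FLOPs
FP16_FLOPS_PER_CU_PER_CLOCK = 1024    # assumed Matrix Core rate per Compute Unit


def peak_tflops(units: int, flops_per_unit_per_clock: int, clock_hz: float) -> float:
    """Peak theoretical throughput in TFLOPS: units x FLOPs/clock x clock."""
    return units * flops_per_unit_per_clock * clock_hz / 1e12


fp64_tflops = peak_tflops(STREAM_CORES, FP64_FLOPS_PER_CORE_PER_CLOCK, PEAK_CLOCK_HZ)
fp16_tflops = peak_tflops(COMPUTE_UNITS, FP16_FLOPS_PER_CU_PER_CLOCK, PEAK_CLOCK_HZ)

print(f"FP64 peak: {fp64_tflops:.1f} TFLOPS")
print(f"FP16 peak: {fp16_tflops:.1f} TFLOPS")
```

Under these assumptions the results round to the 47.9 and 383 TFLOPS quoted above. These are theoretical ceilings, not measured application performance.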
How did we manage this performance feat? First, we engineered our next-generation AMD CDNA™ 2 architecture, shrinking our process technology to 6 nanometers and expanding our Matrix Core Technology capabilities with new FP64 Matrix Cores. Then we combined two dies in one package using the same multi-die technology that has made AMD EPYC™ processors the fastest x86 server processors in the world [3]. This allowed us to increase core density in the MI250X accelerator by 83% over our previous-generation GPUs, providing 220 Compute Units with 14,080 stream cores and 880 Matrix Cores [4]. It also allows us to provide customers with an industry-leading 128 GB of HBM2e memory with up to 3.2 TB/s of theoretical memory throughput [5].
Connectivity is the next challenge: how do you move data in and out of the GPUs and between peers in hives (or groups) of four or eight accelerators? The MI200 series accomplishes this with the 3rd Gen AMD Infinity Architecture. It interconnects accelerators within a hive through up to eight AMD Infinity Fabric™ links per MI200 accelerator, delivering up to 800 GB/s of peer-to-peer transfer bandwidth per accelerator [6]. Figure 2 below shows a typical platform with dual AMD EPYC CPUs and eight AMD Instinct MI250X accelerators connected through the AMD Infinity Architecture. This high-speed 3rd Gen Infinity Fabric can connect directly to 3rd Gen AMD EPYC™ CPUs, accelerating data movement among devices. This approach not only speeds data transfer but also enables us to support cache coherency between optimized 3rd Gen AMD EPYC CPUs and MI250X accelerators [6].
Figure 2: Typical server platform diagram with dual 3rd Gen AMD EPYC™ CPUs and eight AMD Instinct™ MI250 accelerators.
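The unit counts and bandwidth figures quoted above are related by simple arithmetic, which the short sketch below works through. It is illustrative only. The previous-generation Compute Unit count of 120 is an assumption that this post does not state, and all names are hypothetical.

```python
# Relating the MI250X figures quoted in this post to one another.

COMPUTE_UNITS = 220            # from this post
STREAM_CORES = 14_080          # from this post
MATRIX_CORES = 880             # from this post
PREV_GEN_COMPUTE_UNITS = 120   # ASSUMPTION: not stated in this post

FABRIC_LINKS = 8               # "up to eight" Infinity Fabric links
AGGREGATE_P2P_GB_PER_S = 800   # "up to 800 GB/s" peer-to-peer

HBM_CAPACITY_GB = 128          # from this post
HBM_BANDWIDTH_TB_PER_S = 3.2   # from this post (peak theoretical)

# Per-Compute-Unit resources implied by the totals.
stream_cores_per_cu = STREAM_CORES // COMPUTE_UNITS
matrix_cores_per_cu = MATRIX_CORES // COMPUTE_UNITS

# Core-density gain over the assumed previous-generation part.
density_gain_pct = (COMPUTE_UNITS / PREV_GEN_COMPUTE_UNITS - 1) * 100

# Peak bandwidth implied per Infinity Fabric link.
per_link_gb_per_s = AGGREGATE_P2P_GB_PER_S / FABRIC_LINKS

# Best-case time to read the whole HBM2e capacity once at peak throughput.
full_sweep_ms = HBM_CAPACITY_GB / (HBM_BANDWIDTH_TB_PER_S * 1000) * 1000

print(f"Stream cores per CU: {stream_cores_per_cu}")
print(f"Matrix Cores per CU: {matrix_cores_per_cu}")
print(f"Core density gain:   {density_gain_pct:.0f}%")
print(f"Per-link bandwidth:  {per_link_gb_per_s:.0f} GB/s")
print(f"Full-memory sweep:   {full_sweep_ms:.0f} ms")
```

The per-link and sweep-time values are upper bounds derived from peak theoretical numbers. Achieved bandwidth depends on the platform topology and workload.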
Just as our packaging meets open standards, so does the AMD ROCm™ 5.0 open software platform. ROCm’s underlying vision has always been to provide open, portable and performant software for accelerated GPU computing. With ROCm 5.0, we’re adding support and optimizations for the MI200, expanding ROCm support to include Radeon™ Pro W6800 Workstation GPUs, and improving the developer tools that increase end-user productivity. ROCm continues to give developers choice, making accelerated software portable across a range of accelerators and helping ensure alignment with industry standards through our collection of open-source software and APIs.
Now you can write your software once and run it practically anywhere. And the AMD Infinity Hub, a collection of advanced GPU software containers and deployment guides for HPC, AI and machine learning applications, is available to help speed up your system deployments and shorten your time to science and discovery.
Today you can power discoveries with the most advanced accelerator available anywhere, combined with the fastest x86 server CPUs and powered by the ROCm 5.0 platform [1,3,7]. If you aren’t already using AMD Instinct accelerators and AMD EPYC processors, today is the time to start.
Learn more about the latest AMD Instinct™ MI200 Series Accelerators.
Visit the AMD Infinity Hub to learn about our AMD Instinct™ supported containers.
Learn more about the 2nd Gen AMD CDNA™ architecture.
Learn more about the AMD ROCm™ open software platform.
Learn more about AMD Instinct™ MI200 series performance.
Guy Ludden is Sr. Product Marketing Mgr. for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.