Evolving system design for High Performance Computing

Today, I had the pleasure of addressing attendees at the 2019 Rice Oil & Gas HPC conference and discussing AMD’s vision for the HPC community and how the required compute power can continue to grow.

With 5.6 million barrels of oil expected to be pumped every day this year, Texas ranks behind only Russia and Saudi Arabia in production. One driver for all that output is technology, including high-performance computing to model oil resources and guide drilling. HPC system architecture has evolved dramatically over the past two decades, from monolithic supercomputers to clusters of industry-standard servers to heterogeneous nodes incorporating CPUs and accelerators such as GPUs. These new architectures have provided an incredible increase in performance and enabled new application areas beyond traditional HPC, most notably Big Data Analytics, Machine Learning, and Artificial Intelligence (AI).

The problem is that the traditional levers for increasing performance are becoming less effective. A more scalable, powerful, and secure approach is required to meet ever-growing demands. Pushing the envelope of computing is the bread and butter of AMD, and there are a few key areas where we see innovation making a significant near-term contribution to HPC.


Chiplet design is one area where the industry is moving to keep delivering performance gains even as the pace of Moore’s Law slows. Chiplets allow more silicon to be used cost-effectively and let companies like AMD match each piece of processor IP to the manufacturing process best suited to it. AMD introduced the chiplet approach in 2017 with AMD EPYC server processors featuring the “Zen” architecture. We are taking it to the next level mid-year with our next-generation 7nm, 64-core EPYC processor (codenamed “Rome”) featuring our “Zen 2” core. We demonstrated Rome in a single-socket configuration running a popular NAMD benchmark, outperforming a 2P Xeon 8180-powered server by an average of up to 15 percent [1]. (See video of demo here)

Next Generation I/O and Fabrics

The AMD “Zen 2” core is an amazing piece of technology that evolves the already legendary “Zen” design, driving the performance of AMD processors to new heights. But for HPC workloads, you must “feed the beast” through connections to peripherals, networks, storage, and memory. Rome is the first x86 server CPU to support PCIe® Gen 4.0, which doubles the bandwidth of each I/O connection compared with PCIe Gen 3.0. We also joined early in supporting new, open standards for coherent fabrics, including CCIX and Gen-Z, that have tremendous potential.
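
To put the PCIe doubling in concrete terms, here is a rough back-of-the-envelope sketch in Python. It assumes the published per-lane signaling rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and the 128b/130b line encoding both generations use. The function name and the x16 default are illustrative choices, and real-world throughput will be somewhat lower once packet and protocol overhead are counted.

```python
# Approximate one-direction PCIe link bandwidth.
# Only line-encoding overhead is modeled; packet/protocol overhead is ignored.

TRANSFER_RATE_GT_S = {"gen3": 8.0, "gen4": 16.0}  # giga-transfers/s per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b encoding (Gen 3 and Gen 4)


def link_bandwidth_gb_s(generation: str, lanes: int = 16) -> float:
    """Return approximate bandwidth in GB/s for one direction of a link.

    `generation` is "gen3" or "gen4"; `lanes` is the link width (hypothetical
    default of 16, a common width for GPUs and network adapters).
    """
    bits_per_second = (
        TRANSFER_RATE_GT_S[generation] * 1e9 * ENCODING_EFFICIENCY * lanes
    )
    return bits_per_second / 8 / 1e9  # bits -> bytes -> gigabytes


if __name__ == "__main__":
    for gen in ("gen3", "gen4"):
        print(f"{gen} x16: {link_bandwidth_gb_s(gen):.2f} GB/s per direction")
```

Under these assumptions an x16 link goes from a little under 16 GB/s to a little under 32 GB/s in each direction, which is the headroom that faster network adapters, NVMe storage, and accelerators can draw on.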

Heterogeneous Processing

The oil and gas industries were some of the first to see the potential for using different processing architectures for different workloads to maximize performance. Combining serial processing CPUs, like AMD EPYC, with high-performance, parallel GPUs, including AMD Radeon Instinct™, is the new normal for the highest performance HPC systems. Other accelerators, like FPGAs, are another exciting option for specialized workloads. And let’s not forget about software: it is the key to unlocking this potential, and open ecosystems like the one AMD established with ROCm are critical. Expect to hear a lot this year about the continued evolution of heterogeneous computing as the industry rallies around open solutions rather than closed, single-vendor options.

I look forward to sharing more perspectives in the year ahead around how AMD views the future of HPC and the datacenter.

  1. Based on AMD internal testing of the NAMD ApoA1 v2.12 benchmark. AMD tests conducted on an AMD reference platform configured with 1 x preproduction EPYC 7nm 64-core SoC, 8 x 32GB DDR4 2666MHz DIMMs, Ubuntu 18.04 with kernel 4.17, and the AOCC 1.3 beta compiler with OpenMPI 4.0, FFTW 3.3.8, and Charm++ 6.7.1 achieved an average of 9.83 ns/day; a Supermicro SYS-1029U-TRTP configured with 2 x Intel Xeon Platinum 8180 CPUs, 12 x 32GB DDR4 2666MHz DIMMs, Ubuntu 18.04 with kernel 4.15, and the ICC 18.0.2 compiler with FFTW 3.3.8 and Charm++ 6.8.2 achieved an average of 8.4 ns/day. ROM-01