Ecosystem Readiness for 3rd Generation AMD EPYC Processors

raghu_nambiar
AMD has a long history of building innovative products that push the boundaries of what is possible. That history continues with the launch of AMD EPYC™ 7003 Series Processors, the fastest server processors in the world. EPYC has once again raised the bar and set a new standard for modern datacenters.

Building upon the innovations of its 2nd generation predecessor, the AMD EPYC™ 7003 Series Processor uses the x86 architecture and 7nm process technology to deliver performance and efficiency through high core counts, high-speed connectivity with PCIe® Gen4, and support for eight channels of high-speed DDR4-3200 memory with a capacity of up to 4 TB per socket.
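To put the memory configuration in perspective, the theoretical peak bandwidth of eight DDR4-3200 channels can be worked out in a few lines of Python. This is a back-of-the-envelope sketch rather than a measured result: it assumes the standard 64-bit (8-byte) data path per DDR4 channel, and the function name is purely illustrative.

```python
def peak_memory_bandwidth_gbs(channels: int,
                              mega_transfers_per_s: int,
                              bus_width_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    Assumption: each channel moves bus_width_bits of data per transfer
    (64 bits for a standard DDR4 channel, ECC bits not counted).
    """
    bytes_per_transfer = bus_width_bits // 8
    bytes_per_second = channels * mega_transfers_per_s * 1e6 * bytes_per_transfer
    return bytes_per_second / 1e9


# Eight channels of DDR4-3200, as described for one EPYC 7003 socket.
per_socket = peak_memory_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)
print(f"Theoretical peak per socket: {per_socket:.1f} GB/s")  # 204.8 GB/s
```

Sustained bandwidth in real workloads is lower than this theoretical figure.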

The AMD EPYC 7003 Series Processors implement the “Zen3” core, bringing leadership performance with high frequencies, increased instructions per clock(15), and an added layer of protection against control flow attacks. A new architectural layout unifies all cores on a die into a single 8-core complex, accelerating core-to-core communication and allowing every core direct access to 32 MB of L3 cache, helping dramatically improve the performance of latency-sensitive applications. Architectural innovations also enable “secure boot” with AMD Secure Root-of-Trust and “secure execution” with both AMD’s Secure Memory Encryption and Secure Encrypted Virtualization. These features are helping bring confidential computing into the mainstream.

With the introduction of this latest generation of EPYC processors there are now 200+ cloud instances based on the AMD EPYC series of processors with more than 100 new server platforms from our OEM partners that will support the new processor. While the previous generation threw down the gauntlet, the AMD EPYC 7003 Series raised the bar even higher. Today, we have over 200 world records across the EPYC family, and more than 100 solutions across a wide spectrum of business needs, all enabled through our ecosystem partners.

Let’s have a look at how the ecosystem around AMD EPYC 7003 Series Processors enables support for your business.

Our partners make us successful

We are grateful to the partners across our broad ecosystem who have collaborated with our engineers to deliver a wide range of datacenter solutions:

Alibaba, Altair, Anjuna, Ansys, ATEME, AWS, Microsoft Azure, Beamr, Blackmagic, Broadcom, Cadence, Canonical, Chaos Group, Cisco, Citrix, Cloudera, Cloudian, Couchbase, Dassault Systèmes, Datastax, Docker, Emerson, ESI Group, Exasol, Excelero, F5, FreeBSD, Google, IBM Cloud, Isotropix, Keysight (Ixia-BP), Liqid, Nvidia, Microchip, Micron, MySQL, NetScout, NGINX, Nokia, Nutanix, Oracle, Pensando, Pivot 3, Quobyte, Red Hat, Redis Labs, Samsung, SAP, SAS, ScaleMP, Schlumberger, Shearwater, Siemens, SkHynix, Splunk, StorMagic, Suse, Synopsys, Tencent, TigerGraph, Transwarp, Velocix, Vertica, VMware, Weka, Western Digital, Xilinx and others.

High Performance Computing (HPC)  

HPC has become more important than ever, touching virtually every aspect of our lives today. Many enterprises have accelerated their use of HPC, taking advantage of advances in hardware and software technology that allow engineers to perform virtual simulations instead of making physical prototypes. This approach has proven to be cost effective, and is typically both faster and safer than traditional methods. It is in use across a broad array of fields, including material science, manufacturing, oil and gas exploration, and healthcare, just to name a few.

The EPYC 7003 Series offers a range of high density and high frequency processors. High density processors offer exceptional server-level performance for problems that are high throughput in nature, while high frequency processors optimize per-core performance for lightly threaded application software stacks that are licensed per core. The AMD EPYC 7003 Series Processors offer significant generational and competitive performance for key HPC workloads(18).

A few examples are shown in the charts below.

Gromacs.png

Ansys.png

Relational Databases

Relational databases continue to be central to mission-critical applications, from transactional operations to decision support systems. Real-time processing of complex transactions and insights through data analytics are crucial to remaining competitive in business today. While the IPC uplift and memory latency improvements help transactional workloads, the memory capacity and IO bandwidth help to improve business query performance for modern decision support systems.

RDB Image 1.png

AMD EPYC 7003 Series processors provide a large memory capacity that a relational database management system (RDBMS) can take advantage of. Relational databases can use large amounts of memory to cache frequently accessed data and metadata for both transactional and query throughput workloads. We are happy to announce new industry leading performance and price-performance results using relational databases with our ecosystem partners.

RDB Image 2.png

Data Analytics

More and more interconnected data is being generated every day. Enterprises of all sizes are realizing the potential for efficiency improvements, cost savings and monetization from data analytics. In today’s data-driven economy, an enterprise’s ability to harness data, transforming it from meaningless bits into actionable insights, is a key differentiator. Research institutions and governments alike are investing in data analytics to solve important problems.

Key to making optimal use of the computing infrastructure for data analytics is the balance of compute power, IO bandwidth and storage capacity. Since our first generation EPYC processors, we have demonstrated our single socket advantage for data analytics by enabling organizations to target the optimal balance for their workloads. The world record benchmark results and validated solutions from our partners clearly demonstrate the performance and low cost of ownership advantages a single processor configuration can have compared to dual socket systems from our competition. It is my pleasure to announce new industry leading performance benchmarks in big data analytics with our ecosystem partners.

Taking our demonstration of the scalability of the AMD EPYC family further, I am excited to announce the first ever TPCx-HS published result at 100 TB(21).

TCP Image 1.png

Caching data and objects in memory can improve throughput and often deliver near real-time data access performance. Redis™ Enterprise is a robust in-memory NoSQL database platform built by the people who develop open source Redis™. It maintains the simplicity and high performance of Redis, while adding many enterprise-grade capabilities, including scalability to hundreds of millions of operations per second and active-active global distribution with local latency. AMD EPYC 7003 Series Processor based systems provide high core counts and a large, unified cache, resulting in low latency and high performance for Redis Enterprise operations. The benchmark test shown demonstrates the operations per second performance of Redis Enterprise, comparing the 3rd Gen AMD EPYC™ based server system with the top of stack Intel Xeon based server system.

Redis.png

Virtual Desktop Infrastructure

VDI is a critical enabler in today’s rapidly changing workforce. VDI provides remote access to key business applications and tools while maintaining the company's data security. As the number of remote workers increases, businesses need the ability to scale their VDI infrastructure to support a larger number of users while maintaining low latency to ensure a good user experience. The chart below shows how our top of stack 3rd Gen AMD EPYC processor performs in the VDI workload using Login VSI™. Login VSI is the industry standard virtual desktop load testing tool that simulates a typical workload of a business user. The benchmark adds active VDI sessions to the server, and as the number of users goes up, the average response time or latency also increases. The benchmark continues to add sessions until the server is fully saturated and can no longer add users while still delivering a good user experience. As shown below, a system using AMD EPYC 7003 Series Processors, with their high-performance cores, industry-leading core density and high memory bandwidth, can deliver more than double the number of virtual desktop users with a good user experience.

VDI Image.png
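The saturation idea described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the Login VSI implementation: the response-time curve, the baseline value and the capacity parameter are all hypothetical, and only the "baseline + 1000 ms" cutoff is taken from the footnote describing the test.

```python
from typing import Callable


def max_sessions_within_threshold(response_time_ms: Callable[[int], float],
                                  baseline_ms: float,
                                  headroom_ms: float = 1000.0,
                                  session_cap: int = 10_000) -> int:
    """Add sessions one at a time until response time exceeds baseline + headroom.

    Returns the largest session count that still meets the threshold.
    """
    limit = baseline_ms + headroom_ms
    sessions = 0
    while sessions < session_cap and response_time_ms(sessions + 1) <= limit:
        sessions += 1
    return sessions


def toy_response_time(sessions: int,
                      baseline_ms: float = 800.0,
                      capacity: int = 300) -> float:
    """Hypothetical latency curve: flat at low load, rising as load nears capacity."""
    return baseline_ms + 1000.0 * (sessions / capacity) ** 2


# A server modelled with twice the capacity sustains twice the sessions.
small = max_sessions_within_threshold(toy_response_time, baseline_ms=800.0)
large = max_sessions_within_threshold(
    lambda n: toy_response_time(n, capacity=600), baseline_ms=800.0)
print(small, large)  # 300 600
```

In a real test the response times come from measured user actions rather than a formula, so the curve is noisier and the saturation point is determined by the tool itself.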

 

Public Clouds

According to IDC research, the hardware infrastructure market has reached the tipping point and cloud infrastructure environments will continue to account for an increasingly higher share, reaching 63.6% of total IT infrastructure budget spend in 2024(14). Currently all major public cloud service providers are using AMD EPYC to power many of their cloud infrastructure compute offerings, delivering approximately 200 cloud instances now, and on track to reach 400+ cloud instances in 2021 along with hundreds of internal and consumable cloud services. Leading cloud providers, including AWS, Alibaba Cloud, Google Cloud, IBM Cloud, Microsoft Azure, Oracle Cloud and Tencent Cloud, have been making their technology investments in AMD EPYC processors to deliver exceptional performance, scalability, and security features in the cloud. Today, the world’s largest and most important cloud services are powered by AMD EPYC, for example: Office365, Microsoft Teams, Tencent Meetings, Twitter and Zoom.

Oracle Cloud Infrastructure (OCI) announced AMD EPYC 7003 series based OCI Flexible E4 cloud instances. OCI E4 instances are general purpose compute instances showing double digit generational performance uplift as shown below.

OCI 1.png

Microsoft Azure announced AMD EPYC 7003 series powered HBv3 instances, which deliver leadership-class performance, message passing interface (MPI) scalability, and cost efficiency for a variety of real-world HPC workloads. The HPC-optimized Azure HBv3-series instances also deliver significantly higher generational performance compared to prior generation HBv2 instances, as shown below.

Azure 1.png

Private Clouds

A private cloud delivers the public cloud experience in cases where data gravity, compliance, and economics limit migration to public clouds. AMD has been working with our ecosystem partners to create optimal private cloud solutions based on traditional models, where compute and storage are separate, as well as full Hyperconverged Infrastructure (HCI), a software abstraction of the traditional enterprise storage architecture. The traditional models offer the best performance and scalability, while HCI offers a privately managed cloud experience along with its attendant efficiencies and agility.

VMmark® is a tool used to measure the performance and scalability of a virtualization platform using application workloads and infrastructure tasks commonly found in data centers. The VMmark benchmark combines commonly virtualized applications into predefined bundles called "tiles". The VMmark 3.1 score is determined by the number of VMmark tiles a virtualization platform can run, as well as the cumulative performance of those tiles on a variety of platform-level workloads.

VMware.png

The joint collaboration between VMware and AMD engineering teams recently resulted in additional performance enhancements in vSphere® 7.0U2 on servers powered by AMD EPYC processors. Please see VMware’s blog here and a detailed white paper here.

Confidential Computing

Confidential Computing is a game-changing paradigm shift for computing in the private and public clouds as well as hosted services. It addresses key security concerns many organizations have about hosting their sensitive applications in multi-tenant environments by safeguarding their most valuable information while in use by their applications. AMD is the first to bring hardware-accelerated encryption capability to enable confidential computing with limited performance impact and complete transparency to the applications. Google and Microsoft have announced their Confidential VM availability plans using the 3rd generation AMD EPYC 7003 series processors. In addition, VMware announced Confidential vSphere® Container Pods in vSphere 7.0 U2, bringing confidential computing to private clouds. You can read about it in VMware’s blog here.

Footnotes

  1. As of 3/17/2020, 8-node AMD EPYC™ 75F3 HBase IoTps 1,617,545 http://tpc.org/5760; 5-node AMD EPYC™ 7502P HBase IoTps 742,256 http://tpc.org/5756; 1,617,545/742,256=2.18x
  2. As of 3/17/2020, 10-node AMD EPYC™ 75F3 Spark framework HSph@1TB 19.92 http://tpc.org/5549; 10-node AMD EPYC™ 7542 Spark framework HSph@1TB 11.73 http://tpc.org/5535; 19.92/11.73=1.70x
  3. As of 3/17/2020, 17-node AMD EPYC™ 75F3 HSph@3TB 34.52 http://tpc.org/5548; 17-node Intel Xeon 6262V HSph@3TB 21.52 http://tpc.org/5544; 34.52/21.52=1.60x
  4. As of 3/17/2020, 2P AMD EPYC™ 7763 tpsE 12028 http://tpc.org/4088; 4P Intel Xeon 8280 tpsE 7013 http://tpc.org/4084; 12028/7013=1.71x
  5. As of 3/17/2020, 2P AMD EPYC™ 7763 SAPS 411,970 https://www.sap.com/dmc/benchmark/2021/Cert21021.pdf; 4P Intel Xeon 8280 SAPS 380,280 https://www.sap.com/dmc/benchmark/2019/Cert19045.pdf; 411,970/380,280=1.08x
  6. As of 3/17/2020, 1P AMD EPYC™ 7763 SAPS 214,120 https://www.sap.com/dmc/benchmark/2021/Cert21020.pdf; 2P Intel Xeon 6258R SAPS 180,100 https://www.sap.com/dmc/benchmark/2020/Cert20015.pdf; 214,120/180,100=1.19x
  7. As of 3/17/2020, 2P AMD EPYC™ 7763 QphH@10000GB 1,883,497 http://tpc.org/3351; 4P Intel Xeon 8280M QphH@10000GB 1,651,514 http://tpc.org/3337; 1,883,497/1,651,514=1.14x
  8. As of 3/17/2020, 1P AMD EPYC™ 7763 QphH@3000GB 1,346,932 http://tpc.org/3352; 2P Intel Xeon 8280 QphH@3000GB 1,244,450 http://tpc.org/3336; 1,346,932/1,244,450=1.08x
  9. As of 3/17/2020, 2P AMD EPYC™ 7713 TpsV 2800 http://tpc.org/5304; 2P AMD EPYC™ 7702 TpsV 2280 http://tpc.org/5302; 2800/2280=1.40x
  10. MLN-005: Login VSI™ Pro v4.1.40.1 comparison based on AMD internal testing as of 02/01/2021 measuring the maximum “knowledge worker” desktop sessions within VSI Baseline +1000ms response time divided by the number of cores using VMware ESXi 7.0u1 and VMware Horizon 8 on a server using 2x AMD EPYC 7543 versus a server using 2x Intel Xeon Gold 6258R for ~46% more [~1.5x the] performance. Results may vary.
  11. MLN-063: VMmark® 3.1.x SAN score comparison based on highest 4-host, 2-socket systems score 33.58 @ 36 Tiles, https://www.vmware.com/products/vmmark/results3x.0.html?sort=score as of 03/10/2021 on a 4-host, 2x AMD EPYC 7763 versus a 4-host, 2x Intel Xeon Gold 6252 (12.09 Score @ 12 tiles, https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2019-11-19-Fujitsu-PRIMERGY...)
  12. MLN-064A: VMmark® 3.1.1 performance score in a vSAN configuration based on highest 4-host, 2-socket systems published at https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2021-03-24-DellEMC-PowerEdg... of 29.73 @ 32 tiles, https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2021-03-24-DellEMC-PowerEdg... as of 4.1.2021, and the highest Intel based vSAN configuration utilizing a 4-host, 2x Intel Xeon Platinum 8268, 10.63 @ 12 tiles (https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2020-06-30-Supermicro-SYS-2...). VMmark® is a product of VMware, Inc.
  13. MLN-084: An AMD EPYC 7713, four node, dual-socket VMmark 3.1.1 on vSAN results are: 29.73 @ 32 tiles. Published on 3/24/2021. https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2021-03-24-DellEMC-PowerEdg...; AMD EPYC 7742, four node, dual-socket VMmark 3.1 on vSAN results are: 24.08 @ 28 tiles. Published on 4/28/2020. https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vmmark/2020-04-28-DellEMC-PowerEdg...
  14. https://www.idc.com/getdoc.jsp?containerId=prUS46895020
  15. MLN-003: Based on AMD internal testing as of 02/1/2021, average performance improvement at ISO-frequency on an AMD EPYC™ 72F3 (8C/8T, 3.7GHz) compared to an AMD EPYC™ 7F32 (8C/8T, 3.7GHz), per-core, single thread, using a select set of workloads including SPECrate®2017_int_base, SPECrate®2017_fp_base, and representative server workloads. SPEC® and SPECrate® are registered trademarks of Standard Performance Evaluation Corporation. Learn more at spec.org.
  16. GROMACS: https://www.amd.com/system/files/documents/gromacs-performance-amd-epyc7003-series-processors.pdf MLN-043: WRF version 4.1.5 forecasts per day comparison based on AMD internal testing completed on 2/17/2021 on a reference platform with 2x EPYC™ 75F3 (32C) compared to an Intel server on a production system with 2x Intel® Xeon® Gold 6258R (28C) processors. Results may vary. OpenFoam: https://www.amd.com/system/files/documents/amd-epyc-7003-sb-hpc-openfoam-cfd.pdf.
  17. MLN-048A: ANSYS® CFX® 2021.1 comparison based on AMD internal testing as of 02/05/2021 measuring the time to run the Release 14.0 test case simulations (converted to jobs/day - higher is better) using a server with 2x AMD EPYC 75F3 utilizing 1TB (16x 64 GB DDR4-3200) versus 2x Intel Xeon Gold 6258R utilizing 384 GB (12x 32 GB DDR4-3200). The External Flow Over a LeMans Car test case individually was 112% [2.1x the] per node or 85% per core performance. Results may vary. MLN-049A: ANSYS® LS-DYNA® version 2021.1 comparison based on AMD internal testing as of 02/05/2021 measuring the time to run 3cars, test case simulation (converted to jobs/day - higher is better) Configurations using a server with 2x AMD EPYC 75F3 versus a server with 2x Intel Xeon Gold 6258R utilizing 384 GB (12x 32 GB DDR4-3200). The 3cars test case gain individually was 126% [~2.26x the] per node or ~98% per core jobs/day performance. Results may vary. ACUSOLVE: https://www.amd.com/system/files/documents/altair-acuSolve-performance-amd-epyc7003-series-processor...
  18. MLN-42: AMD internal testing (2021-02-25) running VCS simulation of simple graphics engine draw and asynchronous compute dispatch 2020.03 using 2P EPYC 72F3 and 2P EPYC 7F32 processors on AMD reference systems. Results may vary. MLN-046: STREAM Triad GB/s comparison based on AMD internal testing as of 02/01/2021 on a server with 2x AMD EPYC 7763 versus the 2x AMD EPYC 7742 processors score. Results may vary.
  19. OCI Generational Performance Uplift: Note: Cloud performance results presented are based on the test date in the configuration and are in alignment with AMD internal bare-metal testing factoring in cloud service provider overhead. Results may vary due to changes to the underlying configuration, and other conditions such as the placement of the VM and its resources, optimizations by the cloud service provider, accessed cloud regions, co-tenants, and the types of other workloads exercised at the same time on the system.
    1. MLNC-001: Redis™ Enterprise Memtier workload results based on AMD internal testing as of 3/8/2021 measuring generational scale-up operations/second median throughput on Flex.Standard.E4.4, 8, and 16 OCPU using 3rd Gen AMD EPYC CPUs powered instances versus the comparable Flex.Standard.E3 4, 8 and 16 OCPU 2nd Gen AMD EPYC CPU powered instances, for a ~31% average uplift across instances tested.
    2. MLNC-002: Redis™ Enterprise Memtier workload results based on AMD internal testing as of 3/8/2021 measuring generational scale-out operations/second median throughput on 1, 3, and 5 node instances using Flex.Standard.E4.4 OCPU 3rd Gen AMD EPYC CPUs powered instances versus 1, 3 and 5 node Flex.Standard.E3.4 OCPU 2nd Gen AMD EPYC CPU powered instances, for a ~30% average uplift across instances tested.
    3. MLNC-003: RocksDB db_bench workload results based on AMD internal testing as of 3/8/2021 measuring the generational operations/second median throughput on one instance using Flex.Standard.E4.4 OCPU 3rd Gen AMD EPYC CPUs powered instances versus Flex.Standard.E3.4 OCPU 2nd Gen AMD EPYC CPU powered instances.
    4. MLNC-004: VP9 encoder FFmpeg using ducks_take_off1080p50 workload results based on AMD internal testing as of 3/8/2021 measuring the generational median encoding frames/hour on one instance using Flex.Standard.E4.16 OCPU powered by 3rd Gen AMD EPYC CPUs versus one instance using Flex.Standard.E3.16 OCPU powered by 2nd Gen AMD EPYC CPU.
    5. MLNC-005: MySQL database HammerDB TPROC-C workload results based on AMD internal testing as of 3/8/2021 measuring the generational median tpm on Flex.Standard.E4.4, 8, and 16 OCPU using 3rd Gen AMD EPYC CPUs powered instances versus the comparable 4, 8 and 16 OCPU Flex.Standard.E3 2nd Gen AMD EPYC CPU powered instances for a ~12% average uplift across instances tested. The HammerDB TPROC-C workload is an open source workload derived from the TPC-C Benchmark Standard and as such is not comparable to published TPC-C results, as the results comply with a subset rather than the full TPC-C Benchmark Standard.
    6. MLNC-006: MySQL database HammerDB TPROC-H workload results based on AMD internal testing as of 3/8/2021 measuring the generational median qph on Flex.Standard.E4.4, 8, and 16 OCPU using 3rd Gen AMD EPYC CPUs powered instances versus the comparable Flex.Standard.E3 2nd Gen AMD EPYC CPU powered instances for a ~13% average uplift across instances tested. The HammerDB TPROC-H workload is an open source workload derived from the TPC-H Benchmark Standard and as such is not comparable to published TPC-H results, as the results do not comply with the TPC-H Benchmark Standard.
  20. https://azure.microsoft.com/en-us/blog/more-performance-and-choice-with-new-azure-hbv3-virtual-machi... Data provided by Microsoft and not independently verified by AMD.
  21. As of 3/23/2020, 17-node AMD EPYC™ 75F3, 43.76 HSph@100TB - http://tpc.org/5552.
  22. MLN-052: Redis™ NoSQL comparison based on AMD internal testing as of 2/12/2021 measuring the Memtier throughput (median ops/sec) test on a server with 2x AMD EPYC 7763 versus 2x Intel Xeon Gold 6258R for ~99.4% more [~2x the] performance.
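The "x" multipliers quoted in the benchmark footnotes are simple quotients of the two published scores. As a minimal sketch, the Python below recomputes three of them from the scores printed in footnotes 1, 6 and 7; the helper name and the dictionary layout are illustrative choices, and rounding to two decimals is an assumption about how the figures were presented.

```python
def speedup(new_score: float, baseline_score: float) -> float:
    """Ratio of the newer system's score to the baseline system's score."""
    return new_score / baseline_score


# (new score, baseline score) pairs copied from the footnotes above.
footnote_scores = {
    "1 (IoTps)": (1_617_545, 742_256),
    "6 (SAPS)": (214_120, 180_100),
    "7 (QphH@10000GB)": (1_883_497, 1_651_514),
}

for label, (new, base) in footnote_scores.items():
    print(f"Footnote {label}: {speedup(new, base):.2f}x")
# Footnote 1 (IoTps): 2.18x
# Footnote 6 (SAPS): 1.19x
# Footnote 7 (QphH@10000GB): 1.14x
```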

 

About the Author
Raghu Nambiar is the Corporate Vice President of Datacenter Ecosystems and Solutions at AMD. In this role, he leads engineering teams and their collaboration with ecosystem partners. Raghu has more than 20 years of technology industry experience across a number of engineering organizations. He was previously the CTO of the Cisco UCS business and played an instrumental role in accelerating the growth of Cisco UCS into a top data center compute platform. He has spent his entire career working on software and hardware ecosystems for data centers, in both research and business use cases.