From high-performance computing, deep learning, and rendering systems to cloud computing, training complex neural networks, and AMD’s ROCm open ecosystem, these blogs offer insights and updates on our products and solutions.
In the dynamic landscape of technology, particularly in GPU-accelerated computing, staying current with the latest developments is paramount. AMD has been consistently pushing boundaries with its AMD ROCm™ Software stack, catering to the diverse needs of developers and researchers harnessing the power of GPUs for computation-intensive AI and HPC applications.
We are at a stage in our product ramp where we are consistently identifying new paths to unlock performance with our ROCm software and AMD Instinct MI300 accelerators. We have made significant progress since the data we recorded in November for our launch event, and we are delighted to share our latest results highlighting these gains.
These gains show that the AMD Instinct MI300X with ROCm 6 continues to deliver leadership inference performance using the popular FP16 datatype and the vLLM inference library, compared to the Nvidia H100 using TensorRT-LLM with FP16 or FP8 datatypes.
The newest family of AMD accelerators, the AMD Instinct™ MI300 Series, featuring the third-generation Compute DNA (AMD CDNA™ 3) architecture, offers two distinct variants designed to address the AI and HPC markets.
Most Machine Learning (ML) engineers use the single-precision (FP32) datatype for developing ML models. TensorFloat32 (TF32) has recently become popular as a drop-in replacement for these FP32-based models. However, there is a pressing need to provide additional performance gains for these models by using faster datatypes, such as BFloat16 (BF16), without requiring additional code changes.
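To see why BF16 can act as a drop-in replacement for FP32, note that BF16 keeps FP32's full 8-bit exponent (so the dynamic range is preserved) and simply drops the lower 16 bits of the mantissa. The sketch below, a simplified illustration using only the Python standard library (real hardware converters typically round to nearest even rather than truncate), shows the conversion and the resulting precision loss:

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate an FP32 value to its top 16 bits (a simplified BF16).

    BF16 shares FP32's 1 sign bit and 8 exponent bits but keeps only
    7 of FP32's 23 mantissa bits, so range is preserved while
    precision is reduced.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 16  # keep sign + exponent + top 7 mantissa bits

def bf16_to_f32(bf16_bits: int) -> float:
    """Re-expand BF16 bits to FP32 by zero-filling the dropped mantissa bits."""
    return struct.unpack(">f", struct.pack(">I", bf16_bits << 16))[0]

x = 3.14159265
bf = bf16_to_f32(f32_to_bf16_bits(x))
# bf is 3.140625: the exponent (and hence the magnitude) survives intact,
# but only about 2-3 decimal digits of mantissa precision remain.
```

Because the exponent field is unchanged, any value representable in FP32 stays finite in BF16, which is what lets frameworks swap the datatype under existing FP32 model code without rescaling (unlike FP16, whose 5-bit exponent can overflow).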