Instinct Accelerators Blog


With a rapid shift to blended education and both synchronous and asynchronous remote learning, we have seen a surge in demand for AMD-powered Microsoft Azure NVv4 instances within education. These virtual machines are specified with 2nd Gen AMD EPYC CPUs and AMD Radeon Instinct GPUs to economically support the typical graphical challenges of digitized curriculum delivery.



PTC Creo 3D CAD software has been certified to run on NVv4 instances of Microsoft Azure, an announcement that opens the door to bringing the many benefits of virtualization and the Cloud to design and manufacturing.



The predominance of remote work we’re seeing now is more than just a reaction to near-term global challenges. It is actually the acceleration of a trend that has been building momentum for years and has been reflected in a multitude of changes in technology and behavior. Stated simply though, it is the idea that organizations need to provide their end-users with the ability to access the essential digital tools they rely on daily, no matter where they are, using an ever-growing assortment of devices, and often while connecting over unpredictable networks. 

With that as a backdrop, AMD, together with partners Citrix and Microsoft, recently hosted a webinar that I think will be of interest to anyone who wants a better understanding of how to deliver secure, highly capable, modern desktop and workstation experiences from the Cloud. Even more exciting may be the realization of just how much is possible today compared to a few years ago. 

This collaboration has special significance as it reflects the intertwining of technologies that, taken together, deliver a seamless user experience for businesses regardless of location, device and workload.


Powering modern desktops and workstations from the cloud

With its new AMD-powered NVv4 instances, Microsoft Azure has created the cloud platform needed to support work-from-anywhere flexibility, while also delivering the IT management and security needed to preserve an organization’s integrity. For the first time, it is possible to remotely deliver the computing resources needed by any type of user, whether they require a basic desktop for everyday office productivity tasks, the performance of a full professional workstation, or anything in between.

Most modern applications rely on some GPU processing capability to perform their best. The use of AMD EPYC processors and Radeon Instinct GPUs matters because it lets NVv4 split GPU and CPU processing to accommodate different workloads, even sharing the capacity of one GPU among multiple users. SR-IOV technology helps deliver predictable performance for each user, while also ensuring that each VM is physically isolated from the others for further security confidence.

Having Citrix in the mix provides the remoting protocols and tools needed to ensure users enjoy the kind of effortless experience they’d expect from an expensive PC in their company headquarters, even while sitting on the couch in their living room. Citrix is an expert in overcoming challenges such as low bandwidth and high latency remote connections that would otherwise degrade productivity. Combined with mature and robust management tools and controls, Citrix cloud services ensure the utmost flexibility and functionality for IT managers deploying with Azure. 

Check out the webinar to learn more about delivering the seamless remote desktop and workstation experiences needed by today’s business workloads.

For more resources:

  • Citrix cloud services: Link
  • Azure NVv4 Microsoft GA blog: Link
  • Azure NVv4 pricing: Link
  • Link

George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


Amazon recently announced a new family of GPU accelerated Virtual Machine (VM) instances available soon on AWS. Powered by AMD 2nd Gen EPYC™ processors and new AMD Radeon™ Pro V520 graphics, the Amazon EC2 G4ad instance is designed to support demanding video and 3D graphical applications and workloads - supplied with free use of Amazon’s industry leading NICE DCV technologies.

We have been very impressed with the NICE DCV functionality internally and thought we’d share a little more insight into it today.

What is NICE DCV?

Amazon themselves describe NICE DCV rather well and succinctly: “NICE DCV is a high-performance remote display protocol that provides customers with a secure way to deliver remote desktops and application streaming from any cloud or data center to any device, over varying network conditions. With NICE DCV and Amazon EC2, customers can run graphics-intensive applications remotely on EC2 instances, and stream their user interface to simpler client machines, eliminating the need for expensive dedicated workstations. Customers across a broad range of HPC workloads use NICE DCV for their remote visualization requirements. The NICE DCV streaming protocol is also utilized by popular services, like Amazon Appstream 2.0 and AWS RoboMaker.”

NICE DCV is a mature, efficient and sophisticated remoting protocol providing comparable functionality to protocols such as Citrix HDX/ICA, Microsoft RDP and standalone Teradici PCoIP Ultra. Being provided by Amazon themselves, there is no additional charge to use NICE DCV on Amazon EC2. You pay only for the EC2 resources you use to run and store your workloads and can avoid the need for a third-party high-end protocol unless your needs are exceptionally niche.

Some History - How Amazon have invested in graphical protocol technologies

It’s now several years since Amazon bought NICE and their DCV and EnginFrame products. NICE were extremely good at what they did. For a long time, they were one of the few vendors who could offer a decent VDI solution that supported Linux VMs. With a history in HPC and Linux, they truly understood virtualization and compute as well as graphics. They’d also developed their own remoting protocol, and it was one of the first to leverage GPUs for tasks like H.264 encode.

Because they supported Linux VMs at a time when most VDI vendors were Windows only, NICE had a strong lead in the wider Cloud market. Amazon’s acquisition of one of the best and most experienced protocol teams, and the heavy investments they have subsequently made, have allowed AWS to support a very compelling platform for the remote delivery of heavyweight graphical and CAD software titles. Back in December 2016, Amazon announced that they’d throw NICE DCV in for free on AWS instances. Before that, NICE DCV was a well-proven product with standalone customers, and for many users it has long offered an alternative to Windows-only VDI offerings.

Graphical Applications need GPU Support

As a high-end protocol, NICE DCV is often used with demanding video workloads and graphically demanding 3D and CAD/AEC applications. These are exactly the types of applications that demand, and benefit from, a GPU. The Amazon EC2 G4ad instances include AMD’s SR-IOV based GPU virtualization technologies, enabling GPU support on AWS at an unmatched low cost. Because it is a bandwidth-adaptive streaming protocol, NICE DCV can provide near real-time responsiveness for your applications without compromising on the accuracy of the image, whilst avoiding unnecessary bandwidth use.

More than just graphics!

There’s far more to NICE DCV, which offers a full portfolio of enterprise features to support a fully interactive experience, including:

  • Audio codecs and technologies
  • USB device remotization to support 3D pointing devices, Space mice, and similar
  • Graphics tablet and stylus support with pressure sensitivity
  • File transfer protocols
  • Printing support
  • Smart card support
  • Session to client cut-n-paste support and control
  • Native clients can support two monitors at 4K resolution each

What’s New

Amazon recently (Nov 2020) released their latest version of NICE DCV, including significant performance optimizations for high frame rate usage, typically associated with GPU accelerated VMs.

Looking forward with Amazon EC2 G4ad instances

As we explore more of the capabilities and benefits of AMD-powered Amazon EC2 G4ad instances, we will continue to write blogs that provide more insight into the different ways they can help businesses and industries improve efficiencies and modernize IT systems for remote and work from home (WFH) deployments. Keep up to date with the latest news and follow the AMD Instinct channel.



William Myrhang is a Sr. Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.




The world looks very different from a year ago, but one thing remains the same. Whether people are remote or in the office, employees including professional designers, game developers, and engineers need access to workstation-class performance to run their applications and to be their most productive.



This week at Supercomputing ’20, AMD unveiled our new AMD Instinct™ MI100 accelerator, the world’s fastest HPC GPU accelerator for scientific workloads and the first to surpass the 10 teraflops (FP64) performance barrier. 




Learn how AMD EPYC Processors are powering the new HPE Apollo 6500 Gen10 Plus System: the GPU workhorse for HPC and AI workloads.



Ever since Microsoft’s introduction of its new NVv4 instances in Microsoft Azure, a lot of attention has been rightly focused on the underlying technology. And to be sure, the 2nd Gen AMD EPYC processors and AMD Radeon Instinct GPUs underpinning the NVv4 physical platform enable state-of-the-art virtualization solutions for GPU-accelerated workloads in the public cloud.

But while everybody has been busy looking at the “speeds and feeds”, I think the experiences and functionality made possible by NVv4 are more interesting to talk about. With that in mind, I thought it would be useful to begin a series of blogs to help answer the questions, “what can you do with NVv4,” and “what does that experience look like,” from the perspective of end-users: the workers, makers, do-ers, and creators who will most directly make daily use of the offering.


To be clear, GPU acceleration in the cloud is not new. However, NVv4 rewrites the rules in substantial ways. With the arrival of NVv4, GPU acceleration (and virtualization!) in the public cloud is finally coming of age. NVv4 allows for fine-grained provisioning of virtual machines in a golden zone of matched price and performance across the broadest range of requirements for cloud-based Windows 10 desktops, as well as interactive and immersive applications. In short, NVv4 is the multitool of the modern visual cloud.


In the past, GPUs in the cloud were limited to one GPU to one user, and they were dang expensive! While it was possible to provide GPU support this way, an enterprise had to pay quite a bit to reserve the needed cloud resources. NVv4 is different because it enables one to right-size cloud-based GPU capacity and performance to fit the job. This flexibility makes a wider range of options feasible to better support both office productivity and power users.


Why does GPU support matter?

One of the challenges for cloud-based operations has been the fact that many, and frankly most, modern applications require GPU acceleration to run smoothly and effectively. That goes not only for high-end design software. The everyday business productivity applications used by millions of workers need GPU support too, including applications such as Microsoft Office, word processing, spreadsheets, Microsoft PowerPoint, and video conferencing.


Added to that, many offices have a population of power users who need to work with media, from a bit of light video/photo editing to desktop design. They expect the application experience to be snappy. In fact, application responsiveness is vital for people to stay in their creative flow, whether that is crunching numbers, drafting a document, or designing a newsletter. GPU support is the secret ingredient that can make sure they all enjoy a great user experience, which leads to better productivity and happy workers.

What is a great user experience?

In the context of the Cloud, a great user experience is one that is comparable to a modern local desktop experience. Of course, that starts with basic application responsiveness. But it also refers to must-have capabilities such as support for high-resolution monitors, multiple monitors, and rich multimedia capabilities. Simply put, when the work moves to the Cloud, users should not be asked to make tradeoffs to get there.


How does NVv4 deliver?

To support workers from the cloud, NVv4 for Azure uses Windows Virtual Desktop. Since the applications used are not going to demand all the resources of a high-performance GPU, the platform is designed so that a number of people can happily share the resources of a single CPU and a single GPU. This session-based, desktop as a service (DaaS) solution is well suited to supporting large numbers of workers while making efficient use of processing resources and, therefore, budgets.


NVv4 would be compelling even if it only migrated existing office workflows to the Cloud. Not only can this approach deliver a great experience to the worker’s desktop, but it opens remarkable opportunities for flexibility and mobility. Because workers no longer need a high-powered desktop or laptop, NVv4 desktop virtualization, combined with ubiquitous access to broadband services at home and on the road, means more people can be as productive away from their desks as they’ve been in the office.

Things are a lot different today than they were even a year ago in terms of options, pricing, and capabilities for virtualized environments for the majority of office work. Ultimately NVv4 presents an opportunity to revisit and challenge our preconceptions about what is possible for GPU accelerated DaaS from the Cloud.

Side by side comparison of a session-based deployment - with and without GPU enabled

In future blogs, we will dig more deeply into both the user experience and the IT management considerations. In particular, I am excited to explore how NVv4 and its AMD-powered GPU support can solve challenges for workers in different industries such as design, manufacturing, architecture, engineering, construction, finance, and others. The opportunities extend far and wide!


Other resources to consider: 


Adam Glick is a Technical Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied. Use of third-party marks / logos/ products is for informational purposes only and no endorsement of or by AMD is intended or implied.


With the arrival of NVv4 instances for Microsoft Azure, decision-makers in many industries, education and government are asking themselves whether cloud-based virtualized desktops can meet their stringent requirements, both in terms of high productivity and financial feasibility.


With the further announcement of NVv4 being successfully tested and recommended by Esri for its flagship ArcGIS Pro applications, the answer is now a clear yes! After undergoing rigorous testing and detailed evaluation, IT managers and users have the assurance of reliability they need to take their trusted workstation and desktop working environments to the Cloud. This verification and validation is critical because it provides affirmation that NVv4 has been carefully evaluated by Esri to ensure that it is fully optimized to meet the expectations of Esri users and that they can rely on a fully vendor supported solution.


So what is NVv4?

NVv4 instances for Azure are virtualization solutions that use the power of 2nd generation AMD EPYC processors and Radeon Instinct GPUs from the Cloud. The close, balanced interplay between these resources is the key to making affordable, fully cloud-based desktop environments capable of addressing the computing needs of a wide variety of workers, from those using everyday office productivity applications to full-blown high-performance workstation tools.


The Opportunity for Esri Workflows

Complex GIS (Geographic Information System) software such as ArcGIS Pro requires GPU support to deliver a smooth, reliable user experience. However, not all applications or use cases can make use of the capabilities of a complete server GPU. In the past, this has been a limiting factor to mass adoption as the only available option was to dedicate an entire server GPU (16GB) in Azure to each user’s VM. This was an inefficient and costly approach. While the most demanding visualization power users, data analysts or geophysicists may very well require the NVv4 option of a full, dedicated GPU to support their workflow, a desktop user viewing and modifying data may only require one-eighth of a GPU (2GB) to have a great experience from the Cloud.


One of the significant innovations found in NVv4 is fractional GPU capability. Made possible by AMD’s implementation of SR-IOV technology in its AMD Radeon Instinct GPUs, fractional GPU means that individual AMD GPUs in Azure can be shared among multiple users. With NVv4, each individual user enjoys an experience comparable to that which they would expect from a locally installed GPU, even when the GPU they access is shared among multiple users. Hardware resources are physically isolated, separating each VM from others even when a GPU is shared, which helps ensure security within the environment. Optimizations resulting from the collaborative effort of Microsoft, Esri, and AMD further underpin the powerful experience for the user.
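The arithmetic behind fractional GPU sharing can be sketched in a few lines of Python. This is only an illustration, not how Azure or AMD implement partitioning: the 16GB full-GPU figure and the one-eighth (2GB) example come from the post above, while the function name `partition` and the idea of passing in other fractions are assumptions made for the example.

```python
from fractions import Fraction

# Full GPU memory quoted in the post for a dedicated server GPU.
FULL_GPU_MEMORY_GB = 16


def partition(fraction: Fraction) -> dict:
    """Return GPU memory per VM and VMs per physical GPU for one GPU fraction.

    `fraction` is the share of a physical GPU given to each VM,
    for example Fraction(1, 8) for a one-eighth slice.
    """
    if not (0 < fraction <= 1):
        raise ValueError("fraction must be in the range (0, 1]")
    memory_gb = float(FULL_GPU_MEMORY_GB * fraction)
    vms_per_gpu = int(1 / fraction)  # whole VMs that fit on one GPU
    return {"memory_gb": memory_gb, "vms_per_gpu": vms_per_gpu}


if __name__ == "__main__":
    # One-eighth of a 16GB GPU is the 2GB slice mentioned in the post.
    print(partition(Fraction(1, 8)))
    # A full, dedicated GPU for the most demanding users.
    print(partition(Fraction(1, 1)))
```

Using `Fraction` rather than floating point keeps the slice sizes exact, so a one-eighth share always works out to exactly eight VMs per GPU.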


Further Information

With demand for validated remote and home working solutions rising, Esri have released a number of resources documenting their support for the NVv4 instances. These include a detailed whitepaper, “ArcGIS Pro Virtualization”, and a collection of resources targeted at Higher Education, including architectures to support remote working and online classes and labs, as well as on campus: “Virtualization of ArcGIS from the Cloud and On-Premise platforms to support Higher Education”.


Esri have also released, on their own site, a detailed guide to the performance, functionality and benchmarking tests they performed on NVv4, alongside resource planning advice to aid those wanting to choose between NVv4 options for specific use cases; see “ArcGIS Pro on the Azure NVv4-series”.


Esri testing and endorsement may rewrite the rules that dictate where and how people work. Even the most demanding application requirements can be addressed from wherever the user is located and using whatever device is available to them. One need no longer be shackled to high-performance workstations: engineers, geologists, data analysts and data visualization experts can access their Esri tools whenever and wherever work or life takes them.


For more resources:

  • NVv4 Microsoft GA blog: Link
  • NVv4 pricing: Link
  • com/Nvv4: Link
  • NVv4 for Education: Link
  • NVv4 for Design and Manufacturing: Link
  • NVv4 for Architecture, Engineering and Construction (AEC): Link
  • ESRI NVv4 blog: Link
  • ESRI in higher education: Link


George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.



With the arrival of NVv4 instances for Microsoft Azure, decision-makers in many industries and universities are asking themselves whether cloud-based virtualized desktops can meet their stringent requirements, both in terms of high productivity and financial feasibility.


With the further announcement of NVv4 certification by Autodesk for its AutoCAD, Revit, and Inventor applications, the answer is now a clear yes. After undergoing rigorous testing and detailed evaluation, IT managers and users have the assurance of reliability and the support from Autodesk they need to take their trusted workstation and desktop working environments to the Cloud. This certification is critical because it provides affirmation that NVv4 has been carefully evaluated by Autodesk and is fully supported, meeting the expectations of Autodesk’s 3D CAD, AEC and VFX users.


So what is NVv4?

NVv4 instances for Azure are virtualization solutions that use the power of 2nd generation AMD EPYC processors and Radeon Instinct GPUs from the Cloud. The close, balanced interplay between these resources is the key to making cost-effective, fully cloud-based desktop environments capable of addressing the computing needs of a wide variety of workers, from those using everyday office productivity applications to full-blown high-performance workstation tools.


The Opportunity for Autodesk Workflows

The AutoCAD, Revit and Inventor applications from Autodesk require GPU support to deliver a smooth, reliable user experience. However, not all applications or use cases make use of the capabilities of a complete server GPU. In the past, this has been a limiting factor as the only available option was to dedicate an entire server GPU (often 16GB per user) in Azure to each user’s VM. This was an inefficient and costly approach, limiting server density. While the most demanding design visualization power users may very well require a full, dedicated GPU to support their workflow, a desktop user preparing technical publications may only require one-eighth of a GPU to have a great experience from the Cloud. 


One of the significant innovations found in NVv4 is fractional GPU capability. Made possible by AMD’s implementation of SR-IOV technology in its Radeon Instinct GPUs, fractional GPU means that individual AMD GPUs in Azure NVv4 instances can be shared among multiple users. With NVv4, each individual user can enjoy an experience comparable to that which they would expect from a locally installed professional grade GPU including professional drivers, even when the GPU they access is shared among multiple users. Hardware resources are physically isolated, separating each VM from others even when a GPU is shared, which helps ensure security within the environment. Optimizations resulting from the collaborative effort of Microsoft, Autodesk, and AMD further underpin the powerful experience for the user.


Autodesk certification means the offer of full vendor support, regardless of the user’s location and the device being used. One need no longer be shackled to high-performance workstations: architects, designers, engineers and visual effects (VFX) experts can access their Autodesk tools whenever and wherever work or life takes them.

The certifications for 2019 and 2020 versions of AutoCAD, Revit and Inventor can be found here and are listed as "Radeon Instinct MI25 MxGPU". 2021 certifications are coming soon.


For more resources:

  • NVv4 Microsoft GA blog: Link
  • NVv4 pricing: Link
  • Link
  • NVv4 for Education: Link
  • NVv4 for Design and Manufacturing: Link
  • NVv4 for Architecture, Engineering and Construction (AEC): Link


George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


The Oil and Gas sector places tremendous demand upon IT infrastructure. Extraction, mining, and drilling projects may cost billions, span multiple years, and are often geographically distributed. With the arrival of Microsoft Azure’s latest GPU-enabled NVv4 instances, oil and gas companies now have a new virtual desktop option that offers significant potential benefits to their workflows, productivity, and IT costs when considering how to address the breadth of their IT requirements.

At the high-end, these companies rely on some of the most demanding workloads in existence, processing massive datasets with 2D and 3D simulation and modelling software in order to plan and manage vast engineering sites, rigs, and construction projects. Combining AMD Radeon Instinct™ MI25 GPUs with up to 16GB of dedicated memory and 64-core AMD EPYC™ 7742 CPUs, NVv4 instances in Azure deliver virtual machines capable of reviewing, processing and analyzing large datasets while delivering workstation-class experiences from the Cloud.

NVv4 is a compelling new virtual desktop and workstation solution that enables geologists and engineers to prototype, scale, and adapt rapidly without the usual risks of long-term commitment or project changes that may render hardware and infrastructure decisions invalid. 

Oil and gas companies also rely on huge numbers of people using office applications, collaboration and communication software (such as Microsoft Teams, Jabber, Hangouts), PLM, SAP, and CRM systems. These applications require a small, but critical, amount of GPU processing to deliver a modern user experience. NVv4 fractional GPU capability makes it possible to support these use cases using virtual desktops, partitioning the GPU resources to satisfy performance, mobility, and budget requirements while addressing IT management and security requirements.

Let’s explore in detail some of the features and benefits NVv4 offers:

Secure Remote Access
Migrating workstation-class workloads and user access to the Cloud ensures data is centralized, managed, and secured. Application, graphics, and compute processing all take place in the data center. Users receive only a stream of display pixels, protecting against scenarios such as losing a laptop loaded with sensitive data, data loss caused by the failure of local workstations, or viruses. Azure portal-based access removes the need for VPNs and other security measures that can be compromised on an unmanaged endpoint.

Now, key user groups like geologists, engineers, and project managers can access files at the office, on-site, or while at home or traveling. Geographically dispersed teams can collaborate on files confident that data is protected and that they’re all working on a single master file. Significantly, the NVv4 portfolio offers a range of performance options that can deliver rich, modern desktop computing experiences to nearly any internet-connected device including tablets, mobile phones and PCs. These users can access and work with data while in the harshest physical environments or most sensitive political regions, all while the data remains secure in the data center.

The NVv4 instance is fully supported by Windows® Virtual Desktop, Citrix® Cloud and Teradici® Cloud access. This broad support gives IT managers the ability to choose their preferred remote protocols, management, and admin tools. This flexibility helps to mitigate the challenges of moving from a private data center to Microsoft Azure by enabling IT managers to work with familiar, preferred solutions and tools.

Shift from CAPEX to OPEX to Manage Costs
With new oil and gas projects already costing tens of billions of dollars, it is important that their associated IT operations deliver infrastructure requirements while being efficient and flexible. By shifting to a Desktop-as-a-Service (DaaS) deployment, a third-party provider like Microsoft Azure provides the IT infrastructure, tests and helps provision and deploy resources for the customer while managing all the necessary hardware in the cloud as-a-service. This makes it possible for IT operations to switch from a rigid CAPEX spending model, requiring the purchase of server hardware, to a more flexible OPEX model, renting cloud-based services on a monthly basis and adjusting as dictated by needs.

Scalability and Rapid Project Initiation

Faced with managing multiple, simultaneous projects distributed around the world, IT departments need effective and reliable tools to scale and deploy infrastructure across different settings. Azure facilitates remote troubleshooting, application updates, and delivery of security patches throughout a project’s lifecycle. Rapid scaling and management of IT resources accelerates production schedules, ensures productivity is enhanced from day one, and makes it possible to eliminate ongoing costs when projects are complete.  

Azure Guaranteed High-Availability to Reduce Costly Downtime
The scale of investment in oil and gas projects means that downtime and delay can quickly accumulate into millions lost. A virtual IT infrastructure provides redundancy, stability, and flexibility that protects against the unforeseen, from minor disruptions to significant man-made or natural occurrences. Vital data resources and applications can remain on-line and accessible to staff who can continue to work remotely and securely. Azure’s Service Level Agreements (SLAs) for VMs (Virtual Machines) typically guarantee in excess of 99.9% availability, offering organizations assurance of high availability. More information on SLA: click here
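To put an availability percentage in concrete terms, the downtime it still allows can be worked out directly. The sketch below is plain arithmetic rather than anything taken from Azure’s SLA terms: the 99.9% figure is the one quoted above, while the function name and the 30-day and 365-day periods are assumptions chosen for illustration.

```python
def allowed_downtime_hours(availability_pct: float, period_hours: float) -> float:
    """Hours of downtime that still meet an availability target over a period."""
    if not (0 <= availability_pct <= 100):
        raise ValueError("availability_pct must be between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    month_hours = 30 * 24   # assumed 30-day month
    year_hours = 365 * 24   # assumed 365-day year
    monthly = allowed_downtime_hours(99.9, month_hours)
    yearly = allowed_downtime_hours(99.9, year_hours)
    # Roughly 43 minutes per month and just under 9 hours per year.
    print(f"99.9% over a month: {monthly * 60:.1f} minutes")
    print(f"99.9% over a year:  {yearly:.2f} hours")
```

The real measurement window and any exclusions are defined by the SLA itself, so treat these numbers as a rough planning aid only.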


Secure Flexibility through SR-IOV
Microsoft Azure’s fractional GPU capability (GPU-P), unique to AMD-powered NVv4 instances, is built on the single-root input/output virtualization (SR-IOV) standard and enhances security when GPU resources are shared among multiple users in a public environment. This cloud-native, SR-IOV-based virtualization provides improved security compared to software-based GPU virtualization as it enables isolation of PCIe® hardware resources, helping prevent unauthorized access to the data of one VM by users of other VMs sharing the GPU.


License Management
The high cost of specialized software can be a barrier to increasing the number of Geologists/Geophysicists assigned to a project. Leveraging DaaS allows for concurrently licensed software to be brokered and rationalized, e.g., in scenarios where some analysts only require access occasionally. IT managers can maintain greater overall awareness of a distributed environment that may include offices and staff around the world. With greater visibility, IT administrators can better optimize usage of costly software licenses, manage costs, and widen access to precious licenses.


NVv4 GPU options - optimized resourcing
Analysts, geologists and engineers rely on a range of 3D and graphical applications to perform complex analysis, including Schlumberger Petrel E&P and INTERSECT; Halliburton DecisionSpace and Nexus; CGG GeoSoftware; Ansys Fluent; Autodesk AutoCAD; Dassault SOLIDWORKS and CATIA; Siemens NX and Teamcenter; ESRI ArcGIS; and Spatial Energy Petra. The requirements of individual users can vary considerably. Some only view specific datasets, lighter weight or 2D CAD models, while others may undertake GPU-intensive CFD simulations. The range of GPU sizes offered in the NVv4 series provides an opportunity for cost savings by enabling IT managers to adjust VMs to fit the needs of different workloads, upsizing or downsizing resources to adjust to users’ real production workloads.
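One way to picture this right-sizing is a small lookup that picks the smallest VM size whose GPU memory covers a workload’s needs. The sketch below is illustrative only. Standard_NV4as_v4 is the one size named in this post; the other three size names and all of the GPU memory figures are assumptions based on the publicly documented NVv4 series and should be checked against current Azure documentation. The example memory requirements are hypothetical.

```python
# (size name, GPU memory in GB), ordered smallest to largest.
# Assumption: names and memory figures follow the published NVv4 series;
# only Standard_NV4as_v4 is named in the post itself.
NVV4_SIZES = [
    ("Standard_NV4as_v4", 2),    # one-eighth of a GPU
    ("Standard_NV8as_v4", 4),    # one-quarter of a GPU
    ("Standard_NV16as_v4", 8),   # half of a GPU
    ("Standard_NV32as_v4", 16),  # full, dedicated GPU
]


def smallest_size_for(required_gpu_gb: float) -> str:
    """Return the smallest listed size with at least the required GPU memory."""
    for name, gpu_gb in NVV4_SIZES:
        if gpu_gb >= required_gpu_gb:
            return name
    raise ValueError(f"no listed size offers {required_gpu_gb} GB of GPU memory")


if __name__ == "__main__":
    # Hypothetical requirements, for illustration only.
    for user, need_gb in [("2D viewer", 1.5), ("3D CAD editor", 6), ("CFD analyst", 16)]:
        print(f"{user}: {smallest_size_for(need_gb)}")
```

In practice the choice would also weigh vCPU count, system memory and measured utilization, which is why sizes are best adjusted against users’ real production workloads.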

Industry Certification and Professional Graphics Drivers Included
The AMD EPYC™ 7002 Series processors have robust compatibility with virtually all software available in the market today. AMD works with the open source community and major software vendors to help ensure key industry applications and enabling software will work exceptionally well with AMD EPYC™ processors. All AMD-supported Azure instances include professional GPU drivers with no licensing cost. ISV certifications and optimizations for professional industry visualization applications, including ESRI, help to assure a reliable, productive user experience.

Matching NVv4 to Requirements:






For remotely viewing and editing massive datasets and complex 2D/3D images

For remotely viewing and editing 2D and 3D mechanical images

For general purpose Windows 10 virtual desktops and office productivity applications





Standard_NV4as_v4 or


Other resources to consider: 


George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied. Use of third party marks / logos/ products is for informational purposes only and no endorsement of or by AMD is intended or implied.


Welcome, developers, to the first in a series of blogs about AMD ROCm. I’m Terry Deem, Product Manager for ROCm. In these blogs, I will let you know about upcoming releases, features, training, and case studies surrounding ROCm. The ROCm SDK is a set of tools, libraries, and APIs for developing HPC applications using GPUs for computing. You can learn more about ROCm with this introduction video located here.

After watching the introduction video, you might want to know more about HIP. HIP is the API used to develop applications that run on either an AMD or NVIDIA GPU. This powerful API lets the same source code compile for both AMD and NVIDIA GPUs with minimal effort. If your application is already written in CUDA and you want to expand it to work on AMD GPUs, use the HIPIFY tool. This tool will automatically convert the source from CUDA to HIP.
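
To give a feel for what a CUDA-to-HIP port involves, here is a toy Python sketch that renames a handful of CUDA runtime identifiers to their HIP counterparts. This is not the HIPIFY tool and covers only a tiny fraction of what it handles; the function name `toy_hipify` and the small mapping table are illustrative assumptions.

```python
import re

# A few well-known CUDA runtime names and their HIP equivalents.
# The real HIPIFY tools cover far more of the API than this toy table.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Match whole identifiers only, so names outside the table are left untouched.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def toy_hipify(source):
    """Rename the CUDA identifiers listed in CUDA_TO_HIP (hypothetical helper)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_buf, n);\n"
        "cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);\n"
        "cudaFree(d_buf);\n"
    )
    print(toy_hipify(cuda_snippet))
```

Because so much of the HIP runtime API mirrors CUDA name for name, a large part of a port really is this mechanical, which is why an automated tool can do most of the work.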

In this blog, I am happy to announce our first set of on-demand videos on ROCm technology. You can find them below. In these videos you will learn about AMD GPUs and how to develop applications that use their compute power. You will learn how the GPU works, how threading works on it, and how to write your programs using the HIP API in the ROCm SDK.


ROCm Video Series 

1) Introduction to AMD GPU Hardware: Link 

2) GPU Programming Concepts Part 1 - Porting with HIP: Link 

3) GPU Programming Concepts Part 2 - Device Management, Synchronization and MPI Programming: Link 

4) GPU Programming Concepts Part 3 - Device Code, Shared Memory and Thread Synchronization: Link 

5) GPU Programming Software - Compilers, Libraries and Tools: Link 

6) Porting CUDA to HIP: Link 


ROCm and HIP are foundational to the applications that will run on the two Exascale systems that were recently announced, Frontier and El Capitan. You can learn more about ROCm on our documentation site located here. We are excited to see what you can do with HIP and look forward to hearing from you.




Terry Deem is a Sr. Product Manager for ROCm at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied. 


With all the excitement around the general availability of Microsoft’s Azure NVv4 instances, I wanted to reshare this MxGPU white paper that AMD’s Tonny Wong created when we first launched the SR-IOV based GPU virtualization architecture. This is a great paper for anyone wanting to understand and learn more about the underlying technology within our GPU architecture.  (Note: we have made a few updates to the paper below to keep it current.)



Originally created by Tonny Wong, Radeon Technologies Group







Virtual Desktop Infrastructure (VDI) has evolved over the last few years, enabling richer user experiences and improved manageability and deployment ease. Many traditional VDI enterprise customers have gained productivity and lowered Total Cost of Ownership (TCO) for their desktop users. The growth of VDI needs to address the needs of “greenfield” users, those organizations that want the benefits of secure hosted desktops but with a deployment model that is more consistent with their traditional desk-side workstations. These deployments need to abide by existing datacenter standards for hypervisors while leveraging capabilities that match traditional workstations.


The Trend Toward VDI

Remote graphics protocols have greatly improved user experiences, delivering the feel of a local workstation computing resource for LAN users and optimizing multimedia and graphics capabilities for WAN users. These remote protocols can deliver GPU-rendered content from the datacenter allowing Virtual Machines with standard desktop OS’s to be the main deployment method for users of all types. From demanding workstation applications with high 3D GPU needs all the way to standard enterprise desktop users who want GPU-enriched desktop experiences, this range of users can take advantage of a vast array of VDI solutions now in the market.


VDI is a great way to help improve desktop security by hosting out of an enterprise private cloud (on-premises datacenter) or via offerings from cloud service providers, either fully public or via hybrid public/private clouds.  However, the capabilities should match what users expect from their local workstation systems and not be limited to a subset of features. Enterprise VDI deployments should have access to GPU resources in the datacenter or service provider that deliver 3D capabilities across many users while still making all graphics and compute API standards available, just like on local workstation systems.


What AMD GPUs bring to the Virtual Desktop

GPU technology for VDI allows users migrating from physical workstation desktop systems or notebooks to capture the same or better graphics capabilities as their desktop workstation, with good productivity, while enabling more user types to migrate to VDI. In supporting this migration to VDI, GPU vendors need to ensure that a GPU enabled for virtualization across many users delivers deterministic performance, helping to better gauge user types and the number of GPU resources needed.


AMD has spent the last few years implementing features in our GPU hardware to prepare for virtualized platforms.  Implementation in our silicon allows our new AMD Multiuser GPU technology to share the GPU resource across multiple users or virtual machines while giving the expanded capabilities users expect from local workstations utilizing discrete GPUs. The AMD Multiuser GPU products can provide enterprise customers with a choice for their GPU and 3D processing needs that can help make GPU use more pervasive on VDI deployments.


VDI with GPUs: Lifting Performance and User Experience

With Virtual Desktop Infrastructure (VDI), one can gain the benefits of security, manageability, and remote access to deploy and support enterprise desktop users and may additionally experience lower total cost of ownership (TCO). For the knowledge worker and task worker user types, VDI deployments help apply better control of user environments while enabling increased performance by virtue of virtual machines being closer to datacenter, hosted datasets or applications.


Users who required higher computing power, specifically GPU technology for 3D and GPU compute applications, were either left on physical desktop systems or deployed with comparatively expensive pass-through GPU technology, losing the benefit of distributing the graphics card cost among multiple users.  Early virtualized GPU technologies addressed some of these areas by adapting a standard GPU architecture to virtualization via software in the hypervisor, but this isn’t the ideal way to approach true discrete GPU-like performance. Features like GPU compute functionality are not available, forcing some applications to fall back to the CPU when a desktop workstation would have leveraged a GPU.  Initial pricing for these virtualized GPU solutions was compelling compared to multiple pass-through GPU devices, but they can still cost much more than multiple desktop discrete GPUs.

Standard VDI technologies utilize software-emulated GPUs, specifically in VMware vSphere with Horizon View, where the base-level graphics capabilities are limited.  This works fine for knowledge workers, where enabling software 3D emulation with Virtual Shared Graphics Acceleration (vSGA) allows basic applications to run, albeit with higher CPU utilization. vSGA performance is further enhanced by leveraging a hardware GPU with appropriate vSGA drivers from graphics vendors. Even with hardware support, however, vSGA does not necessarily meet the requirements of more intensive 3D graphics and compute users. Application certifications (CAD/CAE, for example) are not available because of the limited level of support for graphics APIs like OpenGL® or DirectX®.


Virtualized GPUs allow workstation and power user categories to migrate to VDI with acceptable GPU performance. Workstation users from CAD/CAE, M&E and specialized segments can leverage workstation-class drivers on applicable platforms to support applications with certification requirements.  Power users who rely on desktop publishing (DTP) or internal enterprise applications that need GPU support can migrate to VDI environments.


AMD Multiuser GPU – Technology Foundation

Rather than repurposing an existing GPU and adding a software layer to accommodate virtualization requirements, AMD’s Multiuser GPU approach is to create an entirely new class of GPU architecture with virtualization capabilities built into the silicon. AMD challenged the notion that the support of GPU virtualization required a proprietary software solution. Compliant with the well-established PCIE® virtualization standard SR-IOV (Single Root I/O Virtualization) specification, AMD has implemented a hardware-based GPU architecture. The culmination of these efforts resulted in the creation of the industry’s first hardware virtualized GPU. 


The SR-IOV specification defines a virtualized PCIE device to expose one or more physical functions (PF) plus a number of virtual functions (VFs) on the PCIE bus. The specification also defines a standard method to enable the virtual functions by the system software such as the hypervisor or its delegate. These VFs may inherit the same graphics capabilities of the physical GPU, allowing each to become fully capable of supporting the GPU’s graphics functionality. Through the PF, system software controls enablement and access permissions of the VFs, internally mapping resources such as the graphics cores and GPU local memory.


The task of GPU virtualization management can therefore leverage the existing standard PCIE device management logic in the hypervisor, unburdening the hypervisor from proprietary and complex software implementations. To further simplify the deployment, an optional driver can be loaded to help the hypervisor to enable/disable virtual functions and to manage the Multiuser GPU’s resources.


The PF manages sharing of graphical resources by scheduling the GPU cores across VFs and allocating graphics memory to each of these VFs. The PF also assigns internal register spaces to each VF ensuring an orderly and structured method for the VFs to access hardware resources and data, at the same time helping keep that data secure. Because each GPU VF is designed to inherit the attributes of the physical GPU, it supports full GPU capabilities allowing the support of graphics and compute features.


When these VFs are passed through to their assigned virtual machines, they will appear as full-featured graphical devices to the virtual machine’s guest OS. Since the guest OS sees the VFs as a native graphics device, AMD’s native Radeon™ Pro™ graphics drivers, which are designed for professional graphics devices, can be loaded within the virtual machine to unlock the GPU’s graphics and compute capabilities.


A number of Radeon Pro graphics products already support passthrough mode, allowing remote users the ability to access a GPU installed on a host server from a client device. AMD Multiuser GPUs evolved this architecture to support from 1 to 16 VFs, allowing each to appear as a passthrough device with added security and quality of service. Mapping one VF to a virtual machine allows the creation of up to 16 independent guest OSs that are accelerated by a single GPU. User density is limited only by the availability of PCIE slots.
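
The PF/VF relationship described above can be pictured with a small conceptual model in Python. This is only a sketch for intuition, not driver or hypervisor code: the class names, the equal split of memory among VFs, and the `check_access` method are all assumptions made for illustration. The only figure taken from the text is the range of 1 to 16 VFs per GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualFunction:
    index: int
    mem_start: int  # first byte of this VF's slice of GPU local memory
    mem_size: int   # size of the slice in bytes

    def owns(self, address):
        """True if the address falls inside this VF's own memory slice."""
        return self.mem_start <= address < self.mem_start + self.mem_size


class PhysicalFunction:
    MAX_VFS = 16  # the text describes support for 1 to 16 VFs

    def __init__(self, total_mem_bytes):
        self.total_mem_bytes = total_mem_bytes
        self.vfs = []

    def enable_vfs(self, count):
        """Create `count` VFs, each with an equal memory slice (an assumed policy)."""
        if not 1 <= count <= self.MAX_VFS:
            raise ValueError("VF count must be between 1 and 16")
        slice_size = self.total_mem_bytes // count
        self.vfs = [
            VirtualFunction(i, i * slice_size, slice_size) for i in range(count)
        ]
        return self.vfs

    def check_access(self, vf_index, address):
        """Reject any access that lands outside the requesting VF's slice."""
        if not self.vfs[vf_index].owns(address):
            raise PermissionError(f"VF {vf_index} may not access address {address}")


if __name__ == "__main__":
    GIB = 2**30
    pf = PhysicalFunction(16 * GIB)
    vfs = pf.enable_vfs(8)
    pf.check_access(0, 0)  # inside VF 0's slice, allowed
    try:
        pf.check_access(0, vfs[1].mem_start)  # VF 1's memory, refused
    except PermissionError as err:
        print(err)
```

In real hardware the isolation is enforced in silicon rather than by a software check, but the bookkeeping idea is the same: each VF sees a full GPU personality while being confined to its own share of local memory.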

Key Benefits

Predictable Performance

A key benefit of hardware-based virtualization is that hardware-controlled scheduling cycles deliver predictable quality of service (QoS). The fixed scheduling cycles apportioned to each VF ensure that each VF receives its fair share of GPU services.
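
As a rough illustration of why fixed scheduling cycles give predictable shares, the following Python sketch hands out GPU time in fixed round-robin slices and totals what each VF receives. The slice length, the function name `fixed_slice_schedule`, and the simple round-robin order are assumptions for this example; the actual hardware scheduler is not described in that level of detail here.

```python
def fixed_slice_schedule(num_vfs, slice_ms, total_ms):
    """Return the GPU time in ms each VF receives under fixed round-robin slices."""
    received = [0] * num_vfs
    elapsed = 0
    current = 0
    while elapsed < total_ms:
        # Each VF gets its full slice, or whatever is left of the window.
        grant = min(slice_ms, total_ms - elapsed)
        received[current] += grant
        elapsed += grant
        current = (current + 1) % num_vfs
    return received


if __name__ == "__main__":
    # Four VFs sharing one GPU for one second, with hypothetical 5 ms slices.
    print(fixed_slice_schedule(4, 5, 1000))
```

With four VFs each one receives a quarter of the GPU time no matter what the others are doing, which is the property that makes capacity planning straightforward.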


Predictable performance or deterministic QoS results in smooth transitions from proof-of-concept pilots to organization-wide deployments. Pilot managers determine the capabilities of the GPU during the proof-of-concept phase and scale up or scale down user density (number of users per GPU) as required. 

Being able to determine the GPU needs of the user base ties back to an organization’s ability to forecast and plan its resources. Under-forecasting results in failing to meet users’ performance expectations; over-forecasting results in under-utilizing a configuration. The predictable nature of AMD’s Multiuser GPU solution helps avoid these unwanted outcomes.


Secure Implementation

The push towards virtualization is in part driven by the needs of centralizing and securing data and resources. The cornerstone of AMD’s Multiuser GPU technology is its ability to preserve the data integrity of virtualized desktops and their application data. The hardware-enforced memory isolation logic provides strong data security among the VFs, which helps prevent one VM from being able to access another VM’s data.


With security being a bare minimum requirement for any virtualization solution, AMD’s hardware-based virtualized GPU solution offers a strong deterrent to unauthorized users who traverse the software or application layers seeking means to extract or corrupt GPU user data from the virtual machines. Although a VF can access full GPU capabilities at its own GPU partition, it does not have access to the dedicated local memory of its sibling VFs.


Uncompromising Support for APIs and Features

The AMD Multiuser GPU technology exposes all graphics functionality of the GPU to the VF at its partition, allowing for not only full support for graphics APIs like DirectX and OpenGL but also GPU compute APIs like OpenCL™.  Code written in these standards for the physical device need not be adapted or altered to function in the virtual environment. AMD is the first GPU vendor to support hardware-based native GPU compute features within the virtual environment. Since VFs are allowed access to all of the GPU’s rendering resources during their respective time slices, there is no need to perform post-processing operations to partition data or tasks.


AMD operates on the principle of creating customer-centric designs, offering useful features and allowing customers to build usages around these features. Limits are added to control quality, not to constrain utility. Radeon Pro professional graphics, AMD’s workstation brand of graphics products, can drive up to six displays per GPU as a standard offering on select AMD Radeon Pro W-series products. Because the Multiuser GPU resides among the FirePro brand of products, the ability to drive up to six displays is an inherent feature. Multiuser GPU products extend this feature by allowing each VF to drive up to six displays within the virtual machine (note that this may be dependent on the remoting protocol and client being used).




The desire to share storage and network resources sparked innovation of technologies for these devices. The need to centralize all these resources and to secure them in a remote datacenter continues to drive the migration to virtualization. GPU virtualization is a relatively late participant in this migration, with early proprietary software-based solutions offering limited GPU capabilities. To become ubiquitous, GPU virtualization technology has to be transparent and standardized, giving users near-desktop experiences without alerting them to the fact that they are in a virtualized environment.


AMD Multiuser GPUs push GPU virtualization closer to complete transparency and ubiquity by innovating with a hardware-based solution with conformance to the virtualization industry standard, making it easy to be adopted and integrated into the existing hypervisor ecosystems.


The financial services industry is no stranger to virtualization, having already come to appreciate the advantages it offers for satisfying important IT requirements such as centralized data security, enhanced mobility, and improved disaster recovery capability. The advent of Microsoft’s new NVv4 instance for Microsoft Azure with fractional GPU capability now has the potential to make it feasible to expand the use cases, practicality, and opportunities to use virtual machines (VMs) to support finance operations.


The Virtualization Challenge

One of the barriers to broad adoption of virtualization across many more essential financial applications has been the fact that most widely used software solutions such as trading consoles and visual analytics workstations require GPU support to ensure responsive interactivity under real-time demands. Prior to NVv4, this was only possible by providing each user’s computer or workstation with access to a full, dedicated GPU in the data center. This was highly inefficient, as many applications really only require a small, but nonetheless critical, amount of GPU processing to deliver a great user experience. Thus, the approach was expensive on a per-user basis and did not sufficiently improve the maintenance burden on IT departments. The need to offer the highest level of security for these environments has further complicated the switch to virtualized topologies.


NVv4 Changes the Virtualization Equation

Azure NVv4 instances powered by AMD 2nd Gen EPYC™ Processors and AMD Radeon Instinct™ GPUs tackle these challenges. Financial services organizations can deploy cost-effective, fully cloud-based desktop environments that meet the performance, flexibility, security, and cost requirements of their critical applications. NVv4 also addresses the management requirements and security standards demanded by IT management and corporate governance. Specific benefits include:

  • AMD’s SR-IOV technologies enable IT managers to deliver the right amount of GPU service to individual desktops and workstations based on application needs while sharing a high-powered GPU among multiple users.
  • Four AMD powered NVv4 options make it possible to provide configurations that align with the particular computing workloads of different users.
  • Cloud-hosted VMs such as those on Azure keep data under control because data never leaves the datacenter; only pixel information is sent to the device.
  • With AMD’s SR-IOV-based GPU virtualization architecture, each virtual desktop is physically isolated, even when a single GPU is shared by multiple users.
  • Based in the Cloud, Azure can reduce reliance and expenditure on physical IT infrastructure such as on-premises data centers.
  • NVv4 offers instances that can support 4K displays, 60 Hz screen refresh rates, and up to four monitors.

Let’s consider just a few of the use cases that are now possible for the financial services sector.


Branch offices

Azure is centralized in the Cloud, so it enables IT departments of large financial organizations to remotely deliver and update applications and roll out security patches. This can also help IT retain greater situational awareness of their entire distributed environment, which may include hundreds or thousands of branch offices, affording improved control and compliance oversight. With greater visibility, IT administrators can better optimize usage of costly software licenses and better manage costs.


Azure supports end-users with an ultra-low-latency global data backbone that delivers a highly productive experience. The combination of AMD enterprise-grade CPU and GPU hardware with the NVv4 Windows® 10 virtual instance helps ensure optimal compression for remote protocols that can overcome local limitations in networking and bandwidth, relieving IT of the need to install and modify leased offices. As tablets and other portable devices become common in local banks, a virtualized approach makes it possible for such devices to access powerful tools, enabling staff to assist customers from convenient, comfortable locations rather than behind a bulky workstation at a fixed desk.  


Trading environments

The Windows 10 environment and key business applications such as Bloomberg, Capital IQ, FactSet, and Thomson Reuters Eikon all require GPU support to deliver the responsive, low-latency interactive experience users such as traders demand. Powered by the combination of AMD 2nd Gen EPYC processors and AMD Radeon Instinct GPUs, NVv4 instances address that challenge while providing IT managers with flexibility to choose the right-sized configuration for different types of users.  Unlike on-premises data centers, where IT managers must purchase hardware and licenses, then install and service servers, NVv4 enables IT managers to simply and quickly provision resources from the Cloud when adding new users to the workforce.


Data Security and Regulatory Compliance

Secure remote access gives financial services companies the assurance that data is locally replicated and can be backed up centrally in the data center, avoiding unmanaged endpoints.


Business Continuity and Disaster Recovery 

In today’s electronic trading environments, downtime can lead to missed opportunities and significant financial loss. If an office, municipality or large region is impacted by a natural or man-made disruption, a virtualized infrastructure can provide critical redundancy.  It can help ensure that vital data sources, compute/simulation resources, real-time analytics tools, and trading desktops remain online and accessible, enabling staff to work remotely and securely. Azure Service Level Agreements (SLAs) for VMs typically guarantee availability in excess of 99.9 percent.
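
To put an availability figure like 99.9 percent into perspective, a quick back-of-the-envelope calculation helps. The short Python sketch below converts an availability percentage into the maximum downtime it allows over a given period; the function name is illustrative, and the 99.9 figure is simply the one quoted above, not a statement of any specific SLA's terms.

```python
def max_downtime_hours(availability_percent, period_hours):
    """Maximum downtime in hours permitted by an availability percentage."""
    return period_hours * (100 - availability_percent) / 100


if __name__ == "__main__":
    HOURS_PER_YEAR = 365 * 24    # 8,760 hours
    HOURS_PER_30_DAYS = 30 * 24  # 720 hours

    yearly = max_downtime_hours(99.9, HOURS_PER_YEAR)
    monthly_minutes = max_downtime_hours(99.9, HOURS_PER_30_DAYS) * 60

    print(f"Per year: about {yearly:.2f} hours")
    print(f"Per 30 days: about {monthly_minutes:.1f} minutes")
```

At 99.9 percent, that works out to under nine hours of permitted downtime in a year, or roughly 43 minutes in a 30-day month.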


Channel Partner Access

Financial products are often sold via brokers or agents, particularly in the consumer insurance and mortgage sectors. Virtualization can allow financial institutions to provide sales channel partners with secure, limited, ring-fenced access to applications or data as needed. This is critical to maintaining compliance with FSA and GDPR legislation. Azure has a proven track record of supporting the compliance needs of enterprise, global financial services, and banking organizations.


The financial services industry faces some of the most challenging IT configuration and management issues. The flexibility of NVv4 is well worth a look by those looking to effectively streamline some of that complexity and better control costs, without sacrificing performance.


Other resources to consider:

George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


Microsoft’s announcement of its new NVv4 virtual desktop instances got me thinking about the many industries that may benefit from expanding virtualization. With fractional GPU functionality built on AMD Radeon GPUs, NVv4 suddenly makes it feasible to apply Desktop as a Service (DaaS) to use cases previously burdened with compromises. So, over my next few blogs, I’ll explore some of those industries, beginning with a favorite of mine, Education.

IT Managers in education work magic, forever balancing technical progress, rising user expectations, and, above all, cost. Microsoft Azure NVv4 is exciting because it addresses the breadth of those challenges. By making it possible to share GPU resources in a third-party, cloud-based managed data center, NVv4 enables education IT to:

  • reduce the need to invest in, manage, and upgrade expensive private data centers
  • define and scale virtual data centers to deal with the evolving demands
  • optimize usage of computing resources
  • deliver a custom-fit, great user experience to the differing needs of students and faculty
  • increase security and accessibility on- and off-campus

DaaS--The Right-Sized Approach to Education IT Needs

DaaS shares the appealing capabilities of on-premises VDI (Virtual Desktop Infrastructure), but with the massive added benefit that a third-party provider like Azure now designs, procures, deploys, and manages all the necessary hardware and VDI software. Education facilities instead rent cloud-based services on a monthly basis. 

IT operations can switch from a rigid CAPEX spending model to a flexible OPEX model, paying for only what they use. This may be the answer to the reduced demand of summer holidays, term breaks, and variations in teaching and learning hours. 

Device Flexibility

Virtual desktops are accessible from students’ own devices, regardless of technical specifications. This is possible because all performance and data are in the Cloud. Only the final info needed for display is sent to the user. This can extend the life of devices and make it possible to support affordable low-power PCs, Chromebooks, or tablets without the concern of performance or application compatibility issues. In fact, students can generally choose between Macs, PCs, or Chromebooks for courses without compatibility concerns. IT administrators can be freed from maintaining physical PCs and workstations while centralization also simplifies the management of software licenses. 

Fractional GPU with AMD Changes the Equation for Education

Until NVv4 it was only possible to choose between expensive full-GPU, high-specification VMs or non-GPU VMs. Configurations without any GPU don’t meet the demands of even a basic modern web browser. While a full GPU made sense for high-end workstation applications, that level of service was costly overkill for users of basic productivity software and collaboration who require only a small portion of a GPU to enjoy a great experience. 

GPU partitioning in Azure NVv4 instances allows IT administrators to fit the needs of application and course requirements. For example, initial undergraduate courses using SolidWorks are unlikely to have the same demanding requirements as professionals in CAD/CAM industries. An NVv4 option with 4 GB of GPU memory is usually sufficient to provide a high-quality experience at a lower cost for many engineering applications as well as Windows 10 and video streaming. Larger GPU options are also available to support heavyweight users and researchers doing more intensive CAD work or sophisticated CFD (Computational Fluid Dynamics) simulations.

The Tools for Great User Experiences

Remote display applications and protocols are key to good user experiences with VDI/DaaS in the Cloud, and NVv4 does not disappoint: it supports Windows Remote Desktop (RDP) 10, Teradici PCoIP, and Citrix HDX 3D Pro for remoting flexibility, regardless of the intended use case. The AMD Radeon GPUs also support native graphics APIs like DirectX 9 through 12, OpenGL 4.6, and Vulkan 1.1, ensuring a true graphics experience in the Cloud. AMD Radeon Pro professional graphics drivers are included license-free with all AMD GPU enabled Azure instances, with no restrictions on the number of users for multi-user Windows Virtual Desktop and Remote Desktop Session Host, providing IT departments with administrative freedom.

Addressing the Modern Education Environment

Data Security
Virtual desktop environments are essentially sandboxed and centralized, with Azure running the Hyper-V hypervisor. IT administrators no longer need to worry about the security patching of BYOD laptops and can be assured that educational resources are not abused for gaming, bitcoin mining, or accessing inappropriate material. Azure’s regions and data controls are already proven and trusted for handling sensitive research projects and data in collaboration with military, government, and industrial collaborators.

Increased Access with Virtualized Classrooms, Labs, and Distance Learning

Students can work anywhere--in libraries, residence halls, off-site, or around the globe. NVv4 helps schools overcome barriers of weather, distance, and time, and increases their capacity to widen access through online programs. Curricula can be rapidly refreshed, centrally deployed, and managed, enabling universities and high schools to provide online courses and to deploy new course materials and resources instantly. Azure’s high-availability guarantees and regional data centers provide low-latency access globally.  Courses in other time zones may also rely on Microsoft-supported infrastructure, avoiding not only the need for hardware but also out-of-hours IT support.

Support demanding graphical, collaborative and processing-intensive curricula

The new NVv4 instances are powered by the 64-core AMD EPYC 7742 CPU and the AMD Radeon Instinct MI25 GPU, with GPU sizes between 2GB-8GB available and full AMD Radeon Pro professional graphics drivers. By removing the need for students to be tied to high-performance workstations, even design, engineering, animation, and visual effects courses can be supported virtually and use professional 3D software applications including Dassault Systèmes SolidWorks and Catia; Autodesk, PTC, Siemens NX, and Adobe Creative Cloud.  NVv4 similarly delivers a great foundation for modern collaboration applications with rich media. 


I believe that NVv4 has the potential to dramatically reshape the IT landscape for education. It creates remarkable new opportunities for IT managers to better balance what have been competing demands for up-to-date technology, security, cost management, and great user experiences for faculty and students.  

Find out more

If you’d like to find out more, please visit [hyperlink] 

Additional links 


George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


3rd party advertisement

Are you interested in deploying in the cloud? Do you want to learn more about AMD-powered desktops and workstations in the cloud? Well, join us Wednesday, April 1st for a live webinar as we launch the all-new AMD-powered desktops and workstations on Azure with Workspot’s turnkey enterprise-ready cloud desktop platform. We’ll also discuss how IT organizations can quickly deploy this turnkey VDI solution so their users can work remotely, whether at home, in the office or onsite.



Watch the webinar recording now

What: Live webinar broadcast AMD-Powered Workspot workstations on Azure
Who: Hear from the following cloud experts:

  • Adam Glick, DaaS Cloud Tech Marketing at AMD
  • Kevin Raines, HPC Specialist at Microsoft
  • Brad Peterson, VP at Workspot
  • Andy Knauf, CIO at Mead & Hunt
  • Doug Dahlberg, Dir of IT at ASTI


A few of the topics examined: 

  • Why move to Azure (the Cloud)?
  • Why choose Workspot cloud desktops and the new AMD-powered offering
  • How do I quickly deploy Workspot cloud desktops & workstations on Azure to address remote working


other resources: landing page - Click here

AMD blog - Click here

MSFT blog - Click here

The information contained in this blog represents the view of AMD or the third-party presenter as of the date presented. AMD and/or the third-party presenters have no obligation to update any forward-looking content in the above presentations. AMD is not responsible for the content of any third-party presentations and does not necessarily endorse the comments made therein.


George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


Virtualized environments can pose some challenges for companies. To bring a more consistent and user-friendly experience to virtual environments, AMD and Microsoft have been working together to offer a whole new cloud experience for desktop and workstation users.

Microsoft Azure NVv4 instances are the first desktop as a service (DaaS) Virtual Machines (VMs) powered by the combination of 2nd Gen AMD EPYC processors and AMD Radeon Instinct GPUs. As of today, NVv4 is generally available to the public.


NVv4 represents a convergence of innovative technologies to make modern desktop experiences possible from the cloud. Enterprises can deploy affordable, cloud-native GPU-accelerated desktop environments that meet the performance and flexibility demands needed for high productivity of their employees. Just as important, NVv4 also offers state-of-the-art IT management tools to help drive success of IT organizations.


How is this possible? NVv4 instances are built on three fundamental pillars to enable cloud-native modern desktop and workstation experiences.


GPU-Accelerated Performance

Today’s digital workforce relies on modern applications. Modern applications are built with GPU acceleration at their core. From the most powerful 3D design tools, to common office productivity tools, and even web browsing, everyday applications are designed to require or benefit from graphics acceleration support built in. In other words, virtual machines without GPU acceleration will often struggle with some of the most common desktop tasks.


As the first VMs on Azure to take advantage of AMD’s SR-IOV technology to enable GPU partitioning, NVv4 provides IT decision-makers with four VM options calibrated to meet the variety of use cases in the modern workplace. Whether they are a professional running a workstation-class design application or support staff using Microsoft Office 365, all users receive the performance and reliability of 2nd Gen AMD EPYC processors and Radeon Instinct GPUs. ISV certifications and optimizations for professional 3D applications further reinforce the user experience.


Support for the latest Windows 10, Windows Server and Windows 10 Enterprise multi-session operating systems provides IT with the flexibility to specify single- or multi-session configurations as needs dictate. Even when the GPU is partitioned, the individual user’s experience is indistinguishable from the experience of a locally installed GPU to which they are accustomed.


IT managers can continue to rely on the traditional remote protocols, management, and administration tools they prefer. NVv4 instances are fully supported by Windows Virtual Desktop, Citrix Cloud, Teradici Cloud Access and Workspot Cloud VDI so the migration to Azure is both smooth and familiar.

“The flexibility that Azure NVv4 with AMD-powered GPU partitioning provides for users to share and access GPU resources as needed is a valuable feature that we see will benefit many Teradici customers. We are excited to be working with Microsoft and AMD to enable more flexible, cost-effective GPU options for virtual desktop and virtual workstation use cases such as AEC.”

– Ziad Lammam, Vice President of Product Management at Teradici

Uncompromised Security

Security is at the core of nearly every IT conversation. In an infrastructure where resources are shared across users and services, companies need to be confident that individual users’ data is fully protected.


Security runs deep into the hardware of AMD-powered Azure environments. While traditional GPUs rely on software techniques for security in virtualized environments, NVv4 is powered by SR-IOV-based GPU virtualization, enabling isolation of PCIe hardware resources to prevent unauthorized access to the data of one VM by users of other VMs. Each VM can only access the physical resource that has been allocated to it, so each VM is isolated from the others at the hardware level, even when a single GPU is shared by multiple users. SR-IOV is recognised and established in the industry as one of the key standards for resource isolation – that’s why Microsoft is including this technology as part of its comprehensive plan to keep its customers safe and protected when virtualised.

"The diversity of the new AMD-based Workspot cloud desktops on Microsoft Azure is a huge deal for us. Based on the application requirements of each engineer, we can dedicate all or a fraction of the AMD GPU to their Workspot workstation on Azure. This finer resolution of control gives us the financial edge we need to move more people to Workspot cloud desktops on Azure and increase our overall productivity."

– Eric Quinn, CTO at C & S Companies

Cloud-like Affordability

One of the biggest promises of cloud is that businesses can reduce their cost by renting exactly what they need. Yet for businesses looking to deploy GPU-accelerated VMs, this was not possible. Prior to NVv4, users could only choose between more expensive full-GPU VMs or non-GPU VMs. Even if the user didn’t need the entire performance headroom of a full GPU, they would be required to rent it. While the cost of a full GPU could be justified for the highest-end workstation workloads, most desktop experiences need a fraction of the GPU for optimal experience.


One of the key benefits of AMD-powered GPU partitioning in Azure is the ability to deliver fractions of a GPU at more affordable price points. Four AMD-powered NVv4 options are available to IT managers, making it possible to provide virtual desktop configurations that perfectly meet the particular computing workloads of different users. NVv4 instances can deliver GPU-powered desktop experiences that enable the GPU to be configured to be used by eight, four, two, or a single user as dictated by their application needs.

“As more organizations start migrating Citrix workloads to Microsoft Azure, they want to ensure that they’re delivering that same level of experience as their previous on-prem deployments. We’re excited to be partnering with AMD and Microsoft with the release of the NVv4 instance, as this ensures organizations can deliver graphically accelerated Citrix Workspaces with superior user experiences while also optimizing their costs.”

– Nitin Sharma, Sr Product Marketing Manager for Workspace Services at Citrix

Promises Fulfilled

AMD CPU- and GPU-powered NVv4 instances are the first GPU-accelerated virtual desktops for Azure. They help businesses balance the demand for productivity, the absolute requirement for security, and the ever-present pressure to manage costs, all while providing users with an adaptable, flexible, high-performance cloud-based work environment that addresses the breadth of expectations of the modern workplace.


Businesses interested in assessing and testing DaaS environments for their operations can work with Microsoft partners like Cloud Jumper and Workspot, whose professional and experienced teams can help assess business needs every step of the way, from POC to deployment and migration.


Find out more:

George Watkins is a Product Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


A few months ago at Microsoft Ignite in the AMD booth, I had the opportunity to showcase the first GPU partitioned and shared instances (NVv4) available for Microsoft’s Azure cloud featuring the AMD Radeon Instinct MI25 accelerator, along with AMD’s other EUC (End User Computing) and data center products. News about the Microsoft and IGEL partnership relating to WVD (Windows Virtual Desktop) also attracted interest from our Cloud, Citrix and related customers. Although WVD has been available in preview, no Linux-based WVD client had been available which resulted in increased interest in the IGEL offering. And at the recent Disrupt 2020 event, IGEL announced the first Linux client to support WVD. The Microsoft SDK that makes this integration possible has the potential to enable other thin-client vendors to offer their own solution.


While the use of AMD CPUs and server GPUs is well-known, AMD is also a major player in providing the CPU and graphics/GPU hardware within many of the most popular thin clients.


The joint IGEL and Microsoft announcement was particularly satisfying for me as it heavily featured IGEL’s flagship UD7 client, targeted at graphical use cases, which is built around AMD technologies. For example, the technical specifications for the UD7 client feature the AMD Embedded RX-216GD 1.6 GHz (Dual-Core) up to 3.0 GHz (boost mode) system on a chip (SoC). With the option for an additional graphics card, the AMD Embedded Radeon™ E9173 discrete GPU can extend the UD7 to support the simultaneous use of up to four digital monitors at 60 Hz by DisplayPort (two in 4K and two in 2K). The flagship UD7 client also features IGEL’s latest security enhancements, a benefit for scenarios when security is a concern for thin clients.


Last week at IGEL Disrupt Munich, a new version of the UD3 client was announced. The UD3 is supported by a specially optimized AMD Ryzen Embedded R1505G that uses less power (about 10 watts), features hardware optimizations for PCoIP (PC over IP) Ultra, and leverages the AMD Secure Processor feature checks to help assure the UEFI is signed by IGEL. Availability is expected in May 2020. In the meantime, information is available in the specifications and in IGEL solution architect blogs, including a blog by Fredrik Brattstig.


My role at AMD is largely associated with evaluating the performance of our Data Center and Cloud products, including AMD Radeon Pro V340 and AMD Radeon Instinct MI25 server GPUs. The evaluations are conducted within the context of the protocols and EUC/VDI environments used in scenarios featuring Azure, RDP, Citrix, VMware, and Teradici. Most remoting protocols have a feature often referred to as “back-pressure”: a process whereby the end-client is aware of whether it is keeping up with the server frame rate and alerts the server accordingly. There’s no point churning out frames if the end-point can’t handle the rate, so it’s important to have a suitably powerful end-point, which can become the most significant factor in the overall user experience. IGEL, supported by AMD solutions, has proved very popular. You can learn from IGEL about the use cases and features of the UD3 and UD7.
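To make the back-pressure idea concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the behaviour of any particular remoting protocol: the class name `FrameRateController`, the 10% recovery step, and the shape of the client report are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FrameRateController:
    """Toy server-side controller that follows client back-pressure reports."""

    target_fps: float = 60.0   # rate the server would like to send (assumed)
    min_fps: float = 5.0       # floor so the session never stalls (assumed)
    current_fps: float = 60.0  # rate the server is sending right now

    def on_client_report(self, frames_sent: int, frames_displayed: int) -> float:
        """Adjust the send rate from one client report and return the new rate."""
        if frames_sent > 0 and frames_displayed < frames_sent:
            # The client is falling behind: scale back to what it displayed.
            ratio = frames_displayed / frames_sent
            self.current_fps = max(self.min_fps, self.current_fps * ratio)
        else:
            # The client kept up: creep back toward the target rate.
            self.current_fps = min(self.target_fps, self.current_fps * 1.1)
        return self.current_fps


controller = FrameRateController()
# The client displayed only half of the frames it was sent, so the server backs off.
slow = controller.on_client_report(frames_sent=60, frames_displayed=30)
# The client now keeps up, so the server recovers gradually.
recovered = controller.on_client_report(frames_sent=30, frames_displayed=30)
```

In this sketch a weak end-point holds the whole session at the lower rate, which is the reason the client hardware can dominate the user experience.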


The IGEL and Microsoft partnership, plus WVD support along with the AMD-enabled NVv4 Azure instances, were all featured by the independent blogger Bas Van Kaam. The recommended blog offers a suitable summary of Ignite and can be found here.


Now that these major events have concluded, I’m eager to get back in the AMD lab to “kick the tires” of WVD and the NVv4 Azure instances with the WVD supported IGEL UD7. My goal is to blog about my findings, but I’m eager to discover others’ experiences with thin clients, especially if there are additional factors for consideration. If you want to try out NVv4 with WVD, I recommend a useful video guide available from Microsoft’s Stefan Georgiev on  YouTube.


Recommended Links


Joe DaSilva is a Cloud Graphics Solutions Architect for AMD. His/her postings are his/her own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


AMD GPUs deliver the first shared GPU instances for Microsoft Azure – NVv4 instances

Today, the first Azure instances utilising GPU partitioning technology became available. These instances effectively enable a large server GPU to be partitioned, supplying VMs with an appropriately sized GPU, and opening the way for potential savings in the cost for GPU-enabled cloud VMs.

Key to adoption of AMD GPUs by Microsoft Azure was the alignment of our SR-IOV based MxGPU hardware-sharing technologies to Microsoft Hyper-V’s own GPU-P technologies. This is clear validation of our strategy at AMD to work with Microsoft over many years to align with their roadmap resulting in the first GPU sharing solution on Azure, acceptable in terms of user segregation, security features and quality of service. Our virtualised GPU sharing technologies have already been proven with other hypervisors including VMware ESXi and the Citrix Hypervisor (XenServer). This is however the first time GPU sharing has been enabled for a Hyper-V based platform with Azure.

The result is a portfolio of instances leveraging both AMD CPUs and GPUs that are sized to the realistic needs of users, ranging from smaller instances that align to the needs of office workers or mobile CAD workstations (2 and 4 GB equivalent GPU resource) to larger instances that can support heavier graphical needs and needs such as session sharing. AMD professional GPU drivers are offered free along with these instances.

Size                  vCPU   Memory    GPU memory   Azure network
Standard_NV4as_v4     4      14 GB     2 GB         50 Gbps
Standard_NV8as_v4     8      28 GB     4 GB         50 Gbps
Standard_NV16as_v4    16     56 GB     8 GB         50 Gbps
Standard_NV32as_v4    32     112 GB    16 GB        50 Gbps
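As a rough illustration of how these sizes map onto a shared GPU, the Python sketch below picks the smallest instance that meets a set of requirements and works out how many VMs of each size would share one GPU. The sizes are copied from the table above. The helper names and the assumption that the 16 GB size corresponds to one whole GPU are illustrative and not taken from Azure documentation.

```python
from typing import Optional

# Sizes copied from the table above: name -> (vCPUs, RAM in GB, GPU memory in GB).
NVV4_SIZES = {
    "Standard_NV4as_v4": (4, 14, 2),
    "Standard_NV8as_v4": (8, 28, 4),
    "Standard_NV16as_v4": (16, 56, 8),
    "Standard_NV32as_v4": (32, 112, 16),
}

# Assumption: the largest size corresponds to one whole GPU.
FULL_GPU_GB = 16


def vms_per_gpu(size: str) -> int:
    """How many VMs of this size fit on one GPU under the assumption above."""
    return FULL_GPU_GB // NVV4_SIZES[size][2]


def smallest_fit(vcpus: int, ram_gb: int, gpu_gb: int) -> Optional[str]:
    """Return the smallest size meeting all three requirements, or None."""
    for name, (c, r, g) in sorted(NVV4_SIZES.items(), key=lambda kv: kv[1]):
        if c >= vcpus and r >= ram_gb and g >= gpu_gb:
            return name
    return None


# A user who needs 4 vCPUs but 16 GB of RAM is pushed up to the next size.
choice = smallest_fit(vcpus=4, ram_gb=16, gpu_gb=2)
```

Under that assumption the smallest size shares a GPU eight ways and the largest size has the GPU to itself.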


Initially, NVv4 instances will be available early next year in the South Central US and West Europe Azure regions.


Sign-up for preview using this link:


AMD Technology enables Microsoft Azure

At Ignite 2019, Microsoft announced the preview. The interest it attracted, both in the booth and in the End User Computing (EUC) and similar communities, was fantastic, and it was great to speak to so many users about their enthusiasm for the options. I was cheered to see a blog by cloud community expert Marius Sandbu that covered the announcement and also caught the spirit of what we had hoped to convey.


Useful Links:


AMD at Microsoft Ignite


George Watkins is a Datacenter GPU Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


AMD based Microsoft Azure virtual desktops deliver a workstation-class experience in the Cloud


Autodesk University is the place to be for professional architects, designers, engineers, and media creators. Of course, AMD will be there, returning as a Gold Sponsor of this important event to provide demonstrations of our most powerful desktop processors and graphics cards, to discuss your biggest challenges, and to reveal the latest technology innovations that enhance the workstation experience.


Taking centre stage in our booth, AE310, will be live demonstrations of the Microsoft Azure stack, leveraging the new NVv4 instances. This is the first Windows Azure virtual desktop to be supported by both 2nd Gen AMD EPYC processors and Radeon Instinct MI25 GPUs. If you are one of those people who designs, makes and builds the world around us and relies on the highest performance from applications like Autodesk to make things happen, then you owe it to yourself to learn more about the NVv4 instance.


Wondering what NVv4 stands for? “N” = GPU Accelerated VM family in Azure. “V” = Visualization. “4” = Generation 4 – which means the NVv4 is the current latest generation of GPU-enabled virtual desktops services from Azure.


Be more productive and collaborate by extending workstations to the Cloud

Modern-day designers, architects, and engineers demand the most of their critical tools. Whether in the office or at home, traveling or onsite, they need a workstation-class experience that provides flexibility and reliability no matter where in the world a project might take them. The NVv4 virtual desktops bring the full power of a traditional workstation configuration to bear whenever and wherever it’s needed. AMD GPU-enabled NVv4 virtual desktops make it possible to finally overcome the difficulty of balancing performance, mobility, and cost when addressing traditional Architecture, Engineering, and Construction (AEC) workloads.


Just what are Microsoft Azure NVv4 instances?

The NVv4 is a new, virtual desktop solution in Microsoft Azure that takes advantage of SR-IOV technologies (Single-root input/output virtualization) to introduce, for the first time, GPU-partitioning (or GPU-P). This gives customers maximum flexibility and choice by providing dedicated CPU/GPU-supported virtual desktops that best suit their workloads and price points. In fact, NVv4 will offer four distinct instance options to choose from, scaled to share a single GPU’s resources among as many as eight Virtual Machines. 


Alternatively, IT managers can maximize the user density of NVv4 with Windows 10 EVD, supported by Windows Virtual Desktop and available plug-ins from Citrix and Teradici. Anyone interested in trying the NVv4 experience for themselves can do so by signing up for AMD’s customer preview.


What will AMD be showing at AU?

Throughout Autodesk University, we will be showcasing our preliminary test environment, based on the planned NVv4 hardware and software stack, available in Microsoft Azure. You will get the chance to see a variety of the latest Autodesk applications for AEC and CAD workloads. AU19 will be a great opportunity to speak to the AMD team and explore how AMD-enabled virtual desktops in Microsoft Azure may help your organization. 

George Watkins is a Datacenter GPU Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


AMD technology makes GPU enabled virtual desktops possible across the entire Enterprise!  

Talk about being in the right place at the right time! My first opportunity to participate in the Microsoft Ignite conference, in 2019, promises to set a new high-water mark for impactful demonstrations, learning opportunities, and meaningful collaboration between AMD technology and the Microsoft ecosystem.

This year the AMD booth will be packed with technologies and demonstrations of many of the latest AMD solutions with Microsoft, including the latest high-performance laptops and virtual desktops. For me though, the highlight at Ignite is the exciting news around Microsoft Azure NVv4 instances: the first Windows Azure virtual desktop supported by 2nd Gen AMD EPYC™ processors and Radeon Instinct™ GPUs.

Wondering what NVv4 stands for? “N” = GPU Accelerated VM family in Azure. “V” = Visualization. “4” = Generation 4 – which means the NVv4 is the latest generation of GPU-enabled virtual desktop services from Azure.

Modern day applications want more

This is an important distinction because many modern productivity applications like Office 365, video conferencing and web browsing are designed to harness the GPU to deliver the best possible application experience. Many non-GPU VMs however struggle to deliver that experience while previous GPU-accelerated VMs could only be configured, and priced, to deliver a full GPU as a workstation experience – making them too costly for everyday users.   

Re-evaluate GPU enabled Virtual desktops

The introduction of AMD-powered NVv4 instances is shifting the expectations for VM deployments and is sure to have IT managers taking note. What’s changed? Well, the NVv4 instance is the first VM on Microsoft Azure to take advantage of SR-IOV technologies (Single-root input/output virtualization) and introduces GPU partitioning across four new options. This gives customers greater flexibility, enabling the entire enterprise to enjoy dedicated CPU/GPU virtual desktops, delivering the best application experience regardless of the workloads. In fact, NVv4 will offer four distinct instance options to choose from, scaled to share a single GPU’s resources among as many as eight Virtual Machines. Alternatively, IT managers can maximize the user density of NVv4 with Windows 10 multi-session, supported by Windows Virtual Desktop with available plug-ins from Citrix and Teradici. Anyone interested in trying the NVv4 experience for themselves can do so by signing up for the customer preview.

Attending Ignite?

During Ignite, there will be a great opportunity to speak to our team about the benefits of all the AMD-supported Azure instances and to sign up for the NVv4 customer preview at AMD booth #249. If you want to learn more about the technologies powering NVv4, you might like to join these AMD sessions: Technical (BRK1114, Thursday 7th Nov, 11:30am), Hub (THR1086, 9am, Tuesday 5th Nov) and an NVv4 dedicated session by Microsoft (BRK3121), if you are lucky enough to be there in person.

From Azure to Windows, we love Microsoft! Come visit AMD  Booth #249 and experience all of our technology demonstrations and discuss how we can address your business needs!


In virtualization, single-root input/output virtualization (SR-IOV) is a specification that allows the isolation of PCI Express resources between different users. It is already the standard used to share networking resources (NICs) and secure network traffic. Each resource has associated Virtual Functions (VFs), and each VM (Virtual Machine) can only access the physical resource via its own allocated VF.

The AMD MxGPU (GPU sharing technology) is the industry’s first SR-IOV based GPU sharing technology designed for cloud and datacenter. So why did we choose SR-IOV?

  • Industry standard. SR-IOV is the long-established industry standard for virtualising PCIe devices. As such, the standards are openly scrutinised for security.
  • The isolation provided by VFs helps ensure each VM is isolated from the others, e.g. memory is secured and not shared.
  • We believe SR-IOV is a base technology that will allow for scalability and higher user densities in the long term, as it minimises context-switching overheads.
  • Stability and reliability. SR-IOV allows us to provide each VM with its own dedicated share of a GPU and it does not compete with other users, helping ensure the resource available is consistent and the same; users can avoid the unreliability associated with noisy neighbours and experience deterministic QoS.
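The isolation and fixed-share properties listed above can be pictured with a small Python model. This is a conceptual sketch only: real SR-IOV isolation is enforced by the PCIe hardware and the hypervisor, not by application code, and the class `SharedGpu` and its methods are hypothetical names invented for this illustration.

```python
class SharedGpu:
    """Toy model of one physical GPU exposing a fixed number of virtual functions."""

    def __init__(self, memory_gb: int, num_vfs: int) -> None:
        # Each VF gets an equal, fixed slice that never changes with load.
        self.slice_gb = memory_gb // num_vfs
        # VF index -> owning VM name (None while the VF is unassigned).
        self._owner = {vf: None for vf in range(num_vfs)}

    def assign(self, vm: str) -> int:
        """Give the VM the first free VF; raise if the GPU is fully allocated."""
        for vf, owner in self._owner.items():
            if owner is None:
                self._owner[vf] = vm
                return vf
        raise RuntimeError("no free virtual function")

    def access(self, vm: str, vf: int) -> int:
        """Return the slice size if the VM owns this VF; otherwise refuse."""
        if self._owner.get(vf) != vm:
            raise PermissionError(f"{vm} does not own VF {vf}")
        return self.slice_gb


gpu = SharedGpu(memory_gb=16, num_vfs=8)
vf_a = gpu.assign("vm-a")
vf_b = gpu.assign("vm-b")
own_slice = gpu.access("vm-a", vf_a)   # vm-a reaches only its own slice
```

Calling `gpu.access("vm-a", vf_b)` raises `PermissionError`, mirroring the rule that a VM can reach the physical resource only through its own VF. The slice size is fixed at construction, which is the model's stand-in for deterministic QoS.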

SR-IOV: a technology that has evolved with and for cloud

Back in 2009, veteran blogger Scott Lowe wrote an introduction to SR-IOV predicting it would become mainstream; it’s great context for the environment and technology of the time. Whilst we could have accelerated to market using a bespoke proprietary memory management unit (MMU), we instead chose to work with the major hardware, hypervisor and operating system vendors to evolve the technologies to an industry-wide fit for our long-term needs.

The evolution of SR-IOV was carefully managed, and in 2016 AMD was able to release the world’s first SR-IOV based GPU sharing solution for cloud and virtualisation. Beyond the obvious security and quality benefits of aligning to the core technology, the standards offer potential long-term scalability that a bespoke implementation wouldn’t have offered us.

We are seeing increasing rewards from this approach now, as other vendors -- particularly Microsoft -- have placed SR-IOV at the core of their technologies and infrastructure. This alignment has streamlined our joint projects, leading to the announcement of MxGPU in the Azure cloud to enable cost-effectively sized and priced GPU-enabled VMs. (You can register interest with Microsoft in the release availability, here.) MxGPU SR-IOV support is also available and proven for Citrix XenServer, XenDesktop and XenApp, VMware ESXi, Horizon View and open source KVM. Read more, here.

SR-IOV and MxGPU at Ignite

Our product management team will be at Microsoft Ignite (4-8 November), and you can find us on booth #249. You might also like to join these AMD sessions: technical session (BRK1114, Friday 8th Nov, 9am) and hub session (THR1086, 9am, Tuesday 5th Nov) if you are lucky enough to be there in person.

Learn More

  • Microsoft’s strong commitment to, and investment in, integrating the SR-IOV standards into the core of platforms such as Windows and Hyper-V is significant, and Microsoft has published extensive information on this approach, including overviews and architectural deep-dives.
  • Our hypervisor and virtualisation partners have also been investing in core SR-IOV technologies, as well as releasing information as to the benefits and reasons for this approach. In September 2018, Citrix released XenServer 7.6; the release notes are available to read and, amongst other features, cover Citrix’s and XenServer’s adoption of SR-IOV for networking (NICs – Network Interface Cards).

The SR-IOV standard

The SR-IOV standard is controlled and maintained by the PCI-SIG foundation. The regulation and scrutiny of the standard is maintained with cross-industry membership and funding, alongside a compliance programme and certified integrator list.

MxGPU: more than SR-IOV

Of course, there is more to MxGPU than SR-IOV; it is just one of the core technologies on top of which we have built our GPU sharing and virtualisation products. We are, however, pleased that we were the first vendor to achieve GPU sharing with the SR-IOV ‘gold standard’.


There have been numerous opinions offered from all corners of the gaming community about the impact of Google Stadia. Gaming and business journalists, bloggers, and avid gamers all have opinions to share. And while a few revert to familiar hardware “spec” comparisons to gauge the value of new technology, the introduction of Google Stadia is clearly about much more. Google Stadia marks an evolution of the gaming landscape that’ll rapidly reshape the industry.


In the short time since Stadia was announced, several themes have emerged that will likely drive increased cloud gaming adoption.

  • Consistent Premium Performance
  • Transparent Maintenance
  • On-Demand Gaming & Social Integration
  • Device Access & Mobility
  • Cloud gaming value chain
  • Subscription based services

While this discussion is primarily based on Google Stadia, many of the value propositions introduced can be applied more generally to cloud gaming services such as Microsoft xCloud and Sony’s PlayStation Now.


Below we’ll introduce each theme and in future blogs dive deeper into each to explore their impact on the industry.


Consistent Premium Performance

Performance and hardware specs will continue to drive conversation near term for two reasons: the industry is familiar with them, and they can be measured. While this understanding is important, moving forward the conversation will likely shift to focus on delivering a consistent premium experience. Stadia allows gamers to reconsider the entirety of the gaming experience and the context within which we view performance.


The choice of custom AMD “Vega”-based GPUs as a starting point for this service launch reflects Google’s strong commitment to what makes gamers happy and a deep understanding of what makes datacenters tick. Gaming is a part of the AMD DNA, delivering high performance GPUs for the latest game consoles, high-end gaming PCs, and the datacenter.  The AMD “Vega”-based GPUs for Stadia are a proven platform featuring 56 compute units, up to 10.7 teraflops, integrated HBM2 memory, and with the Vulkan® high-performance real-time 3D graphics API part of the driver. That’s easily more power than the top two previous generation consoles combined and a foundation for success that can deliver a next-generation console experience today1.

But for the player, all that matters is the experience, which at resolutions up to 4K and 60 frames per second, with HDR and surround sound, promises to be fantastic and substantially better than what many gamers enjoy today.
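As a rough sanity check on the figures quoted above, peak single-precision throughput for a GPU of this kind is commonly estimated as compute units × stream processors per compute unit × 2 operations per clock × clock speed. The Python sketch below assumes 64 stream processors per compute unit and counts one fused multiply-add as two operations. Both are assumptions stated for illustration, and the clock it derives is an implied value, not a published Stadia specification.

```python
COMPUTE_UNITS = 56            # from the text above
PEAK_TFLOPS = 10.7            # from the text above
LANES_PER_CU = 64             # assumption: stream processors per compute unit
FLOPS_PER_LANE_PER_CLOCK = 2  # assumption: a fused multiply-add counts as two ops

# Floating-point operations the whole GPU can issue per clock cycle.
FLOPS_PER_CLOCK = COMPUTE_UNITS * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK


def peak_tflops(clock_ghz: float) -> float:
    """Peak throughput in TFLOPS for a given clock in GHz."""
    return FLOPS_PER_CLOCK * clock_ghz / 1000.0   # GFLOPS -> TFLOPS


def implied_clock_ghz(tflops: float) -> float:
    """Clock in GHz that would be needed to reach the given peak TFLOPS."""
    return tflops * 1000.0 / FLOPS_PER_CLOCK


clock = implied_clock_ghz(PEAK_TFLOPS)
```

Under these assumptions the quoted 10.7 teraflops corresponds to a clock of roughly 1.49 GHz.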


Transparent Maintenance

How many times has a user tried to launch a game only to be met with a time-consuming multi-gigabyte patch? With cloud gaming, software maintenance happens in the background, transparent to the user. In addition, the centralized design of Stadia means users will not have to worry about hardware upgrades. The datacenter can be upgraded to keep pace with changing requirements, again transparent to the user. In short, more play, less hassle.


On-Demand Gaming and Social Integration

Stadia will enable the ~200 million people who watch game-related content such as trailers and live streams on YouTube to lean into their enthusiasm and join the action with just a tap on their phone, tablet, or computer. Social integration allows for instant broadcasting, archiving, and sharing of your and your team’s latest achievements. E-sports fans and stream audiences can simply click a link on their favorite social media site and instantly launch into the latest titles.


Game downloads are a thing of the past. Like music and movies before them, many games are now available “on-demand”.


Device Access and Mobility

Google Stadia delivers the AAA gaming experience to the widest audience. That means great games, streamed via standard Internet connections, to a variety of devices, and all while enhancing the social aspects of and accessibility to those experiences to better match the preferences of today’s consumer.  


This vision is made possible by shifting focus of the gaming world to the datacenter. The organizing principle of gaming is the datacenter rather than the individual’s device. Google’s 7500 edge nodes worldwide will put powerful gaming hardware essentially everywhere and in reach of virtually everyone.

With cloud gaming, if you need to take your gaming on the go, there is no need to start over. You can simply save state on your home theater or Chromebook and pick up seamlessly on your mobile device. That flexibility promises to change how players weave gaming into their everyday lives.

Evolving Business Model 

The transition of gaming to the Cloud will impact many companies including console providers, game developers, and publishers. Traditionally, publishers have had a variety of platform options on which to distribute their game titles and reach their audience. One challenge they have faced however is the large fees required to gain access to each distinct platform. The introduction of new, high-performance cloud platforms like Stadia gives more choice for the game publishers.

Another interesting consideration which Stadia has introduced for many game developers and publishers is access to nearly unlimited resources to build their games on. In the past, console hardware has tended to follow a slower refresh cycle than gaming PCs. As a result, AAA games that appeared later in a console cycle had to be developed to support both older console technologies and more recent platforms. The resource demands sometimes restricted what a game developer could create. It could be proposed that, in cloud gaming, the datacenter is the console; better still, it can be continuously updated to maintain the highest levels of performance, removing the need to buy the latest GPUs. By default, developers have access to the best gaming platform for their next blockbuster title.

Manufacturers building game-specific hardware including consoles have also recognized the potential of cloud gaming. They can see a future where they have an opportunity to shift their efforts away from developing hardware with costly components and fighting expensive PR battles centered on hardware superiority, and instead drive wholeheartedly at creating the best games environment. Datacenter-based gaming provides a new, more cost-efficient and sustainable direction that can consolidate and balance costs. It means the business of games can stop competing on specs and instead compete on content. That’s something every gamer can appreciate.

Subscription Based Gaming

Google Stadia breathes new life into the gaming conversation, triggering a dialogue about the liberation offered by cross-platform play, blurring the lines between gameplay viewers and players, and establishing a flexible infrastructure that adapts to the innovation of developers.

This is a conversation I'm excited to continue over the coming months.

George Watkins | Marketing Manager | Datacenter GPU BU

These views are my own and do not reflect those of AMD.

©2019 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Thunderbolt is a trademark of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.



1) 4th September 2019, based on PS4 Pro GPU performance (4.2 TFLOPS) and Xbox One X GPU performance (6 TFLOPS) compared with Google Stadia GPU performance (10.7 TFLOPS).



[Originally posted on 11/06/18]

Today in San Francisco, California, AMD held a special event where we announced the newest additions to the Radeon Instinct™ family of compute products. The AMD Radeon Instinct™ MI60 and Radeon Instinct™ MI50 accelerators are the first GPUs in the world that are based on the advanced 7nm FinFET process technology. The ability to go down to 7nm allows us to put more transistors on to an even smaller package than was possible before – in this case, the MI60 contains 13.2 billion transistors on a package size of 331.46 mm², while the previous generation Radeon Instinct™ MI25 had 12.5 billion transistors on a package size of 494.8 mm² – a 58% improvement in number of transistors per mm². This allows us to provide a more powerful and robust product, capable of tackling a wide range of workloads from training and inference, to high performance computing.
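
The 58% figure can be checked directly from the numbers quoted above. The following is a quick Python sketch of that arithmetic; the function name is purely illustrative and not part of any AMD tool.

```python
def transistor_density(transistors_billions: float, area_mm2: float) -> float:
    """Return density in millions of transistors per square millimetre."""
    return transistors_billions * 1000.0 / area_mm2


# Figures quoted in the paragraph above.
mi60_density = transistor_density(13.2, 331.46)
mi25_density = transistor_density(12.5, 494.8)

# Relative gain of the MI60 over the MI25, in percent.
improvement_pct = (mi60_density / mi25_density - 1.0) * 100.0

print(f"MI60: {mi60_density:.1f} M transistors/mm^2")
print(f"MI25: {mi25_density:.1f} M transistors/mm^2")
print(f"Improvement: {improvement_pct:.0f}%")
```

That works out to roughly 39.8 versus 25.3 million transistors per mm², which rounds to the 58% quoted.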

Supercharged Deep Learning Operations – Ideal for Training and Inference

We’ve made numerous improvements on these new products, including optimized deep learning operations. In addition to native half-precision (FP16) performance, the MI60 and MI50 now support INT8 and INT4 operations, delivering up to a whopping 118 TOPS of INT4 peak performance on the MI60. The supercharged compute capabilities of these new products are designed to meet today’s demanding system requirements of handling large data efficiently for training complex neural networks and running inference against those neural networks used in deep learning.


World’s Fastest Double Precision PCIe® Based Accelerator

On the other end of the compute spectrum are FP64 calculations primarily used in high performance compute workloads. These types of workloads require extreme accuracy and speed, which the MI60 and MI50 deliver. The Radeon Instinct MI60 is the fastest double precision PCIe® based accelerator [1], delivering up to 7.4 TFLOPS of FP64 peak performance, while the MI50 is not far behind at 6.7 TFLOPS. In addition to fast FP64 performance, the MI60 and MI50 both sport full-chip ECC memory [3] as well as RAS [4]. This allows scientists and researchers across several industries including life sciences, energy, automotive and aerospace, government and more to achieve results with both speed and accuracy.


Finely Balanced, Ultra-Scalable Datacenter Solution

Most of the improvements we’ve talked about so far have been at the chip level, but we didn’t stop there. We also have a number of new benefits found beyond the chip. We meticulously designed the MI60 and MI50 to deliver finely tuned and balanced performance. We took a look at some of the common bottlenecks found in previous generations and made improvements to ensure your data is processed in the most efficient manner possible. This includes making these cards PCIe® Gen 4* capable, delivering up to 2x more bandwidth (64 GB/s vs. 32 GB/s) than PCIe® Gen 3 when communicating over the bus. In addition to improved performance between GPU and CPU, we’ve also built into these products a peer-to-peer GPU communication feature called Infinity Fabric™ Link technology. Each card includes two physical Infinity Fabric™ Links, allowing you to directly connect four GPUs together in a GPU hive ring and up to two of these hives in an 8 GPU server. Each GPU card provides up to 200 GB/s bandwidth between peer GPUs, which is up to 6x faster than PCIe Gen 3 alone [2]. We have also doubled memory bandwidth speeds from our previous generation Radeon Instinct MI25 accelerator [5], delivering up to 1TB/s memory bandwidth on both the MI50 and MI60 accelerator – the first GPUs to achieve this speed.
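
As a sanity check on the interconnect ratios quoted above, here is a small Python sketch that uses only the peak figures from the paragraph. The constant and function names are illustrative.

```python
# Peak bandwidth figures quoted in the text, in GB/s.
PCIE_GEN3_GBPS = 32.0
PCIE_GEN4_GBPS = 64.0
INFINITY_FABRIC_PEER_GBPS = 200.0


def speedup(new_gbps: float, baseline_gbps: float) -> float:
    """Ratio of a new peak bandwidth to a baseline peak bandwidth."""
    return new_gbps / baseline_gbps


gen4_vs_gen3 = speedup(PCIE_GEN4_GBPS, PCIE_GEN3_GBPS)
fabric_vs_gen3 = speedup(INFINITY_FABRIC_PEER_GBPS, PCIE_GEN3_GBPS)

print(f"PCIe Gen 4 vs Gen 3: {gen4_vs_gen3:.2f}x")
print(f"Infinity Fabric Link vs PCIe Gen 3: {fabric_vs_gen3:.2f}x")
```

The second ratio comes to 6.25, consistent with the "up to 6x" claim. These are peak numbers, so delivered bandwidth in a real system will be lower.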


With improved performance from both within the GPU and between GPUs and CPUs, these new finely-balanced, ultra-fast and scalable solutions are the ideal datacenter compute solution for all your needs whether they’re inference, training or HPC related.

Learn More About the AMD Radeon Instinct MI60

Learn More About the AMD Radeon Instinct MI50

Learn More About AMD’s “Vega 7nm” Technology

Learn More About ROCm

Warren Eng is a Product Marketing Manager for professional graphics and compute at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied. GD-5



[Originally posted on 11/21/17]

This year at SC17, AMD showcased Radeon Instinct™ accelerators, AMD EPYC™ processors and the ROCm open software platform – a complete ecosystem to drive a new era in the datacenter. Our booth was packed with server racks from partners like Inventec, Gigabyte, Supermicro and BOXX. Attendees had the opportunity to check out Project 47, both on display and running demos, offering 1 PetaFLOPS of compute power.

The much anticipated TensorFlow support with ROCm 1.7 was revealed in our booth alongside a demo of deep learning inference from a trained Caffe model. AMD also offered hourly Tech Talks, diving into a wide range of topics – from AMD EPYC™ performance to Radeon technology powering the exploration of dark energy with the CHIME radio telescope.

Thank you to everyone who joined us at SC17. For those who were unable to attend, check out our photo gallery below. We hope to see you next year at SC18!


Daniel Skrba is a Marketing and Communications Specialist for the Radeon Technologies Group at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies, or opinions. Links to third party sites and references to third party trademarks are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.



[Originally posted on 10/27/17]

Visit AMD at our SC17 booth #825 and learn how AMD, together with our partners, is bringing about a new era in the datacenter that is revolutionizing High Performance Computing with our new AMD EPYC™ processors and Radeon Instinct™ accelerators. On top of this year’s show-stopping demos, you will have the opportunity to attend one of our interactive and educational booth Tech Talks – check out the schedule below.

Featured AMD Tech Talks

Tuesday, Nov. 14th, 2017

  • 11AM: Reconfigurable Acceleration at Cloud Scale, Manish Muthal, Vice President of Data Center Marketing, Xilinx
  • 1PM: Introducing AMD EPYC™: A New Standard of Performance and Innovation, Girish Kulkarni, Director of Product Marketing, AMD Server Group, AMD
  • 2PM: Exploring Dark Energy with the CHIME Radio Telescope, powered by Radeon™ Technology, Andre Renard, CHIME Computing Specialist, Dunlap Institute for Astronomy & Astrophysics, University of Toronto
  • 3PM: AMD EPYC™ for HPC, Joshua Mora, PhD, Manager Field Application Engineering, AMD
  • 4PM: AMD Radeon Instinct™ Accelerators, Niles Burbank, Sr. Product Manager, AMD
  • 5PM: Redefining HPC Performance with EPYC-based Supermicro Servers, Super Micro Computer, Inc.

Wednesday, Nov. 15th, 2017

  • 11AM: Interconnect Your Future with Mellanox “Smart” Interconnect, Gilad Shainer, Vice President of Marketing, Mellanox Technologies
  • 1PM: Accelerating 3D Acoustics With HCC-C++, Reid Atcheson, Accelerator Software Engineer, NAG
  • 2PM: AMD EPYC™ for HPC, Joshua Mora, PhD, Manager Field Application Engineering, AMD
  • 3PM: Advances in GPU Networking at AMD, Michael Lebeane, Sr. Design Engineer, AMD Research
  • 4PM: Running TensorFlow on AMD’s ROCm software platform with HIP, Ben Sander, Sr. Fellow, Software Engineer, AMD

AMD Booth #825 Tech Talks, November 14 – 15, 2017


We hope to see you in Denver!



[Originally posted on 10/10/17 - by Gregory Stoner]

AMD is excited to see the emergence of the Open Neural Network Exchange (ONNX) format, which creates a common model format to bridge three industry-leading deep learning frameworks (PyTorch, Caffe2, and Cognitive Toolkit) and give our customers simpler paths to explore their networks via rich framework interoperability.

The ONNX format, via its extensible computation graph model, built-in operators, and standard data types, will allow our team to focus on more in-depth optimization for our Radeon Instinct hardware and a more productive solution set via our open source MIOpen deep learning solver library and ROCm compiler technology. It also gives us a path to explore new foundations for production beyond traditional frameworks, bringing lighter-weight, more optimized solutions to our hardware.

It is great to see the collaboration of Facebook and Microsoft continuing to follow the path of open software development practice with ONNX, building on their open source projects PyTorch, Caffe2, and Cognitive Toolkit. Open software development aligns with our philosophy of bringing out an open source software platform, tools, and drivers that give the research community a more powerful ability to explore a broader deep learning design space.

We feel this is an excellent step for the community to open up these platforms to a broader set of diverse architectures. We look forward to working with the project and helping it grow in the coming months.

Gregory Stoner is Sr. Director of Radeon Open Compute. Links to third-party sites and references to third-party trademarks are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third-party endorsement of AMD or any of its products is implied. Use of third-party names or marks is for informational purposes only and no endorsement of or by AMD is intended or implied.



[Originally posted on 09/08/17 by Albert J. De Vera]

Deep Learning, an advanced form of machine learning, has generated a lot of interest due to the wide range of applications on complex data sets. Current technologies and the availability of very large amounts of complex data have made analytics on the latter more tractable.

With deep neural networks as the basis for deep learning algorithms, GPUs are now being used in deep learning applications because they provide many processing units. These processing units simulate a neural network that does the computation on data. Neural networks can therefore scale and improve the extraction of information from data.

ROCm and The AMD Deep Learning Stack

The AMD Deep Learning Stack is the result of AMD’s initiative to enable DL applications using their GPUs such as the Radeon Instinct product line. Currently, deep learning frameworks such as Caffe, Torch, and TensorFlow are being ported and tested to run on the AMD DL stack. Supporting these frameworks is MIOpen, AMD’s open-source deep learning library built for the Radeon Instinct line of compute accelerators.

AMD’s ROCm platform serves as the foundation of this DL stack. ROCm enables the seamless integration of the CPU and GPU for high performance computing (HPC) and ultra-scale class computing. To achieve this, ROCm is built for language independence and takes advantage of the Heterogeneous System Architecture (HSA) Runtime API [3]. This is the basis of the ROCr System Runtime, a thin user-mode API providing access to graphics hardware driven by the AMDGPU driver and the ROCk kernel driver.


For now, OS support for ROCm is limited to Ubuntu 14.04, Ubuntu 16.04, and Fedora 23. For these OSs, AMD provides a modified Linux version 4.6 kernel with patches to the HSA kernel driver (amdkfd) and the AMDGPU (amdgpu) kernel driver currently in the mainline Linux kernel [5].

Using Docker With The AMD Deep Learning Stack

Docker Containers

Software containers isolate the application and its dependencies from other software installed on the host. They abstract the underlying operating system while keeping its own resources (filesystem, memory, CPU) and environment separate from other containers.

In contrast to virtual machines, all containers running on the same host share a single operating system without the need to virtualize a complete machine with its own OS. This makes software containers perform much faster than virtual machines because of the lack of overhead from the guest OS and the hypervisor.

Docker is the most popular software container platform today. It is available for Linux, macOS, and Microsoft Windows. Docker containers can run under any OS with the Docker platform installed [6].

Installing Docker and The AMD Deep Learning Stack

The ROCm-enabled Linux kernel and the ROCk driver, together with other needed kernel modules, must be installed on all hosts that run Docker containers. This is because the containers do not have the kernel installed inside them. Instead, the containers share the host kernel [7].

The installation procedure described here is for Ubuntu 16.04. Ubuntu 16.04 is currently the most tested OS for ROCm.

Installing ROCm

The next step is to install ROCm and the ROCm kernel on each host. The procedure described below is based on instructions found in

Grab and install the GPG key for the repository:

wget -qO - | sudo apt-key add -

You should get the message ‘OK’. You can check if it’s there using apt-key:

apt-key list

In /etc/apt/sources.list.d, create a file named rocm.list and place the following line in it:

deb [arch=amd64] xenial main

Update the repository information by running ‘apt update’. If you get a warning because of the key signature, you may ignore it since the repository administrator will update this in the future.

Install the ROCm Runtime software stack using ‘apt install rocm’:

[root@pegasus ~]# apt install rocm

Reading package lists… Done

Building dependency tree

Reading state information… Done

The following packages were automatically installed and are no longer required:

hcblas hcfft hcrng miopengemm

Use ‘sudo apt autoremove’ to remove them.

The following additional packages will be installed:

hcc hip_hcc linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148 linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148 rocm-dev

rocm-device-libs rocm-profiler rocm-smi rocm-utils

Suggested packages:


The following NEW packages will be installed:

hcc hip_hcc linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148 linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148 rocm rocm-dev

rocm-device-libs rocm-profiler rocm-smi rocm-utils

0 upgraded, 10 newly installed, 0 to remove and 0 not upgraded.

Need to get 321 MB of archives.

After this operation, 1,934 MB of additional disk space will be used.

Do you want to continue? [Y/n]

Get:1 xenial/main amd64 rocm-utils amd64 1.0.0 [30.7 kB]

Get:2 xenial/main amd64 hcc amd64 1.0.17312 [255 MB]

Get:3 xenial/main amd64 hip_hcc amd64 1.2.17305 [876 kB]

Get:4 xenial/main amd64 linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148 amd64 4.11.0-kfd-compute-rocm-rel-1.6-148-1 [10.8 MB]

Get:5 xenial/main amd64 linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148 amd64 4.11.0-kfd-compute-rocm-rel-1.6-148-1 [46.5 MB]

Get:6 xenial/main amd64 rocm-device-libs amd64 0.0.1 [587 kB]

Get:7 xenial/main amd64 rocm-smi amd64 1.0.0-25-gbdb99b4 [8,158 B]

Get:8 xenial/main amd64 rocm-profiler amd64 5.1.6400 [7,427 kB]

Get:9 xenial/main amd64 rocm-dev amd64 1.6.148 [902 B]

Get:10 xenial/main amd64 rocm amd64 1.6.148 [1,044 B]

Fetched 321 MB in 31s (10.1 MB/s)

Selecting previously unselected package rocm-utils.

(Reading database … 254059 files and directories currently installed.)

Preparing to unpack …/rocm-utils_1.0.0_amd64.deb …

Unpacking rocm-utils (1.0.0) …

Selecting previously unselected package hcc.

Preparing to unpack …/hcc_1.0.17312_amd64.deb …

Unpacking hcc (1.0.17312) …

Selecting previously unselected package hip_hcc.

Preparing to unpack …/hip%5fhcc_1.2.17305_amd64.deb …

Unpacking hip_hcc (1.2.17305) …

Selecting previously unselected package linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148.

Preparing to unpack …/linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148_4.11.0-kfd-compute-rocm-rel-1.6-148-1_amd64.deb …

Unpacking linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148 (4.11.0-kfd-compute-rocm-rel-1.6-148-1) …

Selecting previously unselected package linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148.

Preparing to unpack …/linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148_4.11.0-kfd-compute-rocm-rel-1.6-148-1_amd64.deb …

Unpacking linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148 (4.11.0-kfd-compute-rocm-rel-1.6-148-1) …

Selecting previously unselected package rocm-device-libs.

Preparing to unpack …/rocm-device-libs_0.0.1_amd64.deb …

Unpacking rocm-device-libs (0.0.1) …

Selecting previously unselected package rocm-smi.

Preparing to unpack …/rocm-smi_1.0.0-25-gbdb99b4_amd64.deb …

Unpacking rocm-smi (1.0.0-25-gbdb99b4) …

Selecting previously unselected package rocm-profiler.

Preparing to unpack …/rocm-profiler_5.1.6400_amd64.deb …

Unpacking rocm-profiler (5.1.6400) …

Selecting previously unselected package rocm-dev.

Preparing to unpack …/rocm-dev_1.6.148_amd64.deb …

Unpacking rocm-dev (1.6.148) …

Selecting previously unselected package rocm.

Preparing to unpack …/rocm_1.6.148_amd64.deb …

Unpacking rocm (1.6.148) …

Setting up rocm-utils (1.0.0) …

Setting up hcc (1.0.17312) …

Setting up hip_hcc (1.2.17305) …

Setting up linux-headers-4.11.0-kfd-compute-rocm-rel-1.6-148 (4.11.0-kfd-compute-rocm-rel-1.6-148-1) …

Setting up linux-image-4.11.0-kfd-compute-rocm-rel-1.6-148 (4.11.0-kfd-compute-rocm-rel-1.6-148-1) …

update-initramfs: Generating /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-148

W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.

Generating grub configuration file …

Found linux image: /boot/vmlinuz-4.11.0-kfd-compute-rocm-rel-1.6-148

Found initrd image: /boot/initrd.img-4.11.0-kfd-compute-rocm-rel-1.6-148

Found linux image: /boot/vmlinuz-4.4.0-93-generic

Found initrd image: /boot/initrd.img-4.4.0-93-generic

Found memtest86+ image: /memtest86+.elf

Found memtest86+ image: /memtest86+.bin


Setting up rocm-device-libs (0.0.1) …

Setting up rocm-smi (1.0.0-25-gbdb99b4) …

Setting up rocm-profiler (5.1.6400) …

Setting up rocm-dev (1.6.148) …

Setting up rocm (1.6.148) …

KERNEL=="kfd", MODE="0666"

Reboot the server. Make sure that the Linux ROCm kernel is running:

Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.11.0-kfd-compute-rocm-rel-1.6-148 x86_64)

* Documentation:

* Management:

* Support:

0 packages can be updated.

0 updates are security updates.
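
If you prefer to script this check instead of reading the login banner, the following Python sketch tests the running kernel's release string. It assumes, based only on the package names shown in the install output above, that a ROCm kernel carries both "kfd" and "rocm" in its release string. That is a heuristic for this release, not an official detection method.

```python
import platform


def is_rocm_kernel(release: str) -> bool:
    """Heuristic: the ROCm 1.6 kernel packages above contain 'kfd' and 'rocm'."""
    release = release.lower()
    return "kfd" in release and "rocm" in release


# platform.release() returns the same string that `uname -r` prints.
running = platform.release()
verdict = "looks like" if is_rocm_kernel(running) else "does not look like"
print(f"{running} {verdict} a ROCm kernel")
```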

Test if your installation works with this sample program:

cd /opt/rocm/hsa/sample



You should get an output similar to this:

Initializing the hsa runtime succeeded.

Checking finalizer 1.0 extension support succeeded.

Generating function table for finalizer succeeded.

Getting a gpu agent succeeded.

Querying the agent name succeeded.

The agent name is gfx803.

Querying the agent maximum queue size succeeded.

The maximum queue size is 131072.

Creating the queue succeeded.

"Obtaining machine model" succeeded.

"Getting agent profile" succeeded.

Create the program succeeded.

Adding the brig module to the program succeeded.

Query the agents isa succeeded.

Finalizing the program succeeded.

Destroying the program succeeded.

Create the executable succeeded.

Loading the code object succeeded.

Freeze the executable succeeded.

Extract the symbol from the executable succeeded.

Extracting the symbol from the executable succeeded.

Extracting the kernarg segment size from the executable succeeded.

Extracting the group segment size from the executable succeeded.

Extracting the private segment from the executable succeeded.

Creating a HSA signal succeeded.

Finding a fine grained memory region succeeded.

Allocating argument memory for input parameter succeeded.

Allocating argument memory for output parameter succeeded.

Finding a kernarg memory region succeeded.

Allocating kernel argument memory buffer succeeded.

Dispatching the kernel succeeded.

Passed validation.

Freeing kernel argument memory buffer succeeded.

Destroying the signal succeeded.

Destroying the executable succeeded.

Destroying the code object succeeded.

Destroying the queue succeeded.

Freeing in argument memory buffer succeeded.

Freeing out argument memory buffer succeeded.

Shutting down the runtime succeeded.

Installing Docker

We are installing the Docker Community Edition (also called Docker CE) on the host by using Docker’s apt repository. Our procedure is based on documentation published by Docker [8]. There may be some slight differences from the original documentation. Note that the installation is done as the superuser. You can also use sudo to install Docker.

First, remove old versions of Docker:

apt remove docker docker-engine

If they are not installed, you will simply get a message that they are missing.

Install the following prerequisite packages using apt:





Add the Docker GPG key to your host:

curl -fsSL |

sudo apt-key add -

The GPG fingerprint should be 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88. Use the command

apt-key fingerprint 0EBFCD88

to verify this.
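
The comparison itself is easy to get wrong by eye, so here is a minimal Python sketch that normalizes two fingerprint strings before comparing them. The helper names are illustrative. The only facts used are the fingerprint quoted above and that the short key ID passed to apt-key is the last eight hex digits of that fingerprint.

```python
EXPECTED_FINGERPRINT = "9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88"


def normalize(fingerprint: str) -> str:
    """Strip all whitespace and upper-case the hex digits."""
    return "".join(fingerprint.split()).upper()


def fingerprint_matches(reported: str, expected: str = EXPECTED_FINGERPRINT) -> bool:
    """Compare two fingerprints, ignoring spacing and letter case."""
    return normalize(reported) == normalize(expected)


def short_key_id(fingerprint: str) -> str:
    """The short key ID is the last 8 hex digits of the fingerprint."""
    return normalize(fingerprint)[-8:]


print(short_key_id(EXPECTED_FINGERPRINT))
print(fingerprint_matches("9dc858229fc7dd38854ae2d88d81803c0ebfcd88"))
```

Paste the fingerprint that apt-key prints on your host as the argument to fingerprint_matches.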

Now add the repository information:

add-apt-repository \

"deb [arch=amd64] \

$(lsb_release -cs) \


Finally, issue the command ‘apt update’.

Installing Docker CE should be done with ‘apt install docker-ce’. After the installation is complete, verify that Docker is properly configured and installed using the command ‘docker run hello-world’.

Running ROCm Docker Images

AMD provides a Docker image of the ROCm software framework [9]. The image can be pulled from the official Docker repository:

sudo docker pull rocm/rocm-terminal

The image is about 1.5 GB in size and contains the necessary libraries to run ROCm-based applications. Create a container out of this image and look at the installed software in /opt/rocm:

sudo docker run -it --rm --device=/dev/kfd rocm/rocm-terminal

You can check for the ROCm libraries using ldconfig:

ldconfig -NXv

The command above should list all the libraries in the library path including the ROCm libraries.

The ROCm-docker source is available from GitHub:

mkdir ~/tmp

cd ~/tmp

git clone

Creating A ROCm Application Docker Image

We can use the rocm/rocm-terminal Docker image to build our own ROCm application Docker image. In the following examples, we use a couple of the sample applications that come with the ROCm development package. One of them shall be /opt/rocm/hip/samples/1_Utils/hipInfo.

Assuming the host has the complete ROCm development tools, we just do the following:

cd /opt/rocm/hip/samples/1_Utils/hipInfo

make


The outcome of the make command shall be a binary called hipInfo.

If the compiler complains because of a missing shared library called libsupc++, we will need to install that somewhere in the host’s library path. In our case, we shall place the shared library in /usr/local/lib and make sure that ldconfig can find it. You can simply create a shared library from the installed static library /usr/lib/gcc/x86_64-linux-gnu/4.8/libsupc++.a:

mkdir -p ~/tmp/libsupc++

cd ~/tmp/libsupc++

ar x /usr/lib/gcc/x86_64-linux-gnu/4.8/libsupc++.a

ls -l *.o

gcc -shared -o *.o

sudo cp -p /usr/local/lib/

sudo ldconfig -v

Make sure that /usr/local/lib is seen by ldconfig. You may have to specify this directory in /etc/ if it is not found. Simply add a file named local_lib.conf with the line /usr/local/lib by itself.

Check the output of hipInfo by running it. You should get something like this (it will be slightly different from the literal output below depending on what type of GPU configuration you have):

$ ./hipInfo

compiler: hcc version=1.0.17312-d1f4a8a-19aa706-56b5abe, workweek (YYWWD) = 17312


device# 0

Name: Device 67df

pciBusID: 1

pciDeviceID: 0

multiProcessorCount: 36

maxThreadsPerMultiProcessor: 2560

isMultiGpuBoard: 1

clockRate: 1303 Mhz

memoryClockRate: 2000 Mhz

memoryBusWidth: 256

clockInstructionRate: 1000 Mhz

totalGlobalMem: 8.00 GB

maxSharedMemoryPerMultiProcessor: 8.00 GB

totalConstMem: 16384

sharedMemPerBlock: 64.00 KB

regsPerBlock: 0

warpSize: 64

l2CacheSize: 0

computeMode: 0

maxThreadsPerBlock: 1024

maxThreadsDim.x: 1024

maxThreadsDim.y: 1024

maxThreadsDim.z: 1024

maxGridSize.x: 2147483647

maxGridSize.y: 2147483647

maxGridSize.z: 2147483647

major: 2

minor: 0

concurrentKernels: 1

arch.hasGlobalInt32Atomics: 1

arch.hasGlobalFloatAtomicExch: 1

arch.hasSharedInt32Atomics: 1

arch.hasSharedFloatAtomicExch: 1

arch.hasFloatAtomicAdd: 0

arch.hasGlobalInt64Atomics: 1

arch.hasSharedInt64Atomics: 1

arch.hasDoubles: 1

arch.hasWarpVote: 1

arch.hasWarpBallot: 1

arch.hasWarpShuffle: 1

arch.hasFunnelShift: 0

arch.hasThreadFenceSystem: 0

arch.hasSyncThreadsExt: 0

arch.hasSurfaceFuncs: 0

arch.has3dGrid: 1

arch.hasDynamicParallelism: 0


non-peers: device#0 8.00 GB 7.75 GB (97%)
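
To compare hosts, it can help to turn this output into numbers. The Python sketch below parses "key: value" lines and derives two rough figures. The sample text is copied from the output above; the parser and the derived quantities are illustrative and not part of hipInfo. It also assumes that maxThreadsPerMultiProcessor means resident threads per compute unit.

```python
SAMPLE = """\
multiProcessorCount: 36
maxThreadsPerMultiProcessor: 2560
warpSize: 64
totalGlobalMem: 8.00 GB
"""


def parse_hipinfo(text: str) -> dict:
    """Collect 'key: value' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


info = parse_hipinfo(SAMPLE)
compute_units = int(info["multiProcessorCount"])
threads_per_cu = int(info["maxThreadsPerMultiProcessor"])
wavefront_size = int(info["warpSize"])

# Upper bounds on what can be resident on the device at once.
max_resident_threads = compute_units * threads_per_cu
max_resident_wavefronts = max_resident_threads // wavefront_size

print(max_resident_threads, max_resident_wavefronts)
```

For the GPU shown here, that gives 92160 threads, or 1440 wavefronts of 64 threads each.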

Now that hipInfo is compiled and has been tested, let us create a Docker image with it. Create a directory for building an image with Docker.

mkdir ~/tmp/my_rocm_hipinfo

cd ~/tmp/my_rocm_hipinfo

Copy the necessary files for the Docker image to run properly:

cp -p /usr/local/lib/ . # If hipInfo needs this

cp -p /opt/rocm/hip/samples/1_Utils/hipInfo/hipInfo .

Create a file named Dockerfile in the current directory. It should contain this:

FROM rocm/rocm-terminal:latest

COPY /usr/local/lib/

COPY hipInfo /usr/local/bin/

RUN sudo ldconfig

USER rocm-user

WORKDIR /home/rocm-user

ENV PATH "${PATH}:/opt/rocm/bin:/usr/local/bin"

ENTRYPOINT ["hipInfo"]
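
One detail worth checking before you build: Docker parses the bracketed (exec) form of ENTRYPOINT as a JSON array, so it needs plain ASCII double quotes. Typographic quotes, which often creep in when copying from a web page, do not form valid JSON. The Python sketch below shows the difference using the standard json module; the helper name is illustrative.

```python
import json


def parse_exec_form(value: str) -> list:
    """Parse an exec-form value such as '["hipInfo"]' as a JSON array of strings."""
    parsed = json.loads(value)  # raises ValueError on invalid JSON
    if not isinstance(parsed, list) or not all(isinstance(p, str) for p in parsed):
        raise ValueError("exec form must be a JSON array of strings")
    return parsed


print(parse_exec_form('["hipInfo"]'))

try:
    # Same text, but with typographic (curly) quotes.
    parse_exec_form("[\u201chipInfo\u201d]")
except ValueError as err:
    print("not valid exec form:", err)
```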

Build the Docker image:

sudo docker build -t my_rocm_hipinfo .

Create and run a container based on the new image:

sudo docker run --rm --device="/dev/kfd" my_rocm_hipinfo

The device /dev/kfd is the kernel fusion driver. You should get output similar to what you saw when running the hipInfo binary directly on the host.
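
If the container cannot see the GPU, first check on the host that /dev/kfd exists and has the permissions set by the KERNEL=="kfd", MODE="0666" rule printed during the ROCm installation. The following Python sketch does that. The function names and messages are illustrative, and it assumes the device node lives at /dev/kfd.

```python
import os
import stat


def world_read_write(mode: int) -> bool:
    """True if user, group, and others all have read and write permission (0666)."""
    wanted = 0o666
    return (stat.S_IMODE(mode) & wanted) == wanted


def check_kfd(path: str = "/dev/kfd") -> str:
    """Return a short, human-readable status for the KFD device node."""
    if not os.path.exists(path):
        return f"{path} not found: is the ROCm kernel running?"
    mode = os.stat(path).st_mode
    if not stat.S_ISCHR(mode):
        return f"{path} exists but is not a character device"
    if not world_read_write(mode):
        return f"{path} has mode {stat.S_IMODE(mode):04o}; expected 0666"
    return f"{path} looks usable (mode {stat.S_IMODE(mode):04o})"


print(check_kfd())
```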

Without the --rm parameter, the container will persist. You can then run the same container again and get some output:

sudo docker run --device="/dev/kfd" --name nifty_hugle my_rocm_hipinfo

The Docker container shall persist:

sudo docker ps -a

You may get an output that looks like this:

Now, try this command and you should see the output from hipInfo again:

sudo docker start -i nifty_hugle
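
If you drive these commands from a script, it is safer to build the argument list programmatically than to paste command lines, since the double hyphens in --rm, --device and --name are easily mangled. The Python sketch below only assembles and prints the two docker run variants used above. The helper name and its defaults are illustrative; the list it returns could be passed to subprocess.run.

```python
import shlex


def docker_run_command(image, device="/dev/kfd", remove=True, name=None):
    """Build the argument list for `docker run` as used in this section."""
    args = ["docker", "run"]
    if remove:
        args.append("--rm")  # delete the container when it exits
    args.append(f"--device={device}")  # expose the KFD device to the container
    if name:
        args += ["--name", name]  # fixed name, so `docker start -i` can reuse it
    args.append(image)
    return args


# Throwaway container versus a persistent, named one.
print(shlex.join(docker_run_command("my_rocm_hipinfo")))
print(shlex.join(docker_run_command("my_rocm_hipinfo", remove=False, name="nifty_hugle")))
```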

The second Docker image we shall create will contain the sample binary called vector_copy. The source is in /opt/rocm/hsa/sample. As done with hipInfo, use make to build the binary. Note that this binary also depends on the files with the .brig extension to run.

We do the following before we build the image:

mkdir ~/tmp/my_rocm_vectorcopy

cd ~/tmp/my_rocm_vectorcopy

mkdir vector_copy

cp -p /usr/local/lib/ . # Do this if necessary

cd vector_copy

cp -p /opt/rocm/hsa/sample/vector_copy .

cp -p /opt/rocm/hsa/sample/vector_copy*.brig .

cd .. # Back to ~/tmp/my_rocm_vectorcopy

For our Dockerfile, we have this:

FROM rocm/rocm-terminal:latest

COPY /usr/local/lib/

RUN sudo mkdir /usr/local/vector_copy

COPY vector_copy/* /usr/local/vector_copy/

RUN sudo ldconfig

USER rocm-user

ENV PATH "${PATH}:/opt/rocm/bin:/usr/local/vector_copy"

WORKDIR /usr/local/vector_copy

ENTRYPOINT ["vector_copy"]

Building the Docker image for vector_copy should be familiar by now.

As an exercise, run the Docker image to see what output you get. Try with or without --rm and with the ‘docker start’ command.

For our last example, we shall use a Docker container for the Caffe deep learning framework. We are going to use the HIP port of Caffe which can be targeted to both AMD ROCm and Nvidia CUDA devices [10]. Converting CUDA code to portable C++ is enabled by HIP. For more information on HIP, see

Let us pull the hip-caffe image from the Docker registry:

docker pull intuitionfabric/hip-caffe

Test the image by running a device query on the AMD GPUs:

sudo docker run --name my_caffe -it --device=/dev/kfd --rm \

intuitionfabric/hip-caffe ./build/tools/caffe device_query -gpu all

You should get an output similar to the one below. Note that your output may differ due to your own host configuration.

I0831 19:05:30.814853 1 caffe.cpp:138] Querying GPUs all

I0831 19:05:30.815135 1 common.cpp:179] Device id: 0

I0831 19:05:30.815145 1 common.cpp:180] Major revision number: 2

I0831 19:05:30.815148 1 common.cpp:181] Minor revision number: 0

I0831 19:05:30.815153 1 common.cpp:182] Name: Device 67df

I0831 19:05:30.815158 1 common.cpp:183] Total global memory: 8589934592

I0831 19:05:30.815178 1 common.cpp:184] Total shared memory per block: 65536

I0831 19:05:30.815192 1 common.cpp:185] Total registers per block: 0

I0831 19:05:30.815196 1 common.cpp:186] Warp size: 64

I0831 19:05:30.815201 1 common.cpp:188] Maximum threads per block: 1024

I0831 19:05:30.815207 1 common.cpp:189] Maximum dimension of block: 1024, 1024, 1024

I0831 19:05:30.815210 1 common.cpp:192] Maximum dimension of grid: 2147483647, 2147483647, 2147483647

I0831 19:05:30.815215 1 common.cpp:195] Clock rate: 1303000

I0831 19:05:30.815219 1 common.cpp:196] Total constant memory: 16384

I0831 19:05:30.815223 1 common.cpp:200] Number of multiprocessors: 36
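
The raw values in this log are easier to compare with the hipInfo output once they are converted to familiar units. The Python sketch below parses a few of the lines above and converts them. The regular expression and helper are illustrative, and the clock rate is assumed to be reported in kHz, which matches the 1303 MHz that hipInfo showed.

```python
import re

SAMPLE_LOG = """\
I0831 19:05:30.815158 1 common.cpp:183] Total global memory: 8589934592
I0831 19:05:30.815215 1 common.cpp:195] Clock rate: 1303000
I0831 19:05:30.815223 1 common.cpp:200] Number of multiprocessors: 36
"""

# Text after the closing bracket, split at the first colon into key and value.
LINE_RE = re.compile(r"\]\s*(?P<key>[^:]+):\s*(?P<value>.+)$")


def parse_device_query(log: str) -> dict:
    """Extract 'key: value' pairs from glog-style device_query lines."""
    fields = {}
    for line in log.splitlines():
        match = LINE_RE.search(line)
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


fields = parse_device_query(SAMPLE_LOG)
memory_gib = int(fields["Total global memory"]) / 2**30
clock_mhz = int(fields["Clock rate"]) / 1000  # assumed to be kHz

print(f"{memory_gib:.2f} GiB, {clock_mhz:.0f} MHz, "
      f"{fields['Number of multiprocessors']} compute units")
```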

Let us now run Caffe in a container. We begin by creating a container for this purpose.

sudo docker run -it --device=/dev/kfd --rm intuitionfabric/hip-caffe

Run the MNIST example in the container. Once the above command is executed, you should be inside the container.

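First, get the raw MNIST data (the commands in this walkthrough are the standard scripts from the Caffe source tree, assumed to be present in the image):

./data/mnist/get_mnist.sh

Make sure you format the data for Caffe:

./examples/mnist/create_mnist.sh

Once that’s done, proceed with training the network:

./examples/mnist/train_lenet.sh
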
You should get an output similar to this:

I0831 18:43:19.290951 37 caffe.cpp:217] Using GPUs 0

I0831 18:43:19.291165 37 caffe.cpp:222] GPU 0: Device 67df

I0831 18:43:19.294853 37 solver.cpp:48] Initializing solver from parameters:

test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: GPU
device_id: 0
net: "examples/mnist/lenet_train_test.prototxt"
train_state {
  level: 0
  stage: ""
}

I0831 18:43:19.294972 37 solver.cpp:91] Creating training net from net file: examples/mnist/lenet_train_test.prototxt

I0831 18:43:19.295145 37 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist

I0831 18:43:19.295169 37 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy

I0831 18:43:19.295181 37 net.cpp:58] Initializing net from parameters:

name: "LeNet"
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
….….
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

I0831 18:43:19.295332 37 layer_factory.hpp:77] Creating layer mnist

I0831 18:43:19.295426 37 net.cpp:100] Creating Layer mnist

I0831 18:43:19.295444 37 net.cpp:408] mnist -> data

I0831 18:43:19.295478 37 net.cpp:408] mnist -> label

I0831 18:43:19.304414 40 db_lmdb.cpp:35] Opened lmdb examples/mnist/mnist_train_lmdb

I0831 18:43:19.304760 37 data_layer.cpp:41] output data size: 64,1,28,28

I0831 18:43:19.305835 37 net.cpp:150] Setting up mnist

I0831 18:43:19.305842 37 net.cpp:157] Top shape: 64 1 28 28 (50176)

I0831 18:43:19.305848 37 net.cpp:157] Top shape: 64 (64)

I0831 18:43:19.305851 37 net.cpp:165] Memory required for data: 200960

I0831 18:43:19.305874 37 layer_factory.hpp:77] Creating layer conv1

I0831 18:43:19.305907 37 net.cpp:100] Creating Layer conv1

I0831 18:43:19.305912 37 net.cpp:434] conv1 <- data

I0831 18:43:19.305940 37 net.cpp:408] conv1 -> conv1

I0831 18:43:19.314159 37 cudnn_conv_layer.cpp:259] Before miopenConvolution*GetWorkSpaceSize

I0831 18:43:19.319051 37 cudnn_conv_layer.cpp:295] After miopenConvolution*GetWorkSpaceSize

I0831 18:43:19.319625 37 cudnn_conv_layer.cpp:468] Before miopenFindConvolutionForwardAlgorithm

I0831 18:43:19.927783 37 cudnn_conv_layer.cpp:493] fwd_algo_[0]: 1

I0831 18:43:19.927809 37 cudnn_conv_layer.cpp:494] workspace_fwd_sizes_[0]:57600

I0831 18:43:19.928071 37 cudnn_conv_layer.cpp:500] Before miopenFindConvolutionBackwardWeightsAlgorithm

….….

I0831 18:43:23.296785 37 net.cpp:228] mnist does not need backward computation.

I0831 18:43:23.296789 37 net.cpp:270] This network produces output loss

I0831 18:43:23.296799 37 net.cpp:283] Network initialization done.

I0831 18:43:23.296967 37 solver.cpp:181] Creating test net (#0) specified by net file: examples/mnist/lenet_train_test.prototxt

I0831 18:43:23.296985 37 net.cpp:322] The NetState phase (1) differed from the phase (0) specified by a rule in layer mnist

I0831 18:43:23.296995 37 net.cpp:58] Initializing net from parameters:

name: "LeNet"
state {
  phase: TEST
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

I0831 18:44:12.620506 37 solver.cpp:404] Test net output #1: loss = 0.0299084 (* 1 = 0.0299084 loss)

I0831 18:44:12.624415 37 solver.cpp:228] Iteration 9000, loss = 0.011652

I0831 18:44:12.624441 37 solver.cpp:244] Train net output #0: loss = 0.011652 (* 1 = 0.011652 loss)

I0831 18:44:12.624449 37 sgd_solver.cpp:106] Iteration 9000, lr = 0.00617924

I0831 18:44:13.055759 37 solver.cpp:228] Iteration 9100, loss = 0.0061008

I0831 18:44:13.055778 37 solver.cpp:244] Train net output #0: loss = 0.0061008 (* 1 = 0.0061008 loss)

I0831 18:44:13.055800 37 sgd_solver.cpp:106] Iteration 9100, lr = 0.00615496

I0831 18:44:13.497696 37 solver.cpp:228] Iteration 9200, loss = 0.00277705

I0831 18:44:13.497715 37 solver.cpp:244] Train net output #0: loss = 0.00277706 (* 1 = 0.00277706 loss)

I0831 18:44:13.497720 37 sgd_solver.cpp:106] Iteration 9200, lr = 0.0061309

I0831 18:44:13.941920 37 solver.cpp:228] Iteration 9300, loss = 0.0111398

I0831 18:44:13.941941 37 solver.cpp:244] Train net output #0: loss = 0.0111398 (* 1 = 0.0111398 loss)

I0831 18:44:13.941946 37 sgd_solver.cpp:106] Iteration 9300, lr = 0.00610706

I0831 18:44:14.386647 37 solver.cpp:228] Iteration 9400, loss = 0.0179196

I0831 18:44:14.386667 37 solver.cpp:244] Train net output #0: loss = 0.0179195 (* 1 = 0.0179195 loss)

I0831 18:44:14.386672 37 sgd_solver.cpp:106] Iteration 9400, lr = 0.00608343

I0831 18:44:14.828459 37 solver.cpp:337] Iteration 9500, Testing net (#0)

I0831 18:44:14.983165 37 solver.cpp:404] Test net output #0: accuracy = 0.9884

I0831 18:44:14.983183 37 solver.cpp:404] Test net output #1: loss = 0.0393952 (* 1 = 0.0393952 loss)

I0831 18:44:14.987198 37 solver.cpp:228] Iteration 9500, loss = 0.00496538

I0831 18:44:14.987211 37 solver.cpp:244] Train net output #0: loss = 0.00496537 (* 1 = 0.00496537 loss)

I0831 18:44:14.987217 37 sgd_solver.cpp:106] Iteration 9500, lr = 0.00606002

I0831 18:44:15.433176 37 solver.cpp:228] Iteration 9600, loss = 0.00308157

I0831 18:44:15.433193 37 solver.cpp:244] Train net output #0: loss = 0.00308157 (* 1 = 0.00308157 loss)

I0831 18:44:15.433200 37 sgd_solver.cpp:106] Iteration 9600, lr = 0.00603682

I0831 18:44:15.878787 37 solver.cpp:228] Iteration 9700, loss = 0.00220143

I0831 18:44:15.878806 37 solver.cpp:244] Train net output #0: loss = 0.00220143 (* 1 = 0.00220143 loss)

I0831 18:44:15.878813 37 sgd_solver.cpp:106] Iteration 9700, lr = 0.00601382

I0831 18:44:16.321408 37 solver.cpp:228] Iteration 9800, loss = 0.0108761

I0831 18:44:16.321426 37 solver.cpp:244] Train net output #0: loss = 0.0108761 (* 1 = 0.0108761 loss)

I0831 18:44:16.321432 37 sgd_solver.cpp:106] Iteration 9800, lr = 0.00599102

I0831 18:44:16.765200 37 solver.cpp:228] Iteration 9900, loss = 0.00478531

I0831 18:44:16.765219 37 solver.cpp:244] Train net output #0: loss = 0.00478531 (* 1 = 0.00478531 loss)

I0831 18:44:16.765226 37 sgd_solver.cpp:106] Iteration 9900, lr = 0.00596843

I0831 18:44:17.204908 37 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_10000.caffemodel

I0831 18:44:17.208767 37 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_10000.solverstate

I0831 18:44:17.211735 37 solver.cpp:317] Iteration 10000, loss = 0.0044067

I0831 18:44:17.211750 37 solver.cpp:337] Iteration 10000, Testing net (#0)

I0831 18:44:17.364528 37 solver.cpp:404] Test net output #0: accuracy = 0.9902

I0831 18:44:17.364547 37 solver.cpp:404] Test net output #1: loss = 0.0303562 (* 1 = 0.0303562 loss)

I0831 18:44:17.364552 37 solver.cpp:322] Optimization Done.

I0831 18:44:17.364555 37 caffe.cpp:254] Optimization Done.
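
The `lr` values printed by the solver in the log above follow from the solver parameters at the top of the log. With `lr_policy: "inv"`, Caffe computes the rate as base_lr * (1 + gamma * iter) ^ (-power). The short Python sketch below reproduces a few of the logged values; the function name `inv_lr` is hypothetical, and the defaults are simply the values from this run.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under Caffe's "inv" policy for a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Learning rates copied from the training log above.
LOGGED = {
    9000: 0.00617924,
    9500: 0.00606002,
    9900: 0.00596843,
}

if __name__ == "__main__":
    for iteration, logged in LOGGED.items():
        computed = inv_lr(iteration)
        print(f"iter {iteration}: computed {computed:.6g}, logged {logged}")
```

Small differences in the last digit are to be expected, since the log prints a rounded single-precision value.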


In this article, we provided you with a guide on how to use AMD’s ROCm framework with Docker container technology. This should serve as a good jumpstart for your Deep Learning development on AMD’s platform.

Docker has become an essential technology for containing the complexity of Deep Learning development. Deep Learning frameworks and tools have many dependencies. Leveraging Docker to isolate these dependencies within a Linux container leads not only to greater reliability and robustness but also to greater agility and flexibility. Many frameworks and tools are emerging, and it is best practice to have a robust solution for managing disparate parts. Docker containers have become standard practice in Deep Learning, and this technology is well supported by AMD’s ROCm framework.


1. Andrew Ng, Chief Scientist at Baidu, 2015.

2. Smith, Ryan. “AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming In 2017.” AnandTech: Hardware News and Tech Reviews Since 1997, 12 Dec. 2016.

3. “ROCm. A New Era in GPU Computing.” ROCm, A New Era in Open GPU Computing, 16 Dec. 2016.

4. “RadeonOpenCompute/ROCR-Runtime.” GitHub.

5. “ROCK-Kernel-Driver/ at Roc-1.6.0.” 16 Nov. 2016.

6. “What Is Docker?” Docker - Build, Ship, and Run Any App, Anywhere.

7. “ROCm-Docker.” GitHub - ROCM-Docker, Accessed 24 Mar. 2017.

8. “Get Docker for Ubuntu.” Docker - Build, Ship, and Run Any App, Anywhere, Accessed 27 Mar. 2017.

9. “ROCm-Docker.” GitHub - ROCM-Docker, Accessed 24 Mar. 2017.

10. “hipCaffe: The HIP Port of Caffe.” Accessed 01 Jun. 2017.



[Originally posted on 06/20/17 by Ogi Brkic]

Back in December 2016, we first announced our Radeon Instinct initiative, combining our strength in compute with our dedication to open software. We later announced our Radeon Vega Frontier Edition, an enabler of Radeon Instinct.

Today, we’re excited to tell you about the next chapter in our vision for instinctive computing. AMD’s Radeon Instinct™ accelerators will soon ship to our partners (including Boxx, Colfax, Exxact Corporation, Gigabyte, Inventec and Supermicro, among others) and power their deep learning and HPC solutions starting in Q3 2017.

Artificial intelligence and machine learning are changing the world in ways we never could have imagined only a few years ago, enabling life-changing breakthroughs that can solve previously unsolvable problems. Radeon Instinct™ MI25, MI8, and MI6, together with AMD’s open ROCm 1.6 software platform, can dramatically increase performance, efficiency, and ease of implementation, speeding through deep learning inference and training workloads. We’re not just looking to accelerate the drive to machine intelligence, but to power the next era of true heterogeneous compute.

New Radeon Instinct Accelerators

Through our Radeon Instinct server accelerator products and open ecosystem approach, we’re able to offer our customers cost-effective machine and deep learning training, edge-training and inference solutions, where workloads can take the most advantage of the GPU’s highly parallel computing capabilities.

We’ve also designed the three initial Radeon Instinct accelerators to address a wide range of machine intelligence applications, including data-centric HPC-class systems in academia, government labs, energy, life science, financial, automotive and other industries:


The Radeon Instinct™ MI25 accelerator, based on the new “Vega” GPU architecture with a 14nm FinFET process, will be the world’s ultimate training accelerator for large-scale machine intelligence and deep learning datacenter applications. The MI25 will deliver superior FP16 and FP32 performance in a passively-cooled single GPU server card with 24.6 TFLOPS of FP16 or 12.3 TFLOPS of FP32 peak performance through its 64 compute units (4,096 stream processors). With 16GB of ultra–high bandwidth HBM2 ECC GPU memory and up to 484 GB/s of memory bandwidth, the Radeon Instinct MI25’s design is optimized for massively parallel applications with large datasets for Machine Intelligence and HPC-class systems.


The Radeon Instinct™ MI8 accelerator, harnessing the high performance and energy efficiency of the “Fiji” GPU architecture, is a small form factor HPC and inference accelerator with 8.2 TFLOPS of peak FP16|FP32 performance at less than 175W board power and 4GB of High-Bandwidth Memory (HBM) on a 512-bit memory interface. The MI8 is well suited for machine learning inference and HPC applications.


The Radeon Instinct™ MI6 accelerator, based on the acclaimed “Polaris” GPU architecture, is a passively cooled inference accelerator with 5.7 TFLOPS of peak FP16|FP32 performance at 150W board power and 16GB of ultra-fast GDDR5 GPU memory on a 256-bit memory interface. The MI6 is a versatile accelerator ideal for HPC and machine learning inference and edge-training deployments.
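
The MI25 figures quoted above are internally consistent, which a little arithmetic makes visible. The Python sketch below is a rough check, not an official specification: it assumes the usual convention that each stream processor retires one fused multiply-add (two floating-point operations) per clock, and the resulting clock speed is inferred from the quoted peak rather than stated in this post.

```python
# Figures quoted in the MI25 paragraph above.
COMPUTE_UNITS = 64
STREAM_PROCESSORS = 4096
FP32_PEAK_TFLOPS = 12.3
FP16_PEAK_TFLOPS = 24.6

# Assumption (not from the post): one FMA = 2 FLOPs per stream processor per clock.
FLOPS_PER_SP_PER_CLOCK = 2

# Stream processors per compute unit.
sp_per_cu = STREAM_PROCESSORS // COMPUTE_UNITS

# Clock implied by the FP32 peak: peak = SPs * FLOPs per clock * clock.
implied_clock_ghz = (FP32_PEAK_TFLOPS * 1e12
                     / (STREAM_PROCESSORS * FLOPS_PER_SP_PER_CLOCK) / 1e9)

# Ratio of FP16 to FP32 peak throughput.
fp16_to_fp32 = FP16_PEAK_TFLOPS / FP32_PEAK_TFLOPS

if __name__ == "__main__":
    print(f"{sp_per_cu} stream processors per compute unit")
    print(f"implied peak clock: about {implied_clock_ghz:.2f} GHz")
    print(f"FP16 / FP32 peak ratio: {fp16_to_fp32:.1f}")
```

The same approach cannot be applied to the MI8 and MI6 from this post alone, because their stream processor counts are not given here.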

Radeon Instinct hardware is fueled by our open-source software platform, including:

  • Planned for rollout on June 29th, the ROCm 1.6 software platform brings performance improvements and adds support for MIOpen 1.0. It is scalable and fully open source, providing a flexible, powerful heterogeneous compute solution for a new class of hybrid Hyperscale and HPC-class systems. Built around an open-source Linux® driver optimized for scalable multi-GPU computing, the ROCm software platform provides multiple programming models, the HIP CUDA conversion tool, and support for GPU acceleration using the Heterogeneous Computing Compiler (HCC).

  • The open-source MIOpen GPU-accelerated library will be available June 29th with the ROCm platform and supports machine intelligence frameworks, including planned support for Caffe®, TensorFlow® and Torch®.

Revolutionizing the Datacenter with “Zen”-based Epyc™ Servers and Radeon Instinct Accelerators

The Radeon Instinct MI25, combined with our new “Zen”-based Epyc servers and the revolutionary ROCm open software platform, will provide a progressive approach to open heterogeneous compute and machine learning from the metal forward.

We plan to ship Radeon Instinct products to our technology partners in Q3 for design in their deep learning and HPC solutions, giving customers a real choice of vendors for open, scale-out machine learning solutions.

For more details and specifications on these cards, please check out the product pages below.

Radeon Instinct MI25

Radeon Instinct MI8

Radeon Instinct MI6



[Originally posted on 07/30/17 - by Mark Hirsch]

1 PetaFLOPS of Performance for the Ultimate Virtualization and Machine Intelligence Solution

Today at Capsaicin SIGGRAPH, AMD showcased what can be achieved when the world’s greatest server CPU is combined with the world’s greatest GPU, based on AMD’s revolutionary “Vega” architecture. Developed by AMD in collaboration with Inventec, Project 47 is based on Inventec’s P-series massively parallel computing platform, and is a rack designed to excel in a range of tasks, from graphics virtualization to machine intelligence.

Project 47 boasts 1 PetaFLOPS of compute power at full 32-bit precision, delivering a stunning 30 GigaFLOPS/W and demonstrating dramatic compute efficiency.1 It has more cores, threads, compute units, IO lanes and memory channels in use at one time than any other similarly configured system to date. The incredible performance-per-dollar and performance-per-watt of Project 47 make supercomputing a more affordable reality than ever before, whether for machine learning, virtualization or rendering.


Project 47 is made up of a rack of individual servers, each harnessing one EPYC™ 7601 processor to drive up to four “Vega”-based Radeon Instinct™ MI25 accelerators using 128 PCIe® lanes, in contrast to the costly dual-CPU and PLX switch setups typically needed on competing platforms in order to run four GPUs. With Project 47, AMD showcased the ease with which multiple servers can be daisy-chained, demonstrating a rack of 20 servers running 20 EPYC SoCs and 80 Radeon Instinct MI25 accelerators.

To bring Project 47 to life, AMD worked closely with Samsung Electronics with respect to the HBM2 memory used across the “Vega”-based product lines including the Radeon Instinct MI25 accelerators. Samsung also provided high-performance NVMe SSD storage and high-speed DDR4 memory to enable the 1 PetaFLOPS of performance. AMD also collaborated with Mellanox Technologies, leveraging their InfiniBand solution to deliver 100Gb connectivity through the rack.

Project 47 is expected to be available from Inventec and their principal distributor AMAX in Q4 of this year.

Mark Hirsch is Corporate Vice President, Systems & Solutions for the Radeon Technologies Group at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies, or opinions. Links to third party sites and references to third party trademarks are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.


1. Project 47 has a total rack power of 34,200 Watts and delivers a performance of 1,027,600 GigaFLOPS for 30.05 GigaFLOPS/W in single precision performance.
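
The efficiency figure in the footnote can be reproduced directly from the two numbers it quotes. The Python sketch below also estimates how much of the rack total the GPUs might account for; that second part is an assumption, since it applies the 12.3 TFLOPS FP32 peak quoted for the MI25 in the accelerator announcement to each of the 80 cards, and this post does not break the total down.

```python
# Figures quoted in the footnote and in the rack description above.
RACK_POWER_W = 34_200
RACK_GFLOPS = 1_027_600
SERVERS = 20
GPUS_PER_SERVER = 4

# Assumption: per-GPU FP32 peak taken from the MI25 announcement.
MI25_FP32_TFLOPS = 12.3

# Single-precision efficiency of the whole rack.
gflops_per_watt = RACK_GFLOPS / RACK_POWER_W

# Number of accelerators and their combined peak under the assumption above.
gpu_count = SERVERS * GPUS_PER_SERVER
gpu_peak_tflops = gpu_count * MI25_FP32_TFLOPS

if __name__ == "__main__":
    print(f"{gflops_per_watt:.2f} GigaFLOPS/W")
    print(f"{gpu_count} GPUs, about {gpu_peak_tflops:.0f} TFLOPS "
          f"of the {RACK_GFLOPS / 1000:.1f} TFLOPS rack total")
```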
