
By Mark Papermaster, Senior Vice President and CTO, AMD

 

The streets of downtown Austin, just cleared of music festival attendees and auto racing fans, are now filled with enthusiasts of a different sort. This year the city is host to SC15, the largest event for supercomputing systems and software, and AMD is on site to meet with customers and technology partners.  The hardware is here, of course, including industry-leading AMD FirePro™ graphics and the upcoming AMD Opteron™ A1100 64-bit ARM® processor. However, the big story for AMD at the show this year is the “Boltzmann Initiative”, delivering new software tools to take advantage of the processing power of our products, including those on the future roadmap, like the new “Zen” x86 CPU core coming next year.  Ludwig Boltzmann was a theoretical physicist and mathematician who developed critical formulas for predicting the behavior of different forms of matter. Today, these calculations are central to work done by the scientific and engineering communities we are targeting with these tools.

 

First though, just a quick review of what ties this story together: Heterogeneous Computing. The Heterogeneous System Architecture (HSA) Foundation was created in 2012, with AMD as a founding member, to make it dramatically easier to program heterogeneous computing systems. Heterogeneous computing takes advantage of CPUs, GPUs, and other accelerators such as DSPs and other programmable and fixed-function devices to help increase performance and efficiency with the goal of reduced energy use. The GPU in particular is a critical component since general purpose computing on a GPU (GPGPU) makes large performance gains achievable for certain applications through parallel execution. However, while effectively harnessing the GPU for computing has become easier, AMD is taking a huge leap forward today with the announcement of the Boltzmann Initiative and its three key new tools for developers.

 

The first innovation is our new heterogeneous compute compiler (HCC) for C++ programming. Over the last several years, it’s been possible to program for GPU compute through the use of OpenCL™, an open industry standard language, or the proprietary CUDA language. Both provide a general-purpose model for data parallelism as well as low-level access to hardware. And while each is a significant improvement in ease and functionality over previous methods, both still require specialized programming skills. This is a problem because the potential for leveraging the GPU is so great and so diverse. Applications ranging from 3D medical imaging to facial recognition and from climate analysis to human genome mapping can all benefit, to name a few.

 

Ultimately, for heterogeneous computing to become a mainstream reality, these technologies will need to become accessible to a majority of the programmers in the world through more familiar languages such as C++. By creating a logical model where heterogeneous processors fully share system resources such as memory, HSA promises a standard programming model that allows developers to write code that can run seamlessly on whatever processor block is best able to execute it. The idea of matching the right workload to the right processor is compelling and being embraced by many hardware and software companies. The new AMD C++ compiler makes that idea a whole lot easier to execute.
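To make the "familiar C++" point concrete, here is a minimal sketch of a data-parallel operation written in plain standard C++. It is not HCC code and uses no AMD-specific API; the function name `saxpy` and its signature are illustrative assumptions. It runs on the CPU as written, and simply shows the kind of per-element computation, expressed as an ordinary lambda, that a heterogeneous compiler could in principle map onto a GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative only: standard C++, not the HCC programming interface.
// Computes y[i] = a * x[i] + y[i] for every element. Each element is
// independent of the others, which is what makes this loop a natural
// candidate for parallel execution.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    std::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
}
```

Calling `saxpy(2.0f, x, y)` with `x = {1, 2, 3}` and `y = {10, 20, 30}` leaves `y` holding `{12, 24, 36}`. The body of the lambda is the part a programmer would think of as the "kernel"; nothing in it requires a GPU-specific language.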

 

Second is our new Linux® driver. While the Windows® operating system is fantastic and supports billions of consumer client devices and commercial servers, Linux is highly popular in technical and scientific communities where collaboration on application development is the traditional model to maximize performance. By making an all-new Linux driver available, AMD is helping expand the developer base for heterogeneous computing even further. Important benefits of this new, headless Linux driver for the programmer include low-latency compute dispatch, peer-to-peer GPU support, Remote Direct Memory Access (RDMA) from InfiniBand™ interconnects directly to GPU memory, and Large Single Memory Allocation support. Combined with the new C++ compiler, the Linux driver is a powerful addition to the Boltzmann Initiative.

 

Finally, applications already developed in CUDA can now be ported to C++. This is achieved using the new Heterogeneous-computing Interface for Programmers (HIP) tool, which ports CUDA runtime APIs into C++ code. AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++ by HIP. The remainder will require manual programming, but this should take a matter of days, not months as before. Once ported, the application can run on a variety of underlying hardware, and enhancements can be made directly in C++. The overall effect is greater platform flexibility and reduced development time and cost.
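As a rough illustration of what automated conversion involves, the sketch below renames a handful of CUDA runtime identifiers to their HIP counterparts and counts what is left for manual work. It is a toy, not the HIP tool itself: the helper names (`renameTable`, `portSource`, `PortResult`) are assumptions made for this example, the table covers only a small subset of the API, and a real conversion has to deal with much more than identifier renaming.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>

// Illustrative subset of CUDA runtime names and their HIP equivalents.
const std::unordered_map<std::string, std::string>& renameTable() {
    static const std::unordered_map<std::string, std::string> table = {
        {"cudaMalloc", "hipMalloc"},
        {"cudaMemcpy", "hipMemcpy"},
        {"cudaMemcpyHostToDevice", "hipMemcpyHostToDevice"},
        {"cudaMemcpyDeviceToHost", "hipMemcpyDeviceToHost"},
        {"cudaDeviceSynchronize", "hipDeviceSynchronize"},
        {"cudaFree", "hipFree"},
    };
    return table;
}

bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical result type: the rewritten text plus simple counters.
struct PortResult {
    std::string code;
    int converted = 0;          // identifiers found in the table
    int leftForManualWork = 0;  // "cuda"-prefixed names with no table entry
};

// Rewrites whole identifiers only, so a name that merely starts with a
// known entry is never partially replaced.
PortResult portSource(const std::string& source) {
    PortResult result;
    std::size_t i = 0;
    while (i < source.size()) {
        if (!isIdentChar(source[i])) {
            result.code += source[i++];
            continue;
        }
        const std::size_t start = i;
        while (i < source.size() && isIdentChar(source[i])) ++i;
        const std::string ident = source.substr(start, i - start);
        const auto hit = renameTable().find(ident);
        if (hit != renameTable().end()) {
            result.code += hit->second;
            ++result.converted;
        } else {
            result.code += ident;
            if (ident.rfind("cuda", 0) == 0) ++result.leftForManualWork;
        }
    }
    return result;
}
```

Given the input `cudaMalloc(&p, n); cudaFancyCall(p); cudaFree(p);`, where `cudaFancyCall` is a made-up name standing in for an unsupported call, `portSource` rewrites the first and last calls, leaves the middle one untouched, and reports one identifier for manual attention. That split mirrors the automatic-versus-manual division described above.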

 

The new C++ compiler, Linux driver and HIP tool put heterogeneous computing within reach of many more software developers, substantially increasing the pool of programmers who can target it. That’s a tremendous amount of brain power that can now create applications that more readily take advantage of the underlying hardware. It also means many more applications can exploit parallelism, where applicable, enabling better performance and greater energy efficiency. I encourage you to stop by booth #727 at the Austin Convention Center this week to learn more!

 

Mark Papermaster is Senior Vice President and Chief Technology Officer, AMD. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. Windows is a registered trademark of Microsoft Corporation in the US and other jurisdictions.