
AMD Gaming


As the PC graphics industry continues down the path of low-overhead graphics APIs, today I wanted to bring you some new details on two significant features of DirectX® 12. These features are called “multi-threaded command buffer recording” and “async shaders,” and they are poised to make a significant difference for gamers everywhere. Let’s take a look at what they do and why they matter.

 

ASYNC SHADERS

This feature allows a game engine to execute GPU compute or memory activities during “gaps” in the graphics workload presented by a game.

While it seems sensible to allow the graphics, compute and memory functions of a GPU to operate simultaneously, past versions of DirectX® did not provide for this functionality. Past versions of DirectX® were essentially limited to a single, serial graphics queue for processing all types of workloads. Therefore graphics, compute and memory copy operations had to wait for other parts of the graphics queue to finish processing before springing to life and doing their work. This would often result in idle hardware for some portions of time, and idle hardware is squandered performance.

[Animation: graphics pipeline behavior under DirectX® 11 vs. DirectX® 12]

 

In contrast, DirectX® 12 Async Shaders supercharge work completion in a compatible AMD Radeon™ GPU by interleaving these tasks across multiple threads to shorten overall render time. Async Shaders are materially important to a PC gamer’s experience because shorter rendering times reduce graphics pipeline latency, and lower latency equals greater performance. “Performance” can mean higher framerates in gameplay and better responsiveness in VR environments. Further, finer levels of granularity in breaking up the workload can yield even greater reductions in work time. As they say: work smarter, not harder.
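The scheduling difference can be illustrated with a toy model (plain Python, not actual GPU code; the task names and millisecond durations below are invented purely for illustration):

```python
# Toy model of serial vs. asynchronous GPU scheduling.
# Durations are arbitrary illustrative milliseconds, not real measurements.

graphics_tasks = [("shadow pass", 2.0), ("g-buffer", 3.0), ("lighting", 4.0)]
compute_tasks = [("particle sim", 1.5), ("post-process", 2.5)]

# DirectX 11-style single serial queue: every task waits its turn.
serial_time = sum(t for _, t in graphics_tasks + compute_tasks)

# Idealized async shaders: compute overlaps the gaps in the graphics
# workload, so the frame takes only as long as the longer queue (best case).
async_time = max(sum(t for _, t in graphics_tasks),
                 sum(t for _, t in compute_tasks))

print(f"serial queue: {serial_time:.1f} ms")   # 13.0 ms
print(f"async queues: {async_time:.1f} ms")    # 9.0 ms
```

The best-case overlap shown here is exactly the "idle hardware is squandered performance" point: the compute work effectively becomes free because it runs in gaps the graphics queue would otherwise leave empty.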

 

 

 

Finally, it must be understood that AMD’s Graphics Core Next architecture is specifically equipped to enable incredibly fine DirectX® 12 Async Shader granularity with dedicated hardware known as the Asynchronous Compute Engine (ACE). Many ACEs serve as fundamental building blocks in modern AMD graphics hardware, and they are specifically tuned to accommodate significant parallelization of complex jobs with superb performance.

 

[Image: AMD Radeon™ R9 290X block diagram]

This diagram of the AMD Radeon™ R9 290X GPU’s architecture shows eight Asynchronous Compute Engines (ACEs) ready to handle Async Shader work. Every AMD product based on GCN includes a certain number of these ACEs.

 

MULTI-THREADED COMMAND BUFFER RECORDING

The command buffer is a game’s “to-do list,” a list of things that the CPU must reorganize and present to an AMD Radeon™ graphics card so that graphics work can be done. Things on this to-do list might include lighting, placing characters, loading textures, generating reflections and more.

 

Modern PCs often ship with multi-core CPUs like AMD FX processors or AMD A-Series APUs. One notable characteristic of DirectX® 11-based applications is that many cores in a multi-core CPU go partially or fully unused. This lack of utilization is owed to DirectX® 11’s relative inability to break a game’s command buffer into small, parallel and computationally quick chunks that can be spread across many cores.

 

In addition to DirectX® 11’s modest multi-threading, a disproportionate amount of CPU time is frequently spent on driver and API interpretation (“overhead”) under the DirectX® 11 programming model, which leaves less time for executing the game code that delivers quality and framerates.

 

In DirectX® 12, however, the command buffer behavior is radically overhauled in five key ways:

  1. Overhead is significantly reduced by moving driver and API code to any available CPU thread
  2. The absolute time required to complete complex CPU tasks is notably reduced
  3. Game workloads can be meaningfully distributed across >4 CPU cores
  4. New “bandwidth” on the CPU allows for higher peak draw calls, enabling more detailed and immersive game worlds
  5. All available CPU cores may now “talk” to the graphics card simultaneously

 

Much like going from a two-lane country road to an eight-lane superhighway, the shift to DirectX® 12 allows more traffic from an AMD FX processor to reach the graphics card in a shorter amount of time. The end result: more performance, better image quality, reduced latency, or a blend of all three (as the developer chooses).
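The idea of spreading command buffer recording across cores can be sketched in plain Python (a stand-in for a real graphics API; the function and list names here are invented for illustration):

```python
# Conceptual sketch of multi-threaded command buffer recording.
# Each worker thread records its own command list independently, the way
# each CPU core can record commands in parallel under DirectX 12.
from concurrent.futures import ThreadPoolExecutor

draw_items = [f"mesh_{i}" for i in range(1000)]

def record_chunk(chunk):
    # Recording is independent per chunk, so no cross-thread locking
    # is needed while the lists are being built.
    return [("draw", item) for item in chunk]

def chunks(items, n):
    step = (len(items) + n - 1) // n
    return [items[i:i + step] for i in range(0, len(items), step)]

with ThreadPoolExecutor(max_workers=8) as pool:
    command_lists = list(pool.map(record_chunk, chunks(draw_items, 8)))

# A single GPU queue then executes the finished lists in submission order.
total = sum(len(cl) for cl in command_lists)
print(f"{len(command_lists)} command lists, {total} draws recorded")
```

The serialization point moves from "one core builds everything" to "many cores build in parallel, then submit," which is where the draw-call headroom comes from.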

 

[Animation: command buffer behavior under DirectX® 11 vs. DirectX® 12]

The benefit of this feature is already being seen in real games. Oxide Games and Stardock have collaborated with AMD for Ashes of the Singularity™, an upcoming strategy game that already utilizes all 8 cores of an AMD FX-8370 processor to deliver performance, image quality and resolutions that—in the words of the developer’s CEO Brad Wardell—are “not even a possibility” under DirectX® 11.

 

[Image: Ashes of the Singularity™ benchmark results]

In other words, platforms with AMD Radeon™ GPUs and multi-core AMD CPUs running DirectX® 12 are allowing developers to explore game designs previously considered impossible.

 

WRAP-UP

Multi-threaded command buffer recording and async shaders are two big features of the base DirectX® 12 specification, each harboring great potential to extract significantly more performance and image quality out of existing hardware.

 

But many gamers also know that game devs must commit to using a feature before it is seen in the real world—we’re taking care of that. Our collaboration with developers like Oxide/Stardock (and others unannounced) to get cool tech into great games is a guiding light for the AMD Gaming Evolved Program, and we’re already seeing healthy interest in these features. That bodes well for everyone!

 

Before we part ways, you might be interested to know which AMD products are compatible with DirectX® 12. Presuming you’ve installed Windows® 10 Technical Preview Build 10041 (or later) and obtained the latest driver from Windows Update, here’s the list of DirectX® 12-ready AMD components. We think you’ll agree that it’s an excitingly diverse set of products!

 

  • AMD Radeon™ R9 Series graphics
  • AMD Radeon™ R7 Series graphics
  • AMD Radeon™ R5 240 graphics
  • AMD Radeon™ HD 8000 Series graphics for OEM systems (HD 8570 and up)
  • AMD Radeon™ HD 8000M Series graphics for notebooks
  • AMD Radeon™ HD 7000 Series graphics (HD 7730 and up)
  • AMD Radeon™ HD 7000M Series graphics for notebooks (HD 7730M and up)
  • AMD A4/A6/A8/A10-7000 Series APUs (codenamed “Kaveri”)
  • AMD A6/A8/A10 PRO-7000 Series APUs (codenamed “Kaveri”)
  • AMD E1/A4/A10 Micro-6000 Series APUs (codenamed “Mullins”)
  • AMD E1/E2/A4/A6/A8-6000 Series APUs (codenamed “Beema”)

UPDATE: The AMD Radeon™ R9 290X graphics card delivers higher DirectX® 12 performance than the GeForce GTX 980 in independent testing from PC Perspective!

 

ORIGINAL ARTICLE

Today I’m pleased to welcome the 3DMark® API Overhead Feature Test to the world! This powerful extension to the 3DMark® suite lets everyday users compare the performance of different graphics APIs—Mantle, DirectX® 12 and DirectX® 11—on their PC. The early results are very promising for AMD customers, with the promised performance benefits of DirectX® 12 on full display.

 

I understand that not everyone has a few hours to throw at these kinds of tests, however, so let’s jump right into a few data points I’ve collected to illustrate how big these performance jumps really are.

 

PERFORMANCE

First we’ll look at DirectX® 12’s raw ability to ramp GPU throughput, with higher throughput representing new opportunities to put image quality on screen for you. In the new 3DMark® test, DirectX® 12 delivers performance that’s 10-16X its predecessor on AMD Radeon™ R9 and R7 graphics hardware.

[Image: DirectX® 12 hardware efficiency results]

Next I wanted to show you what DirectX® 12 can do for the performance-per-watt of a PC. Using an AMD A-Series APU, the world’s best SoC for DirectX® 12, we see a performance per watt improvement of 511%. In other words, every watt of power consumption just accomplished 6X the work that it could under DirectX® 11.
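The "511% improvement equals 6X" phrasing is just percentage arithmetic, worth making explicit:

```python
# A "511% improvement" means the new result is (1 + 5.11) times the old one,
# which is where the article's "6X the work per watt" rounding comes from.
improvement_pct = 511
multiplier = 1 + improvement_pct / 100
print(f"{multiplier:.2f}x")  # 6.11x
```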

[Image: APU performance-per-watt results]

Finally, I wanted to show you just how much better DirectX® 12 is at using multi-core CPUs like the AMD FX-8350. This wildly improved use of such CPUs is due to a feature called multi-threaded command buffer recording, which finally opens a multi-core communication lane between your AMD FX processor and AMD Radeon™ GPU. The graph shows this very clearly: DirectX® 11 demonstrates no benefit beyond two cores, while DirectX® 12 sees an average uplift of 2.9 million draw calls with every CPU core added, up to six cores.
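The reported scaling can be turned into a simple projection. Note the single-core baseline below is invented purely to anchor the example; only the per-core uplift and the six-core knee come from the 3DMark data discussed above:

```python
# Projected draw-call throughput under the reported scaling:
# roughly +2.9 million draw calls per additional CPU core, up to 6 cores.
base_draw_calls = 1.0e6     # hypothetical single-core figure (assumption)
uplift_per_core = 2.9e6     # per-core uplift from the cited 3DMark data

def projected(cores):
    effective = min(cores, 6)           # scaling flattens after 6 cores
    return base_draw_calls + uplift_per_core * (effective - 1)

for cores in (1, 2, 4, 6, 8):
    print(f"{cores} cores: {projected(cores) / 1e6:.1f}M draw calls")
```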

 

For obvious reasons, multi-threaded command buffer recording is a defining feature in DirectX® 12 that will have a huge impact on the lives of gamers.

[Image: multi-threaded CPU scaling results]

THE ROAD TO OPTIMIZATION
The mind-boggling data I’ve collected from the 3DMark API Overhead Feature Test is testament to our passion for DirectX® 12 and its promise as a graphics API.

Our software developers are months into their work with companies like Futuremark® that want to make the most of our DirectX® 12-ready Graphics Core Next architecture. While such work is never truly finished, the early results are plainly impressive.

 

There are other factors at work, too! AMD has been working on “low-overhead” or “console-like” APIs for over three years. During that time, we’ve been working with top game developers to establish best practices for these APIs on AMD hardware. As a result, we expect game developers to have a head start in making their games work great on AMD hardware. By no coincidence, console game development also targets AMD hardware with a unique set of low-overhead graphics APIs.

 

Given that DirectX® 12 will be a transformative experience for millions of gamers, it’s important that hardware vendors like AMD have a 360-degree view of the issue. Thankfully, the pervasive nature of the GCN Architecture in the games industry highlights that AMD stands alone with that perspective.


Today’s extraordinary 3DMark® results show that we’re already putting it to good use.

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

FOOTNOTES:

  • Image #1: Core i7-4960X, Asus X79 Sabertooth, 16GB DDR3-1866, Windows® 10 Technical Preview 2 (Build 10041), AMD Catalyst™ driver 15.20.1012. DirectX® 11 multi-threaded vs. DirectX® 12 multi-threaded. 3840x2160 resolution.
  • Image #2: AMD A10-7850K, Asus A88X-Pro, 8GB DDR3-1866, Windows® 10 Technical Preview 2 (Build 10041), AMD Catalyst™ driver 15.20.1012. DirectX® 11 multi-threaded vs. DirectX® 12 multi-threaded. 1920x1080 resolution.
  • Image #3: AMD FX-8350, AMD Radeon™ R9 290X, Gigabyte 990FXA-UD5, 8GB DDR3-1866, Windows® 10 Technical Preview 2 (Build 10041), AMD Catalyst™ driver 15.20.1012. DirectX® 11 multi-threaded vs. DirectX® 12 multi-threaded. 3840x2160 resolution.

No stuttering. No tearing. No extra costs. Just smooth gaming. Those are pretty straightforward and reasonable requests from gamers, right? Today they become reality with our latest AMD Catalyst™ driver release. This is our first driver with AMD FreeSync™ technology enabled, and I’m happy to report that FreeSync technology-enabled monitors are shipping now or about to ship imminently. Should this be your first encounter with AMD FreeSync technology, please make sure you check this out first to learn how it works! You can also find more information on our website.

 

SPEAKING OF MONITORS

Below you’ll find a chart with all of the AMD FreeSync technology-compatible monitors announced to date. I’ve had the pleasure of playing around with a few of them, and they’re more than worth your consideration. You may prefer the Acer or BenQ’s 1440p models that have a wide refresh rate range (40-144Hz). Alternatively, proponents of IPS panels or ultra-wide aspect ratios would be keen to check out the 29” or 34” options from LG. And more monitors are on their way. Up to 20 monitors supporting AMD FreeSync technology are in the pipe for 2015, in fact!

 

Now, think back to when you saw your first HD video—it was difficult to be satisfied with standard-def content. It was for me, anyhow. That’s how I feel about gaming on AMD FreeSync technology. I always disliked tearing and stuttering, but I couldn’t do much about them with yesterday’s technologies. AMD FreeSync technology changes the game, fixing both tearing and stuttering with smooth gameplay at virtually any framerate. I can now dial up the detail without worrying about whether or not I’m sacrificing smoothness, and I find it difficult to game on normal monitors now.

 

[Image: AMD FreeSync™ technology-compatible monitors announced to date]

 

THE DEFINITION OF “FREE”

AMD FreeSync technology costs virtually nothing for a monitor manufacturer to adopt. Most of them already had the relevant components in their supply chains, but needed the right software to come along to expose latent capabilities. With the help of VESA, the DisplayPort Adaptive-Sync specification was born to do exactly that.

 

DisplayPort Adaptive-Sync has no unique material or licensing costs, and AMD FreeSync technology builds on top of that industry standard to give gamers a benefit in all of their games.

 

No licensing. No proprietary hardware. No incremental hardware costs. As some might say: “free as in beer.”

 

All of these savings are reflected in the price tags. Several of the displays announced by our technology partners are up to hundreds of dollars cheaper than comparable displays featuring our competitor’s dynamic refresh technology. Other displays, like the ones from LG, are actually cheaper this year with AMD FreeSync than comparable models were last year without it. This is the advantage of doing technologies the right way: as open standards with low barriers to entry. You’ve heard that from us time and time again, but it rings true with AMD FreeSync.

 

PERFORMANCE BENEFITS

Here’s another fun fact: our testing indicates that AMD FreeSync technology doesn’t incur any performance penalty. The competition can’t say the same. In fact, the competition remarked to AnandTech last year that enabling their technology costs you 1ms of latency—an average performance hit of 3-5%. AMD FreeSync technology is smarter than that. Our data suggests a modest performance gain with AMD FreeSync enabled, and that too is the advantage of taking the time to thoughtfully develop an industry standard.


[Image: AMD FreeSync™ performance comparison]

*footnote

 

FOR TWITCH FPS GAMERS

We heard you guys loud and clear: Vsync isn’t enough. You don’t want it because it limits framerates, and that limits opportunities for the freshest mouse data to reach your eyeballs. Call it what you will: mouse lag, input latency, whatever. With AMD FreeSync™ technology, we uniquely give you the opportunity to turn Vsync off when the framerate of the application leaves the dynamic refresh range supported by the monitor.

 

So, if you have one of those 144Hz BenQ or Acer displays, but you’re a Counter-Strike: Global Offensive player that wants to run at 240 FPS… you can! You still get beautifully smooth, tearing-free gameplay from 40-144Hz with those monitors, but you don’t have to sacrifice your input latency to get it when the framerate goes to 145+.
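How the monitor behaves at a given framerate can be sketched as a small decision function (a simplified model, not driver code; the mode strings are invented, and the range matches the 40-144Hz panels mentioned above):

```python
# Simplified model of FreeSync behavior vs. framerate on a 40-144Hz panel.
RANGE = (40, 144)

def display_mode(fps, vsync_above_range=False):
    lo, hi = RANGE
    if lo <= fps <= hi:
        # Inside the dynamic refresh range: the monitor tracks the GPU.
        return "adaptive refresh (smooth, tear-free)"
    if fps > hi:
        # Above the range, the gamer chooses: cap with Vsync, or run
        # uncapped for the freshest input at the cost of possible tearing.
        return "vsync capped at 144Hz" if vsync_above_range else "uncapped, vsync off"
    return "below range (driver-dependent handling)"

print(display_mode(90))         # adaptive refresh (smooth, tear-free)
print(display_mode(240))        # uncapped, vsync off
print(display_mode(240, True))  # vsync capped at 144Hz
```

The key point from the text is the middle branch: the choice above the refresh range belongs to the player, not the display.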

 

Below you can see a conceptual example of this relationship. In this theoretical exercise, the red line reflects framerates and input latency of an application Vsynced to 60Hz, and the blue line demonstrates the superior framerates and mouse latency of a game unrestricted by Vsync.  This is a hypothetical scenario, and you’ll want to tinker with your favorite game, but AMD FreeSync actually gives you the choice—the competition doesn’t.

 

[Image: conceptual framerate and input latency comparison with and without Vsync]


WRAP UP

AMD FreeSync technology is free of incremental hardware costs, free of performance penalties, free as a standard, and open for use by anyone in the gaming industry. The resulting unbelievably smooth framerates are I-can-never-go-back-to-the-old-way incredible for PC gaming.

 

It’s hard to go wrong. What monitor will you buy?

 


Robert Hallock is Technical Marketing Manager for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

 

 

Footnote:

In tests by AMD as of January 30, 2015, enabling AMD FreeSync™ technology on the AMD Radeon™ R9 290X, and G-sync on the NVIDIA GeForce GTX 780 had an average performance impact of +0.274% FPS (avg) and -1.447% FPS (avg), respectively, in Alien: Isolation™ (SMAA T1x), BioShock® Infinite, Tomb Raider™, Sniper Elite™ III (2.25x SSAA), and Thief™ (normal quality). All applications were evaluated at 2560x1440 with 8xAA and 16xAF unless otherwise noted. System configuration: i7-4770K CPU, MSI Z87 motherboard, 16GB memory, Windows 8.1 64-bit, AMD Catalyst™ 15.3 Beta, Nvidia 347.52 WHQL driver. G-sync monitor: ASUS ROG Swift PG278Q. AMD FreeSync™ technology monitor: BenQ XL2730Z.

Since the advent of Mantle, gamers widely believed that Mantle would become an industry-standard graphics API or, at the very least, inspire successors that would offer similarly powerful benefits to hardware beyond AMD Radeon™ graphics. Many hoped that Mantle would come to OSes beyond Windows, too. These voices weren’t wrong: those were our goals, too!

 

The recent arrival of those oh-so-inspired successors has subsequently honed this chatter to one question: “What does Mantle do now?” AMD has cryptically replied—with very good reason—that Mantle’s destiny is openness and coexistence. Today we’re ready to be clear on one aspect of what that means.

 

The cross-vendor Khronos Group has chosen the best and brightest parts of Mantle to serve as the foundation for “Vulkan,” the exciting next version of the storied OpenGL API.

 

WHAT THIS MEANS

OpenGL has long and deservedly commanded respect for being a fast, versatile and wide open API that works on all graphics vendors across multiple operating systems.

 

Meanwhile, Mantle has seen acclaim for many improvements in gaming and game development: higher framerates, reduced rendering latency, reduced GPU power consumption, better use of multi-core CPUs, and re-pioneering new features like split-frame rendering.

 

Vulkan combines and extensively iterates on these characteristics as one new and uniquely powerful graphics API. And as the product of an incredible collaboration between many industry hardware and software vendors, Vulkan paves the way for a renaissance in cross-platform and cross-vendor PC games with exceptional performance, image quality and features.

 

STAY TUNED FOR MORE INFO

“Open” and “flexible” technologies are an essential piece of AMD’s DNA, and we have a long history of supporting those ideals. Our co-development of the Vulkan API through contributions like Mantle is another chapter in that open technology tale for AMD, an exciting evolution of Mantle, and a big step forward for PC gamers everywhere.

 

Stay tuned for more information on the specifics of Vulkan from the Khronos Group! We’ll be working hard to make it a fascinating story in the meantime.

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

Great gameplay isn’t an accident—it’s built. Every chip is years in the making: the child of keen industry forecasting, of expert engineering, of collaboration with top game devs, and of the unrelenting thirst to win. When designing the AMD APUs, great gameplay was top of mind. But the “proof is in the pudding,” they say, and recent testing by AnandTech had all the proof anyone could need: the AMD A10 and A8-7000 Series APUs crushed the competition in head-to-head DirectX® 12 performance testing using StarSwarm by Oxide Games.

 

I encourage you to read the complete article, but let me summarize and digest the data for you:

  • Migrating from DirectX® 11 to DirectX® 12 yielded an average framerate improvement of 41.2% for the APUs tested by Anandtech. The competition’s average? Just 3.25%.
  • In batch submissions, measured by the time it takes to bundle and process large bodies of graphics work, AMD A-Series APUs were 41% faster at the job.
  • In fact, AMD A-Series APUs were 12.5x faster at processing a batch submission in DirectX® 12 as compared to DirectX® 11. The competition was only 10x faster in the same scenario.

 

These three data points reveal a great deal about the harmony between compatible AMD APUs and DirectX® 12. Not only were the AMD APUs faster than the competition in absolute framerates, they delivered more fidelity, did it more efficiently, and demonstrated greater benefit from the switch to DirectX® 12 than their opponent.

 

That’s a flawless victory for an extraordinary family of SoC designs that lie at the heart of world-class devices like the Xbox One™, PS4™, laptops, desktops, ultra-thins, HTPCs, arcade machines, airliners, and a breathtaking array of other devices.

 

Speaking of compatibility, all AMD APUs and GPUs based on the award-winning Graphics Core Next architecture are already DirectX® 12-compliant. Just install the Windows® 10 Technical Preview, grab the latest updates, and you’re ready to go!

 

Great DirectX® 12 performance is really that simple with AMD APUs.

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


MARCH 04, 2015: Did you know that the Khronos Group has selected Mantle to serve as the foundation for Vulkan, a low-overhead PC graphics API that works on multiple OSes and hardware vendors? Learn more!

 

AMD's Mantle Graphics API has gathered incredible momentum in its first year, gaining support from five advanced game engines and 10 premium applications.

 

Mantle has also revolutionized the industry’s thinking on low-overhead/high-throughput graphics APIs as solutions that do not compromise developer productivity. Compelling content was delivered on Mantle in record time, paving the way for various graphics standards bodies to move forward with conviction on their own similar API standards and specifications.

 

We are proud of these accomplishments, and we have been inspired by everything we have learned along the way. We also haven’t forgotten the promise we made: openness.

 

AMD is a company that fundamentally believes in technologies unfettered by restrictive contracts, licensing fees, vendor lock-ins or other arbitrary hurdles to solving the big challenges in graphics and computing. Mantle was destined to follow suit, and it does so today as we proudly announce that the 450-page programming guide and API reference for Mantle will be available this month (March, 2015) at www.amd.com/mantle.

 

This documentation will provide developers with a detailed look at the capabilities we’ve implemented and the design decisions we made, and we hope it will stimulate more discussion that leads to even better graphics API standards in the months and years ahead.

 

Proud moments also call for reflection, and today we are especially thoughtful about Mantle’s future. In the approaching era of DirectX® 12 and the Next-Generation OpenGL Initiative, AMD is helping to develop two incredibly powerful APIs that leverage many capabilities of the award-winning Graphics Core Next (GCN) Architecture.

 

AMD’s game development partners have similarly started to shift their focus, so it follows that 2015 will be a transitional year for Mantle. Our loyal customers are naturally curious about what this transition might entail, and we wanted to share some thoughts with you on where we will be taking Mantle next:


  1. AMD will continue to support our trusted partners that have committed to Mantle in future projects, like Battlefield™ Hardline, with all the resources at our disposal.
  2. Mantle’s definition of “open” must widen. It already has, in fact. This vital effort has replaced our intention to release a public Mantle SDK, and you will learn the facts on Thursday, March 5 at GDC 2015.
  3. Mantle must take on new capabilities and evolve beyond mastery of the draw call. It will continue to serve AMD as a graphics innovation platform available to select partners with custom needs.
    1. The Mantle SDK also remains available to partners who register in this co-development and evaluation program. However, if you are a developer interested in Mantle "1.0" functionality, we suggest that you focus your attention on DirectX® 12 or GLnext.

 

As an API born to tackle the big challenges in graphics, much of this evolution is already well under way. We invite you to join AMD this week at Game Developer Conference 2015 to see not just the future of Mantle, but the future of PC graphics itself.

 

Raja Koduri is Vice President of Visual and Perceptual Computing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

Great computing experiences don’t just happen—they’re AMD-enabled.

 

The latest major release of our drivers is available as a free upgrade for AMD customers. In addition to releasing new versions of the system software at regular intervals, AMD today released the AMD Catalyst™ Omega special edition software update, which includes enhancements to enrich the user experience.

 

Why? Today’s hardware and software have become highly interconnected and interdependent, dynamically interacting to shape a cohesive computing unit. This symbiotic relationship between hardware and software is vital to the ongoing evolution of future computing devices. New software becomes incorporated into an existing generation of hardware, enabling faster, more capable, and more reliable performance.

 

WHAT IS THE AMD CATALYST™ OMEGA DRIVER?

Last year alone, AMD Catalyst™ drivers were downloaded more than 80 million times — and we are thrilled that millions of customers are enjoying the benefits of our new software. Giving them something extra-special this time of year is the best way to thank them for their continuing support, and show our appreciation for being part of our AMD community.

 

Our software team has worked hard to enrich the user experience, and create a remarkable environment for developers by providing them the ability to create incredible new apps. The AMD Catalyst Omega driver was engineered to take full advantage of the advanced technologies built into AMD’s products that feature GCN Architecture, and help make them more powerful and capable.


Extra performance — no extra cost

Think of the last time a product you purchased actually improved over time. AMD Radeon™ graphics and AMD A-Series APUs featuring GCN architecture can get easy software upgrades that boost performance, enhance reliability, and help reduce heat and energy consumption. Installing the AMD Catalyst Omega driver on select AMD products enables free software upgrades that install automatically and can improve your gaming performance.

 

For example, early buyers of an AMD Radeon™ R9 290X GPU who download and install the AMD Catalyst Omega driver can realize up to 19% faster* gameplay on BioShock Infinite.

 

Similarly, users of AMD’s advanced APUs like the AMD A10 7850K can achieve up to 29% faster** gaming performance on Batman: Arkham Origins.

 

Great software brings out the best of great hardware

The AMD Catalyst Omega driver extracts the true potential of GCN-enabled AMD APUs and GPUs. Here are a few examples of new AMD Catalyst Omega driver capabilities:

  • Enabling the UltraHD revolution: UltraHD TVs and monitors are now available, and becoming much more affordable. UltraHD displays demand UltraHD content—but very little content or entertainment is being recorded in 4K at this time. The good news: we are offering built-in Ultra HD upsampling with frame rate conversion and HD detail enhancement that will convert 1080p videos to near UltraHD quality on 4K displays.

  • Perfect Picture UltraHD: Our Perfect Picture UltraHD technology strives for “pixel-perfect” images, with Compression Artifact Removal 2, and Frame Rate Conversion for Blu-ray Playback enabling pixel-by-pixel image processing.

  • What is better than even more powerful? Smoother: There are many reasons that make AMD APUs and AMD GPUs a match made in heaven. But one of the major ones is that one brings out the best in the other. When select products are paired together through Dual Graphics with frame pacing enhancements, the powerful gameplay becomes smooth.

 

Here are examples of AMD enabling developers to deliver outstanding user experiences:

 

  • OpenCL™ 2.0 Support: Enabling developers to extend the reach of their app content and functionality based on industry standards.

  • TressFX Hair 3.0: Introducing new gaming capabilities for game developers with TressFX, such as rendering of fur onto “skinned” geometries.

  • CodeXL Tools: A comprehensive tool suite for the performance-aware developer to debug, profile and analyze applications. Also included is a realtime display of APU power consumption that collects data on core frequency, temperature changes, and voltage and current levels.

 

Testing quantity delivers exceptional product quality

The benefits of “quality vs. quantity” are frequently debated — except when it comes to delivering an exceptional user experience, where the quality of a product heavily depends on the quantity of product testing. This is why every AMD Catalyst™ driver release undergoes exhaustive testing to uncover and fix hidden flaws and make the user experience as intuitive, reliable, and enjoyable as possible.

 

Testing the AMD Catalyst Omega driver required executing around 65% more automated and 10% more manual test-cases, utilizing 10% more varied system configurations, with 10% more different display makes and models.*** However, we did not stop there.

 

Our community managers asked six of the largest PC communities to share their candid feedback about our AMD Catalyst™ drivers, and report on the issues they discovered. Our dedicated QA teams worked on reproducing, debugging, and fixing these issues, setting the bar even higher with this latest driver release.


For all the aforementioned reasons, the AMD Catalyst Omega special edition driver is the biggest and the best software upgrade AMD has released this year. It’s our way of saying ‘Thank you’ and Happy Holidays.

 

Sasa Marinkovic is Head of Software Marketing for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

FOOTNOTES:

* Intel Core i7 4960X with 16GB DDR3-1866, AMD Radeon™ R9 290X Windows 8.1 64bit comparing launch driver 13.12 vs Driver 14.501.  Tests run at 3840x2160. BioShock Infinite @  ultra scored 30.47 vs 36.24 fps.

** AMD A10 7850K with R7 graphics, 2x4GB DDR3 2400, Windows 8.1 64bit comparing Catalyst 14.2 vs Driver 14.50. In Batman: Arkham Origins @ 1080P,  PHYSX=off GEOMETRYDETAIL=normal DYNAMICSHADOWS=normal MOTIONBLUR=off DOF=normal DISTORTION=off LENSFLARES=off LIGHTSHAFTS=off REFLECTIONS=off AO=normal we see an uplift from 34.96 fps to 45.2 fps.

*** Compared to previous driver release

NOTE: Gamers with Mantle-enabled AMD Radeon™ graphics cards or AMD APUs must have AMD Catalyst™ 14.9.2 Beta (or newer) installed in their system. The game will allow users to select Mantle at runtime. This driver is available here.

 

Friends, diplomats, would-be bureaucrats, today is a truly exciting day in the history of PC gaming: we Sid Meier’s Civilization® addicts have an all-new Civ game to play! Before you commit to one more turn and push your bed time back by five hours, please join us in exploring the day-one Mantle support in Sid Meier’s Civilization®: Beyond Earth™.1

 

A GAME THAT SCARCELY NEEDS AN INTRODUCTION

Sid Meier's Civilization: Beyond Earth is a new science-fiction-themed entry in the award-winning Civilization series. Set in the future, the game begins after global events have destabilized the world, leading to a collapse of modern society, a new world order and an uncertain future for humanity. As the human race struggles to recover, the re-developed nations focus their resources on deep space travel to chart a new beginning for mankind.

 

introbanner.jpg

As part of an expedition sent to find a home beyond Earth, you will write the next chapter for humanity as you lead your people into a new frontier and create a new civilization in space. Explore and colonize an alien planet, research new technologies, amass mighty armies, build incredible Wonders and shape the face of your new world. As you embark on your journey you must make critical decisions. From your choice of sponsor and the make-up of your colony, to the ultimate path you choose for your civilization, every decision opens up new possibilities.

 

AN AMD GAMING EVOLVED COLLABORATION

Firaxis Games and AMD have been in close collaboration on Sid Meier’s Civilization: Beyond Earth for many months, and indeed Firaxis has been an enthusiastic advocate and development partner for Mantle. Looking back at comments made by the studio in April, AMD Radeon™ customers definitely have cause for excitement:

 

“By reducing the CPU cost of rendering, Mantle will result in higher frame rates on CPU-limited systems. As a result, players with high-end GPUs will have a much crisper and smoother experience than they had before, because their machines will no longer be held back by the CPU. On GPU-limited systems, performance may not improve, but there will still be a considerable drop in power consumption. This is particularly important given that many of these systems are laptops and tablets. The reduced CPU usage also means that background tasks are much less likely to interfere with the game’s performance, in all cases.


Finally, the smallness and simplicity of the Mantle driver means that it will not only be more efficient, but also more robust. Over time, we expect the bug rate for Mantle to be lower than D3D or OpenGL. In the long run, we expect Mantle to drive the design of future graphics APIs, and by investing in it now, we are helping to create an environment which is more favorable to us and to our customers.”


These benefits should come as no surprise to gamers that have been following the history of Mantle, but they’ve been put to particularly good use in Civilization. Let’s dig in!

 

MANTLE IN SID MEIER’S CIVILIZATION: BEYOND EARTH

Mantle is a high-efficiency graphics interface (an “API”) that permits supporting software to leverage the complete capabilities of an AMD Radeon™ graphics card. Mantle does this by reducing software bottlenecks and widening the parallelization of a game’s renderer.

 

Akin to allowing more cars on the road with no additional congestion, Mantle’s design endows a PC with the power to process more simultaneous information. New rendering techniques, higher framerates, more fluid gameplay and superior visual fidelity are all possible with Mantle. AMD is over a year ahead of other graphics companies in delivering this kind of technology to its customers and development partners.

 

John Kloetzli, Firaxis Games’ Principal Graphics Programmer for Civilization: Beyond Earth, put it this way:

 

“If you play [Civilization: Beyond Earth] for 40 hours, you’ve built an enormous empire. There’s a huge amount going on, besides just these tactical battles. We do allow you to zoom out quite far.  […] When you back up, you see your whole empire at once. That’s demanding. That’s when the performance, typically, in PC strategy games begins to go down. This is exactly the situation wherein we’re incredibly excited about Mantle.”

 

We also asked John if Mantle was difficult or complicated to implement:

 

“There definitely is cost involved [for supporting Mantle]. It’s definitely not an API that’s going to hold your hand and it’s not for hobbyists, really. But Mantle is not a significant overhead for a professional graphics team to add to a game. In fact, I did most of the design and programming of the graphics features in [Civilization: Beyond Earth] myself, and I also found time to do the vast majority of the programming for our Mantle backend as well. We fit it in our production schedule, it didn’t push us back any, and we’ll release [Mantle] concurrently with the DirectX® 11 version.”

 

That sounds like a winning combination for gamers and developers. Let’s see how Firaxis put Mantle to use!

 

MANTLE SPLIT-FRAME RENDERING WITH AMD CROSSFIRE™ TECHNOLOGY

UPDATE: Firaxis Games has published additional commentary on split-frame rendering (SFR) in Mantle. You should give it a read!

 

With a traditional graphics API, multi-GPU arrays like AMD CrossFire™ are typically utilized with a rendering method called “alternate-frame rendering” (AFR). AFR renders odd frames on the first GPU, and even frames on the second GPU. Parallelizing a game’s workload across two GPUs working in tandem has obvious performance benefits.
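As a rough illustration (a hypothetical sketch, not engine code), AFR's round-robin frame distribution can be expressed in a few lines:

```python
# Hypothetical sketch of alternate-frame rendering (AFR).
# Frames are distributed round-robin across the GPUs in the array;
# with two GPUs, even frames land on GPU 0 and odd frames on GPU 1.

def afr_assign(frame_count, gpu_count=2):
    """Return a list mapping each frame index to the GPU that renders it."""
    return [frame % gpu_count for frame in range(frame_count)]

# With two GPUs, frames 0, 2, 4... go to GPU 0 and frames 1, 3, 5... to GPU 1.
schedule = afr_assign(6)
print(schedule)  # [0, 1, 0, 1, 0, 1]
```

Because each GPU works on a whole frame by itself, several frames must be in flight at once for both GPUs to stay busy, which is exactly where the queue-depth issues described below come from.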

 

As AFR requires frames to be rendered in advance, this approach can occasionally suffer from some issues:

  • Large queue depths can reduce the responsiveness of the user’s mouse input
  • The game’s design might not accommodate a queue sufficient for good mGPU scaling
  • Predicted frames in the queue may not be useful to the current state of the user’s movement or camera

 

Thankfully, AFR is not the only approach to multi-GPU. Mantle empowers game developers with full control of a multi-GPU array and the ability to create or implement unique mGPU solutions that fit the needs of the game engine.

 

In Civilization: Beyond Earth, Firaxis designed a “split-frame rendering” (SFR) subsystem. SFR divides each frame of a scene into proportional sections and assigns a rendering slice to each GPU in an AMD CrossFire™ configuration.2 The “master” GPU receives the completed work of each GPU and composites the final scene for the user to see on his or her monitor.
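To make the idea concrete, here is a hypothetical sketch (not Firaxis code) of how a frame might be cut into one slice per GPU, with the master GPU compositing the finished slices:

```python
# Hypothetical sketch of split-frame rendering (SFR): each frame is cut into
# horizontal slices, one per GPU, and a "master" GPU composites the finished
# slices into the final image. Equal proportions are assumed here; a real
# engine could weight slices by each GPU's measured performance.

def sfr_slices(frame_height, gpu_count):
    """Return (start_row, end_row) slices, one per GPU, covering the frame."""
    base, extra = divmod(frame_height, gpu_count)
    slices, row = [], 0
    for gpu in range(gpu_count):
        height = base + (1 if gpu < extra else 0)
        slices.append((row, row + height))
        row += height
    return slices

# A 1080-row frame split across two GPUs: each renders 540 rows of the SAME frame,
# so no frames need to be queued ahead of the player's input.
print(sfr_slices(1080, 2))  # [(0, 540), (540, 1080)]
```

The key contrast with AFR is that both GPUs cooperate on the current frame, so responsiveness does not depend on a deep frame queue.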

 

ESSENTIAL READING: How does split frame rendering work in Civilization: Beyond Earth?

 

As you can probably surmise, SFR requires high parallelization, efficient inter-GPU communication, and reliable delivery of slices to the master GPU. AMD Radeon™ graphics cards running Mantle are uniquely equipped to meet those requirements.

 

NOTE: Sid Meier’s Civilization®: Beyond Earth™ presently supports a maximum of two graphics cards. To try mGPU on Mantle for yourself, navigate to %homepath%\Documents\my games\Sid Meier's Civilization Beyond Earth\ in "My Computer." Open the GraphicsSettings.ini file and set "Enable MGPU=1".


MANTLE MULTI-THREADED COMMAND BUFFER SUBMISSION

As Mantle rises to meet the parallelization requirements of SFR, Mantle also supercharges Beyond Earth’s ability to utilize a gamer’s multi-core CPU.

 

In computer graphics, a “command buffer” is a type of memory buffer containing instructions (or “commands”) that the GPU will execute to carry out required rendering workloads. Feeding the GPU with a continuous, uninterrupted flow of commands is essential to keeping the whole graphics card at high utilization. High utilization can yield higher framerates and/or higher image quality, depending on the focus of the game developer.

 

CivBeyondEarth07.jpg

 

Mantle is remarkable in its ability to spread a game engine’s command buffer submissions across multiple CPU cores, ultimately allowing for a wider stream of graphics work to be processed and queued to the GPU.
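The shape of that idea can be sketched in ordinary Python threads (a hypothetical illustration; `record_commands` is an invented stand-in, not a Mantle or engine API):

```python
# Hypothetical sketch of multi-threaded command buffer recording: worker
# threads record independent command lists in parallel, and the results are
# submitted in a deterministic order, as a single wide stream of GPU work.

from concurrent.futures import ThreadPoolExecutor

def record_commands(chunk):
    """Stand-in for recording draw commands for one slice of the scene."""
    return [f"draw({obj})" for obj in chunk]

def build_frame(scene_objects, worker_count=4):
    # Split the scene across workers; each records its own command buffer.
    chunks = [scene_objects[i::worker_count] for i in range(worker_count)]
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        buffers = list(pool.map(record_commands, chunks))
    # Submission order stays deterministic even though recording was parallel.
    return [cmd for buf in buffers for cmd in buf]

frame = build_frame([f"mesh{i}" for i in range(8)], worker_count=2)
print(len(frame))  # 8 commands, recorded on 2 threads
```

The recording work, which previously serialized on one CPU core, now spreads across all available cores; only the final submission remains ordered.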

 

In the case of Sid Meier’s Civilization: Beyond Earth, you’ll see later in this blog that this wide communication lane to the AMD Radeon™ GPU is used to sustain higher overall framerates when empires get large and detailed in the late game.

 

EQAA in Mantle

Aliasing, the nasty “jaggies” on the edges of 3D objects in a PC game, is the bane of gamers everywhere. Aliasing occurs when a sharp edge is rendered to a monitor whose pixel density is too low to properly express a smooth line.

 

There are many types of anti-aliasing designed to combat this unwanted phenomenon, and the majority of them fall into a category known as “multisample anti-aliasing,” or MSAA. As the name implies, MSAA relies on “samples”: tests a graphics card performs to determine whether or not a pixel on your monitor is occupied by one or more objects from the game world. If a pixel is covered by more than one triangle, the final contents/color of that pixel will be a blend of the information covering that pixel, producing a smoother edge.

 

Games and GPUs can cooperate to increase the number of samples taken for each pixel, and these samples may test for color or coverage. Higher coverage sampling improves the accuracy of detecting whether or not an object occupies the pixel; higher color sampling improves the blending between samples confirmed to be occupied. Gamers increase the sample rate by choosing 2x, 4x or 8x MSAA, causing every pixel to be tested for color and coverage in two, four or eight locations.

 

LEARN MORE: A Quick Overview of MSAA

 

Like MSAA, AMD’s Enhanced Quality Anti-Aliasing (EQAA) also comes in 2x, 4x and 8x sampling modes, but each EQAA mode takes twice as many coverage samples as MSAA. Increased coverage testing allows the GPU to more accurately detect objects within a pixel, potentially allowing EQAA to detect and smooth a hard edge that might have been missed with fewer samples. Coverage samples are computationally cheaper than color samples, so EQAA proves to be a good compromise between quality and performance.
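The sample-count relationship described above can be summarized in a tiny sketch (illustrative only; real driver mode tables vary by hardware):

```python
# Hypothetical sketch of the per-pixel sample counts described above:
# an NxMSAA mode takes N color and N coverage samples per pixel, while the
# matching EQAA mode keeps N color samples but doubles the coverage samples.

def msaa_samples(mode):
    return {"color": mode, "coverage": mode}

def eqaa_samples(mode):
    return {"color": mode, "coverage": mode * 2}

for mode in (2, 4, 8):
    print(f"{mode}x  MSAA: {msaa_samples(mode)}  EQAA: {eqaa_samples(mode)}")
# e.g. at 4x, EQAA still blends 4 color samples but tests coverage in 8 spots
```

Since coverage samples are cheap relative to color samples, doubling only the coverage count buys extra edge-detection accuracy at little performance cost.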

 

EQAA_samples.png

 

Civilization: Beyond Earth automatically enables EQAA in Mantle (and DirectX®!) on supporting AMD Radeon™ GPUs when the user chooses to enable the in-game anti-aliasing options.

 

Customers with older GPUs that lack hardware support for Mantle can still take advantage of EQAA through the AMD Catalyst™ graphics driver. Simply enable 2x, 4x or 8x MSAA in the options menu of your favorite game (if supported), and ensure you have “enhance application settings” selected in the 3D Application Settings tab of AMD Catalyst™ Control Center.

 

SINGLE-GPU PERFORMANCE

Throughout this blog you’ve learned how Mantle can be used to enable great multi-GPU responsiveness, superior CPU multi-threading and smooth anti-aliasing. But thousands of customers effectively tell us every day that single-GPU performance matters more than anything – by owning single-GPU systems!

 

Our collaboration with Firaxis Games to integrate Mantle with Civilization: Beyond Earth is a landmark technical achievement that proves we’re listening. Across every GPU comparison we tested, AMD Radeon™ graphics cards with Mantle delivered the best performance. In fact, the AMD Radeon™ R9 290X 8GB is the fastest single-GPU graphics card on the planet. If you want to play Civilization: Beyond Earth, it doesn’t get any simpler than that.3

CivBE_4k_Ultra_8xAA.png

CivBE_1440p_Ultra_8xAA.png

CivBE_1080p_High_4xAA.png

 

WRAP-UP

AMD and Firaxis Games have worked together for months, not only to equip Civilization: Beyond Earth with a Mantle-based renderer, but to refine the Mantle specification with the features that Firaxis wanted to see. Hundreds of collaborative man hours are coming together for AMD Radeon™ customers at this very moment, and the results speak for themselves: fast, beautiful, efficient performance for Sid Meier’s Civilization: Beyond Earth.

 

That is the power of the AMD Gaming Evolved Program. We hope you enjoy one more turn!

 


Sid Meier's Civilization: Beyond Earth is a technology partner in the AMD Gaming Evolved program. Robert Hallock does Technical Communications for Desktop Graphics at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

 

  1. Mantle application support is required.
  2. AMD CrossFire™ technology requires an AMD CrossFire Ready motherboard and may require a specialized power supply and AMD CrossFire Bridge Interconnect. Check with your component or system manufacturer for specific model capabilities.
  3. In Sid Meier’s Civilization®: Beyond Earth™ internal benchmark test at 3840x2160, the AMD Radeon™ R9 290X 8GB with Mantle outperforms the GeForce GTX 980 with DirectX® 11, NVIDIA’s highest-performing single-GPU graphics card as of October 20, 2014, by 45.38 average FPS to 44.89 average FPS using the Ultra in-game preset with 8xAA. Test system: Intel Core i7-4960X, 16GB DDR3-1866, Asus SABERTOOTH X79, Windows 8.1 x64, AMD Catalyst™ 14.9.2 Beta and ForceWare 344.16 WHQL.

Alien: Isolation™ hits the streets today promising to test your fortitude for playing in the dark. While you’re busy skulking through Alien-infested corridors, no doubt hiding from those crazy telescoping jaws and a river of acid spit, have a pause to admire the world around you. That world is jam-packed with truly state-of-the-art rendering technology. Today we’ll be exploring how AMD and The Creative Assembly utilized the resources of the AMD Gaming Evolved program to develop and optimize those technologies for DirectX® 11-ready AMD Radeon™ graphics cards.


NERD WARNING: Serious tech talk ahead! PC graphics junkies are in for a treat, but we’re going into exhaustive detail. Buckle up!

 

BUT FIRST, A LITTLE ABOUT THE GAME

Discover the true meaning of fear in Alien: Isolation, a survival horror set in an atmosphere of constant dread and mortal danger. Fifteen years after the events of Alien™, Ellen Ripley's daughter, Amanda, enters a desperate battle for survival on a mission to unravel the truth behind her mother's disappearance.

As Amanda, you will navigate through an increasingly volatile world as you find yourself confronted on all sides by a panicked, desperate population and an unpredictable, ruthless Alien.

 

Underpowered and underprepared, you must scavenge resources, improvise solutions and use your wits, not just to succeed in your mission, but to simply stay alive.

 

Want to see more of Alien: Isolation™? More killer videos are right over here.


IT’S BEAUTIFUL

PC gamers are in for a treat when they dial up the settings of Alien: Isolation. The game’s engine is all-new, written from the ground up to provide all of the advanced effects discussed in this blog, and both console and PC performance envelopes were specifically targeted to provide a unique, highly-optimized experience on any system on which Alien: Isolation can be played.

 

ILLUMINATING THE SEVASTOPOL

To achieve the dramatic lighting effects on the Sevastopol, a setting in Alien: Isolation, a “deferred renderer” lies at the heart of its engine. This kind of renderer draws the entire scene visible to the player in a single pass, storing all of the properties required for beautiful lighting (e.g. positions and materials) in a “G-Buffer.” The lighting calculations can then be deferred until after the scene geometry is rendered, which makes the processing effort of lighting proportional only to the lighting complexity rather than lighting and geometry complexity. In short, the deferred renderer allows artists to place hundreds of dynamic lights in the scene and achieve great geometric detail simultaneously.
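A back-of-the-envelope sketch (hypothetical cost units, not real profiling data) shows why this scaling matters:

```python
# Hypothetical cost model for forward vs. deferred shading. A naive
# multi-pass forward renderer re-shades geometry once per light, so its
# cost grows like geometry x lights; a deferred renderer pays for geometry
# once (the G-buffer pass) plus a lighting cost proportional to lights alone.

def forward_cost(geometry_units, light_count):
    return geometry_units * light_count

def deferred_cost(geometry_units, light_count, lighting_unit=1):
    return geometry_units + light_count * lighting_unit

# With 1,000 geometry units and 200 dynamic lights, the deferred path does
# dramatically less work than re-shading all geometry for every light.
print(forward_cost(1000, 200))   # 200000
print(deferred_cost(1000, 200))  # 1200
```

The exact constants are invented, but the shape of the curves is the point: hundreds of lights stop multiplying against scene complexity.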

 

ai_gbuffer.png
TOP LEFT: Albedo, TOP RIGHT: Normal mapping, LOWER LEFT: Shininess, LOWER RIGHT: Fully-lit scene!

 

But the benefits of a deferred renderer are matched by some drawbacks. Foremost: limited support for diverse material types (e.g. metal, cloth, wood, skin, hair, etc.) and proper illumination of semi-transparent objects.

 

Classically, diverse material types must be rendered as a separate pass after the deferred lighting—a performance penalty. Alternatively, diverse materials can be treated with a grossly simplified physical model that doesn’t effectively simulate the true properties of those materials. Can you avoid sacrificing performance and/or quality if you want good lighting and realistic materials? Alien: Isolation proves that you can.

 

Alien: Isolation circumvents the materials issue through novel use of the GPU’s stencil buffer to tag the objects that use a unique material in the scene. The lighting/material interaction for each unique material type is rendered using a classic multi-pass technique, with the unique exception that the engine also tests the visibility of each material within the player’s field of view. Unseen materials are rejected in the graphics pipeline to avoid paying the rendering penalty typically associated with the multi-pass lighting we mentioned above.
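In spirit, the stencil-tagging trick looks something like this hypothetical sketch (the function names and flat pixel list are invented for illustration):

```python
# Hypothetical sketch of stencil-tagged material passes: each pixel in a
# stencil buffer records a material ID, and each material's lighting pass
# only touches pixels carrying its own tag. A material with no tagged
# pixels this frame is skipped entirely, so unseen materials cost nothing.

def shade_material_passes(stencil, material_shaders):
    """stencil: flat list of per-pixel material IDs.
    material_shaders: {material_id: shader_fn}. Returns shaded pixels."""
    output = [None] * len(stencil)
    for mat_id, shader in material_shaders.items():
        if mat_id not in stencil:
            continue  # material not visible this frame: skip its whole pass
        for i, tag in enumerate(stencil):
            if tag == mat_id:
                output[i] = shader(i)
    return output

shaders = {1: lambda i: "metal", 2: lambda i: "cloth", 3: lambda i: "skin"}
result = shade_material_passes([1, 1, 2, 2, 1], shaders)
print(result)  # the "skin" pass never ran: no pixel carried tag 3
```

A real GPU performs the tag test in fixed-function stencil hardware rather than a loop, which is what makes the rejection nearly free.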

 

And where semi-transparent objects are usually difficult for a deferred renderer, Alien: Isolation works around this as well. Only solid/opaque geometry can be rendered into the engine’s G-buffer, which means the semi-transparent geometry is normally rendered after the scene is composed using a reduced number of lights to conserve performance. The Creative Assembly’s solution is to dynamically generate a light map for each semi-transparent object. The light map is populated on-the-fly with the lighting data from the G-buffer, meaning translucent objects receive correct lighting regardless of scene complexity.

 

More technical details behind The Creative Assembly’s brilliant lighting model can be found in this presentation.

 

REAL-TIME RADIOSITY IN DIRECTCOMPUTE

Lighting an in-game world with direct sources like lamps and sunlight is not enough to achieve believable or realistic lighting. Here in the real world, rays of light bounce off of all kinds of reflective surfaces and scatter light into the surrounding area; those light rays continue to bounce around the room until all the energy from the rays has been absorbed. That bouncing and reflectivity is called “radiosity.”

 

Radiosity is an insanely difficult problem to solve in real-time graphics, and most games only fake it by using some form of full-scene ambient lighting. “Approximation” was not good enough for The Creative Assembly, who developed a full real-time radiosity engine for Alien: Isolation.

 

At the highest level, Alien: Isolation’s engine is constantly updating the radiosity model for the entire scene. This is achieved by placing a set of invisible “light probes” throughout the scene. Using Microsoft’s DirectCompute, these probes process how much light they are receiving from the lighting coming out of the deferred renderer. Lighting contributions from emissive surfaces, like computer screens and LED signs, are added to the data processed by the probe and combined with indirect (reflected) lighting coming from the previously-rendered frame. To light fixed or static objects in the scene visible to the player, the light probe data is crunched into lightmaps, applied to the geometry and rendered out.
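The frame-to-frame feedback loop described above, where each frame folds in indirect light from the previous frame, can be sketched as a simple iteration (hypothetical numbers; real probes gather directional, colored light):

```python
# Hypothetical sketch of iterative radiosity via light probes: each frame, a
# probe's gathered light is the direct lighting plus a fraction of last
# frame's total that returned as bounced (indirect) light. Over successive
# frames the total converges toward direct / (1 - bounce_fraction), the sum
# of the infinite series of ever-dimmer bounces.

def update_probe(direct, previous_total, bounce_fraction=0.3):
    return direct + bounce_fraction * previous_total

total = 0.0
for _ in range(50):  # simulate 50 rendered frames
    total = update_probe(direct=1.0, previous_total=total)

print(round(total, 4))  # settles near 1.0 / (1 - 0.3) ≈ 1.4286
```

Reusing the previous frame's result is what makes "infinite" bounces affordable in real time: each frame only has to add one more bounce.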

 

ai_radiosity2.png

LEFT: The radiosity lightmaps, RIGHT: The world lit only with lightmap data. Notice how precise the real-time lighting is.

 

For the dynamic objects in the world, such as characters and particle effects, the light probes are used to generate radiosity cubemaps via DirectCompute.

 

Finally, the use of DirectCompute for AMD Radeon™ graphics customers is especially important, as the award-winning Graphics Core Next (GCN) architecture was specifically designed with such “general purpose” languages in mind. Though that general-purpose capability was originally intended for non-gaming scenarios, modern game engines have made great use of DirectCompute to quickly crunch highly-parallelized data. Awesome!

 

ai_radiosity.png
LEFT: Full engine render with radiosity disabled, RIGHT: Render with radiosity enabled. Notice the more subtle lighting throughout the scene, which fully accommodates reflections from metallic surfaces.

 

HIGH DEFINITION AMBIENT OCCLUSION+ (HDAO+)

To complement Alien: Isolation’s dynamic lighting and real-time radiosity, the renderer also uses HDAO+ (an AMD-developed technique) to calculate the shadows that form where light reaches cracks and crevices throughout the scene. HDAO+ uses DirectCompute (good for AMD Radeon™ graphics!) to calculate the size and strength of these shadows, working from the information in the G-buffer and computing at multiple resolutions to help achieve the best balance of quality and performance.

 

ai_hdao.png
TOP LEFT: HDAO+ disabled, TOP RIGHT: HDAO+ enabled, LOWER LEFT: All the shadows that would never get rendered without HDAO+.

 

BETTER TEXTURES IN THE YEAR 2137

Texture compression is essential for good performance in content-heavy games. With texture compression, developers can cram more textures into a scene without overloading the GPU’s framebuffers or exhausting memory bandwidth while loading those textures into VRAM.

 

The industry has long relied on “DXT” compression which compresses each 4x4 block of pixels from the original image into a data set that’s one quarter to one eighth the size. These textures can be decompressed on the fly with dedicated capabilities in AMD Radeon™ graphics hardware.
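The compression arithmetic is easy to verify with a quick sketch (block sizes are the standard ones for these formats; the loop itself is just illustration):

```python
# Sketch of the block-compression arithmetic: a 4x4 block of 32-bit RGBA
# pixels is 64 bytes uncompressed; BC1 packs it into 8 bytes (one eighth)
# while BC3 and BC7 pack it into 16 bytes (one quarter).

BLOCK_PIXELS = 4 * 4
UNCOMPRESSED = BLOCK_PIXELS * 4        # 4 bytes per RGBA8 pixel = 64 bytes
BLOCK_BYTES = {"BC1": 8, "BC3": 16, "BC7": 16}

for fmt, size in BLOCK_BYTES.items():
    print(f"{fmt}: {UNCOMPRESSED // size}:1 compression")
# BC3 and BC7 share the same 4:1 ratio; BC7 simply spends its 16 bytes more
# cleverly, which is why its artifacts are far less visible.
```

That last point is the punchline of the comparison below: BC7 buys its quality win through smarter encoding, not through a larger memory footprint.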

 

The problem with compressing textures is that the compression scheme introduces artifacts. You’ve seen pixelated and blocky JPEG files, and DXT artifacts are not dissimilar. The Abs Error column below isolates these errors, with more color indicating a greater quantity of artifacts.

 

DirectX® 11 introduced a better, more complex compression scheme called “BC7” that still compresses to a quarter of the size of the original image but significantly reduces the artifacts normally associated with older DXT methods like BC3. AMD Radeon™ graphics hardware is ready for DirectX® 11.2, meaning AMD Radeon™ gamers will have access to the BC7-compressed texture pack for superior texture fidelity.

 

DXTC.png
The high artifact rate depicted in the BC3 abs error column would be seen as fuzzy or blocky textures by the player. The low abs error rate of BC7 texture compression preserves performance and quality for AMD Radeon™ graphics users.

 

LURKING IN THE SHADOWS

Realistic shadowing is an essential ingredient of Alien: Isolation’s creepy atmosphere. To make these shadows as realistic as possible, The Creative Assembly team tapped AMD’s “contact hardening shadow” technology. This technique dynamically hardens or softens a shadow’s edges depending on the distance of the shadow from the light source and the object casting it.

 

While shadowing techniques are incredibly efficient on the Graphics Core Next (GCN) architecture in contemporary AMD Radeon™ graphics products, this technique nevertheless demands a powerful GPU and can only be enabled with the “ultra” in-game graphics preset.

 

ai_chs.png
LEFT: Contact Hardening Shadows disabled, RIGHT: Contact Hardening Shadows enabled. Notice that the shadows are softer and more realistically diffuse with this effect enabled.

 

GPU-ACCELERATED PARTICLES

The particle effects in Alien: Isolation breathe life into the eerie setting of the Sevastopol. From fire and smoke effects, to the streams of sparks generated by Ripley’s blow torch, an efficient way to simulate the thousands of simultaneous particles is to run a physical simulation on an AMD Radeon™ GPU.

 

The different characteristics of these particle types are artist-controlled using parameters baked into the metadata of a texture. Particles can be affected by velocity fields and bounced off the scene geometry by reading data out of the G-buffer. When it's time to render for the player, the particle physics are GPU-accelerated with DirectCompute on AMD Radeon™ graphics cards!

 

ai_particles.jpg

Affected by thermoclines and world geometry, embers soar into the sky backed by a real physics simulation calculated on an AMD Radeon™ graphics card.

 

SMOOTHIN’ THOSE SURFACES

Throughout Alien: Isolation, the Graphics Core Next architecture’s prowess with geometry tessellation is put to excellent use with silhouette-enhancing tessellation. This kind of tessellation smartly adds detail to a scene by dynamically increasing geometric complexity only on the edges of objects visible to the player. This calculated exercise of tessellation improves details on pipes, padding and alien hives without wasting GPU cycles on invisible work.
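One common way to detect a silhouette is to compare the surface normal with the view direction; their dot product approaches zero at grazing angles. The sketch below is purely hypothetical (the threshold and maximum factor are invented), but it captures the "spend triangles where they're visible" idea:

```python
# Hypothetical sketch of silhouette-based tessellation: an edge is near a
# silhouette when the surface normal is nearly perpendicular to the view
# direction (|dot(normal, view)| ~ 0), so we subdivide more heavily there
# and leave camera-facing surfaces coarse. Factors here are illustrative.

def tess_factor(normal_dot_view, max_factor=16):
    """More subdivision near silhouettes, less when facing the camera."""
    silhouette_weight = 1.0 - abs(normal_dot_view)  # 1 at silhouette, 0 head-on
    return max(1, round(max_factor * silhouette_weight))

print(tess_factor(0.0))   # 16: full detail on the silhouette edge
print(tess_factor(0.95))  # 1: facing the camera, extra triangles are wasted
```

In a real pipeline this decision runs per patch in a hull shader, but the payoff is the same: dense geometry exactly where the eye can see the curve.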

 

ai_tessellation.png

TOP LEFT: Tessellation disabled, TOP RIGHT: Tessellation enabled, LOWER LEFT: Tessellation disabled (wireframe), LOWER RIGHT: Tessellation enabled (wireframe). Notice the increased geometric complexity and detail.


PERFORMANCE

Now that you’ve seen how AMD and The Creative Assembly collaborated to implement a host of AMD Radeon™ graphics-optimized effects in this stellar new game engine, let’s see how it performs! We’ll let the charts speak for themselves—AMD dominates!

 

ai_perf2.png

ai_perf1.png

ai_perf3.png

 

WRAP-UP

When you’re done messing your knickers and fleeing from Aliens, stop to appreciate what’s around you:

  1. Unique PC effects for everyone to enjoy;
  2. and AMD Radeon™ graphics-optimized performance for AMD customers.

Those are our top missions in the AMD Gaming Evolved program, and we’re proud to support developers, like The Creative Assembly, who are equally passionate about PC gaming.

 

Speaking of AMD Gaming Evolved, you may have heard of our new Never Settle: Space Edition promotion. Never Settle: Space Edition leverages the AMD Gaming Evolved partnerships we have with developers like The Creative Assembly to give you complimentary codes for games, like Alien: Isolation, with the purchase of an eligible AMD Radeon™ R9 Series GPU from a participating retailer.

 

Get AMD Radeon™ graphics and get your game on!

 


Alien: Isolation is a technology partner in the AMD Gaming Evolved program. Robert Hallock does Technical Communications for Desktop Graphics at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

NOTE: Sniper Elite III is available on Steam or in the Never Settle: Space Edition promotion with support for the Mantle graphics API starting today! This blog was authored by Kevin Floyer-Lea, Head of Programming, Rebellion Developments. It has been reprinted with permission.


Over the last few months at Rebellion we've taken our in-house Asura engine used in Sniper Elite 3 and added support for AMD's Mantle API. Our Head of Programming Kevin Floyer-Lea brings us up-to-date with the story so far...

 

WHY MANTLE?

 

The primary goal of Mantle is to provide a low-level interface that allows applications to speak directly to AMD's "Graphics Core Next" family of GPUs - greatly reducing the CPU overhead of translating commands for the GPU. With more traditional APIs like DirectX 11 there is often a disconnect between how costly a developer thinks (hopes!) an API call will be, and how much work the driver actually ends up doing underneath.

 

In simple terms the expected CPU gains of Mantle should be twofold. Firstly, making a command stream for the GPU should be less work on the CPU - and without any "surprises" or mysterious stalls. Secondly, the making of command streams can be entirely multithreaded. The native support of multithreading is perhaps one of the most important features from Rebellion’s point of view - while Microsoft had made some attempts at supporting multithreading with DX11 it was fundamentally limited by the single-threaded design choices of the previous versions.

 

Furthermore, with Mantle the developer gains access to things that drivers typically hide away - like the GPU's dedicated memory. This brings the PC closer to console programming, where developers are used to having direct control over available resources and squeezing the most out of the hardware.

 

It was these aspects which drew us to supporting Mantle - we'd long wished for the sort of control we had on console on our PC titles, and it was clear that whatever else may happen with Mantle in the future, it's most definitely kick-started a move to more lightweight APIs as we've seen with recent announcements concerning Microsoft’s DirectX 12, Apple's Metal, and Khronos’ Next Generation OpenGL Initiative.

 

OUR AIMS

 

Our main goal for supporting Mantle was to take maximum advantage of the potential for multithreading the API calls, and refactor our existing engine rendering pipeline to better fit what we predict are the requirements of this new breed of lightweight APIs. In that respect we spent more time restructuring our engine's rendering architecture than we did writing Mantle-specific code!

 

It was also important that we reused exactly the same data and assets as the (already shipped!) DX11 version of Sniper Elite 3 - so we wouldn't be optimising any shaders, data formats or rendering techniques at this stage - we'd just be shipping a new executable and reusing the same assets. This was primarily done to reduce cost and risk - but in hindsight it makes us a fairly unbiased test case between the two APIs.

 

What we have now is a fairly preliminary implementation in many respects - as Asura is a fully cross-platform engine designed to work on multiple platforms simultaneously, we aim to build upon this work to make a more independent code layer which sits over multiple low-level APIs as they become available.

 

EARLY RESULTS

 

For our first comparison let’s look at the beginning of the “Siwa” level of Sniper Elite 3, which is one of the more graphically demanding start positions in the game as it encompasses lots of layered scenery and vegetation stretching off to the old city complex in the distance. Half-hidden in the scene are dozens of people and some vehicles which the culling system can’t remove because they are actually visible – just not that obvious. Gameplay hasn’t really kicked off yet so the rest of the engine’s systems are idling along; rendering is the biggest CPU hit here.

 

se3.jpg

Below is what Task Manager reports if we just sit at the start position for 60 seconds. This is using an Intel i7-3770K CPU with 8 logical processors, coupled with an AMD R9 290X GPU, running on Ultra settings at a resolution of 1920x1200 – so we’d expect to be GPU bound in this scenario.

 

se3cpudata.jpg

 

The Mantle version clearly shows a much more balanced CPU load across the cores – though the total CPU utilisation has only dropped from 23% on DirectX 11 to 21% on Mantle. The more balanced load is exactly as we’d hoped, since all the Mantle API calls are now distributed across the available cores by our Asura engine’s multithreaded task system, just like we do for other systems like AI, animation or physics.
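As a rough illustration of the idea (this is not Asura’s actual code – the names and structure are invented), a frame’s API-call recording can be split into independent batches and fanned out across a worker pool:

```python
from concurrent.futures import ThreadPoolExecutor

def record_command_list(batch):
    # Stand-in for recording real API calls into a command list.
    return [f"draw({obj})" for obj in batch]

def record_frame(draw_calls, workers=4):
    # Split the frame's draw calls into contiguous batches, one per worker.
    size = (len(draw_calls) + workers - 1) // workers
    batches = [draw_calls[i:i + size] for i in range(0, len(draw_calls), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker records its own command list in parallel; the
        # finished lists are then submitted to the GPU queue in order.
        return list(pool.map(record_command_list, batches))

command_lists = record_frame([f"mesh_{i}" for i in range(16)], workers=4)
print(len(command_lists))  # 4 command lists, recorded concurrently
```

The same task system can schedule these recording jobs alongside AI, animation or physics work, which is what spreads the load across every core.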

 

It’s worth noting that Sniper Elite 3 and the Asura engine are already optimised to account for DirectX 11’s weaknesses. For example, we make heavy use of instancing and similar batching techniques to reduce the number of draw calls we make per frame – all the usual things to reduce CPU overhead, which means Mantle will have less easy wins compared to other draw-call heavy titles.

 

So that’s what the CPU is doing – but what’s the actual framerate? On those settings we’re running at an average of 88fps on DX11, and 100fps on Mantle – around a 14% speed increase. This explains why the total CPU utilisation is still quite similar – with Mantle the CPU has to cope with 12 more frames every second, meaning we’re packing in more work while still using less CPU power. Furthermore, because the work is more evenly distributed, if we increase CPU load (say by using a faster graphics card, or by lowering resolution) it’s less likely that a single logical processor will become the bottleneck.
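The arithmetic behind those figures is easy to sanity-check:

```python
# The figures quoted above: 88 fps on DX11, 100 fps on Mantle.
dx11_fps, mantle_fps = 88, 100

speedup = mantle_fps / dx11_fps - 1   # ~0.136, i.e. "around 14%"
dx11_ms = 1000 / dx11_fps             # ~11.4 ms per frame
mantle_ms = 1000 / mantle_fps         # 10.0 ms per frame

print(f"{speedup:.1%} faster, {dx11_ms:.1f} ms -> {mantle_ms:.1f} ms per frame")
```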

 

The size of the frame-rate increase is a pleasant surprise, as frankly at this stage in development we were expecting roughly equal frame-rates when GPU bound. There’s still a fair amount of scope for increasing performance with Mantle, particularly as we’re not yet taking advantage of the Asynchronous Compute queue. This would allow us to take some of our expensive compute shaders – like our Obscurance Fields technique – and schedule them to run in parallel with the rendering of shadow maps, which are particularly light on ALU work.

 

One reason for the performance gains seen so far may be the way we are handling the GPU’s memory - we pre-allocate VRAM in large chunks and then directly manage and defragment that memory ourselves. Similarly when updates for dynamic data and streaming textures are needed, we DMA copy the affected memory as part of our command stream to the GPU - thus eliminating the sort of copying and duplicating of buffers the DirectX drivers might have to do.
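A minimal sketch of the chunked-allocation idea – purely illustrative, with a simple first-fit free list standing in for whatever the engine actually does:

```python
# Illustrative sketch (not Asura's actual code): sub-allocating from one
# large pre-allocated VRAM chunk, so the driver never has to shuffle
# buffers behind the engine's back.
class VRAMPool:
    def __init__(self, size):
        self.free = [(0, size)]  # (offset, size) spans, sorted by offset

    def alloc(self, size):
        # First-fit search over the free list.
        for i, (off, span) in enumerate(self.free):
            if span >= size:
                if span == size:
                    del self.free[i]
                else:
                    self.free[i] = (off + size, span - size)
                return off
        raise MemoryError("pool exhausted - time to defragment")

    def release(self, off, size):
        # Return the span and coalesce adjacent free neighbours.
        self.free.append((off, size))
        self.free.sort()
        merged = [self.free[0]]
        for o, s in self.free[1:]:
            po, ps = merged[-1]
            if po + ps == o:
                merged[-1] = (po, ps + s)
            else:
                merged.append((o, s))
        self.free = merged

pool = VRAMPool(1024)
a = pool.alloc(256)   # offset 0
b = pool.alloc(256)   # offset 256
pool.release(a, 256)
print(pool.alloc(128))  # reuses the freed span at offset 0
```

Because the engine owns the whole chunk, dynamic updates can be DMA-copied into place as part of the command stream rather than duplicated by the driver.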

 

Ironically, one unintended consequence of increased texture streaming performance – and of the ability to hold more textures at once given we have more control over memory – is that we often have far more high resolution textures in use in the Mantle version... which could in theory increase rendering time. Thankfully speed increases from other areas seem to have hidden this, so you’ll just get better looking textures!

 

Another big reason for the speed gains is the way Mantle handles shaders. On DirectX we’re accustomed to having separate shader stages that are treated independently – the common ones being vertex and pixel shaders. Mantle instead uses monolithic pipelines – a concept that combines all the shader stages and the relevant rendering state into a single object.

 

As well as taking less CPU overhead to use, having everything together in one pipeline allows for some holistic optimisations that otherwise wouldn’t be possible – for example, perhaps that value calculated in the vertex shader isn’t actually used in the pixel shader... so it could be optimised out entirely. This seems to have particularly benefitted Sniper Elite 3 when it comes to tessellation, where we’re making heavy use of all the traditional stages as well as hull and domain shaders.
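A toy model of that cross-stage optimisation (the names are invented): because a monolithic pipeline sees every stage at link time, vertex outputs the pixel shader never reads can simply be dropped:

```python
def link_pipeline(vs_outputs, ps_inputs):
    # With every stage visible in one pipeline object, the linker can
    # see exactly which vertex-shader outputs the pixel shader reads.
    live = [v for v in vs_outputs if v in ps_inputs]
    dead = [v for v in vs_outputs if v not in ps_inputs]
    return live, dead

live, dead = link_pipeline(
    vs_outputs=["position", "normal", "uv", "fog_factor"],
    ps_inputs=["position", "uv"],
)
print(live)  # ['position', 'uv']
print(dead)  # ['normal', 'fog_factor'] -> can be optimised out entirely
```

With separately-compiled DX11-style stages, neither shader knows what the other needs, so the dead work has to be kept just in case.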

 

BENCHMARKS

 

To make testing easier we’ve added a Benchmark option to Sniper Elite 3 – available on the “Extras” page from the game’s front end menus. The benchmark contains varying scenes similar to what happens in game, e.g. wide, long distance views; close-ups with tessellation; obscurance fields and shadows; a truck full of characters driving by; lots of special effects overdraw in a gratuitous slow-mo explosion. These put different degrees of stress on the CPU and GPU and hopefully give us a more representative view of what happens in the game as a whole.

 

A word of caution at this point - when leaving the benchmark running repeatedly, we found that the dynamic power management software can kick in, reducing GPU cycle speed and thus skewing the profiling results. So it’s a good idea to use something like AMD’s OverDrive panel to monitor your GPU and guarantee consistency – and possibly increase your allowed fan speed if you don’t mind trading noise for frame-rate!

 

At the end of the benchmark you’ll get an average frame-rate report, and a more detailed log file is saved out to your Documents folder. Our initial tests with the benchmark are showing very similar performance gains to those seen in the Siwa test above; here’s a breakdown using our R9 290X setup, varying both resolution and quality settings.


To guarantee we’re GPU bound for the final setting we’ll use 1920x1200 at Ultra quality with 4x supersampling – which means the engine internally renders everything at 3840x2400, and then right at the end downsamples back to 1920x1200 to give us an extremely good looking (and expensive) anti-aliased image.
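The supersampling arithmetic works out as follows:

```python
base_w, base_h = 1920, 1200                      # output resolution
internal_w, internal_h = base_w * 2, base_h * 2  # 2x per axis = 4x the pixels

print(internal_w, internal_h)                         # 3840 2400
print((internal_w * internal_h) // (base_w * base_h)) # 4x the shading work
```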

se3perf.png

Similarly, here are the results for an HD 7970, coupled with an older CPU that has only 4 logical processors:

 

se3perf2.png

Rather than going into more detail here we’ll let tech sites and interested users have a go themselves and come to their own conclusions. Let us know what you find!       

 

TRY IT YOURSELF

 

The latest version of Sniper Elite 3 now available on Steam has support for both Mantle and the Benchmark feature. To enable the Mantle build you need to select the “Use Mantle” tickbox in the game’s launcher, which is accessed via the Options button. The tickbox should be greyed out if you don’t have the requisite hardware or up to date drivers – we require AMD Catalyst™ 14.9 or later drivers which are available here:  http://support.amd.com/en-us/download


NOTE: be aware that these drivers only support Windows 7, Windows 8.1 and Windows 10 – not Windows 8.0! If you have Windows 8.0 you can update to 8.1 for free via the Windows Store page. Best to back stuff up first!


CONCLUSIONS

 

All in all, even this first pass of Mantle has delivered all that we’d hoped for:

 

  • Improved frame-rate
  • Reduced CPU power consumption (important for laptops)
  • Less susceptible to frame-rate spikes when other programs hit the CPU
  • Future scalability with higher numbers of cores
  • Scope for increasing scene and world complexity
  • Ability to increase the CPU budget for other systems like AI

 

The last two points are more relevant to our future games, and for now we need to see how this first pass of Mantle behaves in the wild and fix any issues that come up, before moving onto new features and improvements that would make sense to add to Sniper Elite 3. One big area that we haven't yet addressed which needs investigating is multiple GPU support - this can be a tricky area to get right.

 

The way DirectX 11 handles multiple GPUs is “AFR”, or Alternate Frame Rendering, which as the name suggests means that if you have two comparably powered GPUs they simply take turns rendering frames. This is in many respects the easiest approach to take – and is a great way of making your game CPU bound! So our Mantle version could well show some big improvements when using this method.

 

However, with the independent control over the GPUs Mantle gives us, we could approach the problem very differently - for example one GPU could be rendering the basic geometry in the scene, while another handles lighting and shadows for the same frame, with the final image composited at the end. This may also provide a route for when GPUs aren’t of a comparable power level – for example an integrated APU motherboard coupled with a desktop GPU. It’s the potential for completely new approaches like this which excites me the most about Mantle and the APIs which will follow it.

 

Kevin Floyer-Lea is Head of Programming at Rebellion Developments. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

WANTED: Experienced RPGers, tabletoppers and fantasy gamers that have grown weary of Wizards and Sorcs hamstrung by silly things like cooldowns, spells per day and expensive reagents. We know you pine for raining Meteor Swarms upon unsuspecting foes like an April Shower. If that sounds appealing to you, then perhaps Lichdom: Battlemage is the game for you.

 

Let me sweeten the deal a little further: Lichdom is optimized for AMD Radeon™ customers with the world’s first implementation of TressFX Hair v2.0, AMD TrueAudio technology, AMD Eyefinity technology, and validation for 4K gaming.1,2 With the power of unlimited magic, you get sweet gameplay. And with the power of AMD Gaming Evolved, you get sweet technology. Let’s dig in!

 

YOU ARE A BADASS

Welcome to the first game where the Mage is an unmitigated badass! With no mana pools or cool-downs, Lichdom: Battlemage throws out all of the classic tropes of playing a Mage. No longer is the character marginalized so that other classes can adventure through the same levels, and finally the true jaw-dropping potential of magic has been realized.

 

Lichdom: Battlemage is a first-person caster that focuses entirely on the Mage. With limitless magical power at your disposal and brutal enemies around every corner, victory hinges on a combination of skill and strategy. You must carefully craft a vast array of spells and learn to cast them in the heat of combat.

The Lichdom: Battlemage spell crafting system offers an enormous range of customization. Every Mage is the product of crafted magic that reflects the individual's play style. Whether you prefer to target your foes from a safe distance, wade into combat and unleash your power at point-blank range, or pit your enemies against each other, endless spell customization lets you become the Mage you want to be.

 

TRESSFX HAIR V2.0

For the complete story on TressFX Hair v2.0, you should read our recent blog that comprehensively explores the technology’s latest developments. However, below I’ve compiled an executive summary of the changes:

  • New functionality to support grass and fur
  • Continuous levels of detail (LODs), designed to improve performance by dynamically adjusting visual detail as TressFX-enabled objects move towards and away from the player’s POV
  • Improved efficiency with many light sources and shaders via deferred rendering
  • Superior self-shadowing for better depth and texture in the hair
  • Even more robust scalability across GPUs of varying performance envelopes (vs. TressFX 1.0)
  • Modular code and porting documentation
  • Stretchiness now respects the laws of physics
  • and numerous bug fixes!

 

In general, TressFX 2.0 is a much more detailed, efficient and laws-of-physics-abiding technology than ever before. Awesome!

 

tressfxgryphon.jpg

And it’s double awesome that Lichdom: Battlemage is the very first game to make use of TressFX Hair v2.0. The effect is most prominently utilized on The Gryphon, a companion character to the protagonist. Both the male and female Gryphons feature TressFX, though the effect on the female companion is more pronounced by virtue of her haircut. You encounter The Gryphon early in Lichdom: Battlemage, and re-encounter her or him often throughout the game.

 

AMD TRUEAUDIO TECHNOLOGY

AMD TrueAudio technology is a hardware-level feature found on the AMD Radeon™ R9 295X2, R9 290X, R9 290, R9 285, R7 260X and R7 260 graphics cards. A small block of audio processing hardware is integrated directly into the graphics chips in these products. That audio processing hardware is called a “Digital Signal Processor,” or DSP.

 

blocks.png

A high-level diagram of the Tensilica Xtensa HiFi EP DSP cores and the associated hardware that comprises AMD TrueAudio technology
in an AMD Radeon™ graphics chip.

 

A DSP is specialized silicon dedicated to the task of processing digital signals. Example applications for a DSP include: audio compression, audio filtering, speech processing and recognition, simulating audio environments, creating 3D sound fields and more.

 

DSPs are fully programmable, which allows developers to creatively harness the hardware in ways limited only by their imagination and skill. We are striving with AMD TrueAudio to give game developers a blank canvas for new and never-before-heard audio environments and techniques. We hope that, with time, game developers will do with programmable audio what programmable graphics pipelines did for PC graphics.

 

Best of all, the hardware-accelerated effects of AMD TrueAudio technology are experienced with any old stereo headphones. Your headset or earbuds will do just fine!

 

chain.png

AMD TrueAudio effects are processed and applied as in-game audio is being generated. This allows the user to experience
AMD TrueAudio with plain stereo headphones and any existing sound chip.


AMD TRUEAUDIO IN LICHDOM: BATTLEMAGE

With respect to Lichdom: Battlemage, AMD TrueAudio is utilized to calculate an effect called “convolution reverb.” Convolution reverb is a technique that mathematically simulates the echoes (i.e. reverberation) of a real-life location. This effect is accomplished by recording an “impulse response,” which is a snapshot of the echo characteristics of a real-world location. That impulse response is fed back into software that can recreate that behavior in a PC game.
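At its core, convolution reverb is just the convolution of the dry signal with the impulse response – every input sample triggers a scaled copy of the room’s echo pattern. A minimal sketch with made-up sample values:

```python
def convolve(dry, impulse_response):
    # Every dry sample emits a scaled copy of the room's echo pattern.
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

dry = [1.0, 0.0, 0.5]   # a click, silence, then a quieter click
ir = [1.0, 0.6, 0.3]    # direct sound plus two decaying echoes
wet = convolve(dry, ir)
print(wet)  # each click trails the same decaying echo pattern
```

Real impulse responses run to tens of thousands of samples, which is why running several of these convolutions in parallel benefits from a dedicated DSP.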

 

Lichdom: Battlemage uses this technique to make buildings, cathedrals, alleyways and other in-game venues sound quite like they would in real life! In an environment where there are adjacent areas with different echo characteristics and impulse response (example: a cathedral adjacent to a cave and an open space), multiple convolution reverbs must be processed in parallel to create the most realistic sound environment. This effect is automatically enabled when an AMD TrueAudio-capable GPU is configured on the system.

 

You can experience the convolution reverbs for yourself most prominently in the level immediately following the opening tutorial mission. The soaring caverns and claustrophobic tunnels of the second stage make for an exciting and complex acoustic environment.

 

BIG SCREENS

Lichdom: Battlemage has achieved “validated” status for Eyefinity technology 3x1 configurations. This means that the user will enjoy the proper field of view, all menus and HUD elements will be placed correctly, cutscenes will be played without unexpected cropping or stretching, and more.  This is the highest level of compatibility we can award to any game. Additionally, this validation definitely makes Lichdom: Battlemage ready for 4K60 MST and 4K60 SST UltraHD displays!

 

PERFORMANCE

As an AMD Gaming Evolved title, Lichdom: Battlemage offers a solid performance advantage on AMD Radeon™ graphics cards. See the benchmarks below for results and recommended graphics settings for your GPU.3

 

Additionally, while the AMD Radeon™ R7 260X and R7 260 are unlisted in our charts, these users should run the game at 1080p with medium quality settings. TressFX Hair v2.0 and anti-aliasing should be disabled. Performance on these graphics cards can be in the mid- to high-30s with these settings.3

lichdom_perf1.png

lichdom_perf2.png

lichdom_perf3.png

WRAP-UP

If you have a hankering to be the Mage you always wished you could be, then pick up a copy of Lichdom: Battlemage from Steam today. And as you blast your way through cities and ruins overrun with the occult, take a moment to admire the scenery and the technology—you won’t be disappointed.

 


Robert Hallock does Technical Communications for Gaming & Desktop Graphics at AMD.

His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only.  Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of

AMD or any of its products is implied.


FOOTNOTES:

  1. AMD TrueAudio technology is offered by select AMD Radeon™ R9 and R7 200 Series GPUs and is designed to improve acoustic realism. Requires enabled game or application. Not all audio equipment supports all audio effects; additional audio equipment may be required for some audio effects. Not all products feature all technologies — check with your component or system manufacturer for specific capabilities.
  2. AMD Eyefinity technology supports multiple monitors on an enabled graphics card. Supported display quantity, type and resolution vary by model and board design; confirm specifications with manufacturer before purchase. To enable more than three displays, or multiple displays from a single output, additional hardware such as DisplayPort™-ready monitors or DisplayPort 1.2 MST-enabled hubs may be required. A maximum of two active adapters is recommended for consumer systems. See www.amd.com/eyefinityfaq for full details.
  3. All resolutions and quality levels described by the performance diagrams were tested by AMD performance labs on the following platform: Intel Core i7-4970X, Asus X79 Sabertooth, 16GB DDR3-1866, Windows 8.1 x64. AMD Catalyst™ revision: 14.7 RC3. NVIDIA driver revision: 340.52 WHQL.

Since introducing TressFX Hair in the smash hit Tomb Raider™ last year, we’ve been diligently working to optimize the technology, enable compatibility with more platforms, and add new features. Today we wanted to take a little bit of your time to tell you about what’s new with TressFX Hair, and where the technology will be going in the near term.

 

Before we dive in, however, a quick primer on the history of TressFX Hair feels warranted to set the stage. TressFX Hair was the world’s first real-time hair physics simulation in a playable game. TressFX brought an end to the era of short hair, fixed hairstyles, helmets and other unseemly workarounds structured to disguise the limited nature of hair technology.

 

In fact, TressFX Hair represented the first occasion that a hair physics technology had ever made an appearance on the PC outside of limited technical demos. AMD and Crystal Dynamics collaborated extensively to develop and optimize the technology for PC gamers, and to give Lara Croft the unabashedly contemporary look she deserved for a new chapter in her story.

 

LEVERAGING AMD RADEON™ GRAPHICS ACROSS PLATFORMS

Over the past year, we at AMD have remarked on more than one occasion that bringing AMD Radeon™ Graphics and AMD APU technologies to life on multiple gaming platforms would pay dividends for gamers. That prediction came true with the remastered Tomb Raider: Definitive Edition for Xbox One™ and PS4™. In this revisiting, TressFX Hair made its debut outside of the PC space for the very first time.

tressfx.jpg

“Getting TressFX Hair running on PlayStation® 4 and Xbox One™ benefited from the fact that AMD’s Graphics Core Next (GCN) architecture powers the graphics of these platforms,” said Gary Snethen, Chief Technology Officer of Crystal Dynamics. “We were already familiar with GCN from our collaboration with AMD on Tomb Raider, and that experience was instrumental when it was time to bring TressFX Hair to life on consoles with Tomb Raider: Definitive Edition.”

 

Citing the Graphics Core Next architecture2 as the motivation to broaden the audience for TressFX Hair is an important occasion, as it validated in practice the idea that a common architecture makes it easier to share code across the platforms targeted by a development studio. From another perspective, it shows gamers that this cross-platform simplicity enables new headroom to explore in-game effects—effects that may have gone unused in past generations due to insufficient ROI.

 

TALE OF TWO HAIRCUTS

 

TressFX Hair was certainly impressive from a visual perspective, but less discussed is the operational efficiency that compels appreciation on both technical and philosophical grounds. To put a fine point on that, we wanted to illustrate the actual performance impact of AMD’s TressFX Hair contrasted against NVIDIA’s Hairworks.

 

In the below diagram, we isolated the specific routine that renders these competing hair technologies and plotted the results. The bars indicate the portion of time required, in milliseconds, to render the hair from start to finish within one frame of a user’s total framerate. In this scenario, a lower bar is better as that demonstrates quicker time to completion through more efficient code.

tfx_tr_perf.png

In the diagram, you can see that TressFX Hair exhibits an identically low performance impact on both AMD and NVIDIA hardware at just five milliseconds. Our belief in “doing the work for everyone” with open and modifiable code allowed Tomb Raider’s developer to achieve an efficient implementation regardless of the gamer’s hardware.

 

In contrast, NVIDIA’s Hairworks technology is seven times slower on AMD hardware with no obvious route to achieve cross-vendor optimizations as enabled by open access to TressFX source. As the code for Hairworks cannot be downloaded, analyzed or modified, developers and enthusiasts alike must suffer through unacceptably poor performance on a significant chunk of the industry’s graphics hardware.
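To put those milliseconds in perspective against a frame budget:

```python
frame_budget_ms = 1000 / 60            # ~16.7 ms available per frame at 60 fps
tressfx_ms = 5.0                       # measured cost on both vendors
hairworks_amd_ms = tressfx_ms * 7      # "seven times slower" on AMD hardware

print(f"TressFX uses {tressfx_ms / frame_budget_ms:.0%} of a 60 fps frame")
print(f"35 ms of hair alone caps the frame rate at {1000 / hairworks_amd_ms:.0f} fps")
```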

 

With TressFX Hair, the value of openly-shared game code is clear.

 

WHAT’S NEXT FOR TRESSFX HAIR?

While Crystal Dynamics worked to bring TressFX to other platforms, we were busy developing an even newer version of our award-winning hair tech. In November we announced “TressFX 2.0,” an update to the effect that brings several notable changes:

  • New functionality to support grass and fur
  • Continuous levels of detail (LODs), designed to improve performance by dynamically adjusting visual detail as TressFX-enabled objects move towards and away from the player’s POV
  • Improved efficiency with many light sources and shaders via deferred rendering
  • Superior self-shadowing for better depth and texture in the hair
  • Even more robust scalability across GPUs of varying performance envelopes (vs. TressFX 1.0)
  • Modular code and porting documentation
  • Stretchiness now respects the laws of physics
  • and numerous bug fixes!

 

Starting with grass and fur, implementing realistic physics for these objects is rather similar to hair: treat each strand as a chain, group chains together, and then apply an external force. There is obviously some voodoo at work to make grass and fur behave more like grass and fur, and rather less like long hair, but the principles are so similar that they’re a logical extension to TressFX’s capabilities.

 

In designing TressFX 2.0, we addressed a notable issue in our hair physics simulation: stretchiness. Extreme linear and angular acceleration of a fast-moving or fast-turning character could cause the hair sim to appear unnaturally stretchy. In very rare instances, the physics model could even prevent the hair from ever recovering its original length.

 

While AMD and Crystal Dynamics were largely able to overcome this problem by performing rolling iterations of a “length constraint” system in Tomb Raider, we wanted to fix it permanently and more efficiently. TressFX 2.0 addresses this issue head-on through R&D and the creation of a new General Constraint Formulation, which is designed to be considerably more accurate than the old model at dealing with the forces of acceleration on a head of hair’s global and local (per-strand) level.
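A simple 1D illustration of a length-constraint pass (the numbers and formulation are invented for illustration, not AMD’s actual model): each iteration nudges neighbouring strand vertices back toward their rest spacing:

```python
def enforce_lengths(points, rest, iterations=10):
    # Repeatedly relax each neighbouring pair toward its rest length,
    # splitting the correction between the two vertices.
    pts = list(points)
    for _ in range(iterations):
        for i in range(len(pts) - 1):
            error = (pts[i + 1] - pts[i]) - rest  # > 0 when over-stretched
            pts[i] += 0.5 * error
            pts[i + 1] -= 0.5 * error
    return pts

# A three-vertex strand stretched to double its rest spacing:
strand = enforce_lengths([0.0, 2.0, 4.0], rest=1.0)
print([round(p, 2) for p in strand])  # [1.0, 2.0, 3.0] - back to unit spacing
```

The catch is that many rolling iterations are needed for stiff hair under hard acceleration, which is exactly what the new General Constraint Formulation is meant to avoid.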

Additionally, we overhauled the math behind the aforementioned “chain” structure of hair, grass and fur. We now use the Thomas Algorithm to evaluate the behavior of these objects, and this is notable because the Thomas Algorithm is very efficient and lightweight with respect to GPU number crunching. The end result for you: hair that behaves more realistically.
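The Thomas algorithm itself is a standard O(n) solver for tridiagonal systems – exactly the structure that arises when each strand vertex is coupled only to its two neighbours along the chain. A sketch:

```python
def thomas_solve(a, b, c, d):
    # a: sub-diagonal (a[0] unused), b: main diagonal,
    # c: super-diagonal (c[-1] unused), d: right-hand side.
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                       # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# [[2,1,0],[1,2,1],[0,1,2]] @ x = [3,4,3] has exact solution x = [1,1,1]:
x = thomas_solve(a=[0, 1, 1], b=[2, 2, 2], c=[1, 1, 0], d=[3, 4, 3])
print(x)
```

One forward sweep and one backward sweep per strand is far cheaper than a general solver, which is what makes it attractive for GPU number crunching.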

tressfx_math.png

Behind-the-scenes R&D work for TressFX 2.0; simplifying the TressFX Hair algorithm.

 

Next, we wanted to illustrate the impact self-shadowing (right) has on the texture and depth found in a head of hair:

shadowing2.jpg

Finally, we’ll take a look at TressFX 2.0’s LOD levels. As indicated earlier in this blog, a LOD level brings scalable detail to a system of 3D objects. As LOD-enabled objects move away from you, detail is reduced by a system that sustains the apparent quality for the player—you shouldn’t notice a thing if we do our job right! Inversely, when an object moves closer to you, the detail levels are slowly dialed up to maximum in a manner that, again, should be largely imperceptible to the player.

 

The primary benefit of a LOD system is an improvement in overall system performance. With LOD levels, the GPU needn’t render a full-detail head of hair when those details are beyond the visual acuity of the player’s position in the game world.
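A hypothetical sketch of continuous, distance-based LOD selection (the thresholds and ramp are invented for illustration):

```python
def strand_fraction(distance, near=5.0, far=50.0):
    # Fraction of full strand detail to simulate/render at this distance.
    if distance <= near:
        return 1.0
    if distance >= far:
        return 0.1  # keep a floor so the silhouette never vanishes
    t = (distance - near) / (far - near)
    return 1.0 - 0.9 * t  # continuous ramp, so no visible LOD "pop"

for d in (2.0, 27.5, 100.0):
    print(d, strand_fraction(d))
```

Because the fraction varies continuously rather than snapping between fixed levels, the transition stays imperceptible as objects approach or recede.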

lod_levels.png

THE FIRST TRESSFX 2.0 GAME

Beyond improvements to the effect, many gamers have asked about the next game to use TressFX Hair, and I’m pleased to say it’s Lichdom: Battlemage! The team at Xaviant is making healthy use of TressFX Hair 2.0, and had this to say about their decision to adopt TressFX:

 

“TressFX Hair is the most impressive advancement in visual fidelity in the past 24 months,” said Michael McCain, CEO and Founder, Xaviant. “TressFX proved that significant leaps in realism are still possible, even in an age where many have expressed skepticism about the very possibility of such a leap occurring. The beauty, simplicity and performance of TressFX—especially compared to its alternatives—made it an obvious choice to augment the commitment to image quality we have for Lichdom.”

 

Lichdom: Battlemage prominently uses TressFX Hair to render the female version of The Gryphon, a companion/aid to the player when a male protagonist is chosen. Her short bob haircut moves and shines just as you would expect real hair to do.

 

 

A MULTI-PLATFORM WORLD

TressFX Hair took the PC gaming world by storm, chiefly because it demonstrated that 3D graphics needn’t be incremental improvements—big and unexpected leaps can still happen! We were (and still are) very proud of that fact.

 

TressFX Hair also demonstrated the power of being transparent with your code when working with game developers. By collaborating so closely with Crystal Dynamics on TressFX Hair, we were able to make the technology efficient for all hardware, quickly incorporate the lessons and feedback from Tomb Raider™ into TressFX 2.0, and make those improvements publicly available in source code form for adoption in games like Lichdom: Battlemage!

 

Finally and excitingly for gamers everywhere, Crystal Dynamics’ decision to adopt TressFX Hair for Tomb Raider: Definitive Edition shows that cross-pollination between PCs and consoles is not only possible, but happening right now and improving the overall experience on all platforms.

 

SUPPORTING RESOURCES

 


Robert Hallock does Technical Communications for Gaming & Desktop Graphics at AMD.  His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.


FOOTNOTES:

  1. All performance evaluation conducted on the following platform: Intel Core-i7 4960X, ASUS X79 Sabertooth, 16GB DDR3-1866, Windows 8.1 x64. AMD Driver: 14.2 Beta 1.3. NVIDIA Driver: 334.89. Settings: 1920x1080, maximum in-game quality preset.
  2. Select AMD Radeon™ graphics cards are based on the GCN Architecture and include its associated features (AMD PowerTune technology, AMD ZeroCore Power technology, PCI Express 3.0, etc.). Not all features are supported by all products—check with your system manufacturer for specific model capabilities.

3399_RADEON30_EMAILBNR_FNL.png

 

AMD and its retail partners would like to thank you for tuning into the #AMD30Live broadcast to celebrate 30 years of graphics and gaming with us! To commemorate the occasion, many of those fine retailers have assembled some killer deals for gamers all over the world. Check the list below and jump on the ones you gotta have!

 

  • Newegg: Get a free SSD when you buy an AMD Radeon™ R9 295X2
  • Overclockers.co.uk: Get a limited edition metal box Sapphire Radeon™ R9 295X2 and a FREE Superflower 1200W power supply
  • TigerDirect: Save up to $250 when you buy a new PSU with an AMD Radeon™ R9 295X2
  • NCIX: Save up to $200 when you buy a new PSU with an AMD Radeon™ R9 295X2
  • Newegg: Save up to $225 when you buy a new PSU with an AMD Radeon™ R9 295X2
  • CyberPowerPC: Buying a rig? Get a free upgrade from an AMD Radeon™ R9 270X 2GB to an XFX Radeon™ R9 280 3GB
  • iBUYPOWER: PCs starting from $589 with a free upgrade to an AMD FX-6300 CPU
  • TigerDirect: Save big on CPU+GPU bundles featuring AMD Radeon™ R9 290, R9 270 or R7 250 graphics
  • LDLC: Great deals on an AMD Radeon™ R9 290 and power supply bundle starting at €379.95
  • CaseKing: Save big when you purchase an XFX brand AMD Radeon™ graphics card and a Leadex power supply
  • CSL: Get a complete AMD Radeon™ R9 280X-based gaming PC from just €1099
  • Ulmart: Multiply retailer bonuses by five with the purchase of a Sapphire Radeon™ R9 290X
  • Dante: Get a great deal on a Sapphire Radeon™ R9 295X2 and a Dell display!
  • Flipkart (choose one): Buy any AMD Radeon™ R9 Series graphics card and get select games free; buy any R9 Series GPU for a chance to win a free gaming headset, mouse, keyboard and mousepad; or buy any R9 Series GPU and get 20% off any PC game (offer valid for 90 days).

As we head into the dog days of summer, EA wants to give thanks to all their loyal players, and would like to do that with a big ol’ AMD Radeon™-filled program called Battlefest! The program kicked off on July 9th, so you should get in on the action immediately after giving the below details a read!

battlefest.png

Here's what you need to know about Battlefest:

  • From July 9th through August 13th, there will be a daily contest called “Battleshots.” EA will ask you to send a screenshot in Battlefield 4™ based on a theme of their choosing. Screenshots get submitted here. Each day, they will crown a winning screenshot that will win an AMD Radeon™ graphics card, a DICE store gift card, and a Battlefield 4™ Premium membership on the platform of your choosing. (See official rules.)
  • Each Friday, the Battlefield™ team will be releasing a free camo unlock for all players.
  • To kick off the program on July 12th-13th there was a double XP weekend!
  • Each week of Battlefest will feature a global community challenge to reach an in-game goal. If the global BF4 community meets the goal, everyone gets a gold Battlepack. The first Community Mission begins July 15 with a challenge to reach 15 million revives by July 20. Good luck, soldiers!
  • Last but not least, the Stunt Video Competition runs July 14 through August 2. We want you to send us your best stunt video that can only be done in Battlefield 4™. The DICE team will pick the top 12 winners and then you, the loyal fans, will vote on the top three winners to receive a screamin’ fast AMD-based PC valued at $3100 US! The nine runners up won’t go home empty-handed, either: each one will receive a high-end AMD Radeon™ GPU. (See official rules.)

That’s it! A month of “thank you” to everyone. Keep your eyes on the Battlefield™ Blog for even more Battlefest prizes and announcements in the weeks ahead!


Robert Hallock does Technical Communications for Desktop Graphics at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

Two new games, Plants vs. Zombies Garden Warfare and Sniper Elite III, both became available recently. These games have been hotly anticipated, and both offer gameplay that’s novel and engrossing. What’s more, both games benefit from close cooperation with AMD, and Sniper Elite III even joins our Never Settle Forever game bundle program.

 

In Sniper Elite III, the developers’ goal is to faithfully simulate what it’s like to be a real military sniper. That means stealth is paramount, wind and ballistic effects are realistic and you must be careful and deliberate in your moves. Every shot is precious, and the damage done is deconstructed in slow motion to convey the devastation a sniper’s bullet causes. The tension runs high.

 

The experience of playing this game really is unique. For those who prefer sniping to running and gunning, it’s hard to beat. Only seeing the game in action can really make the point. Click here to watch the trailer. (WARNING: This game isn’t for everyone. It’s extremely graphic, and so is the trailer. It’s rated M for mature audiences by the ESRB and 16+ by PEGI).


Sniper Elite III is part of the Never Settle Forever Gold and Silver tiers. Buyers of qualifying AMD Radeon™ graphics cards can get it now.

 

If shredding human anatomy isn’t your thing, don’t despair. Plants vs. Zombies Garden Warfare combines strategy, tactics and lighthearted fantasy weapons into addictive, fast-paced multiplayer mayhem.

 

Its unique cartoon style sets the mood, but don’t mistake its appearance for a lack of sophistication; it runs on the same powerful Frostbite 3 engine that powers Battlefield 4. Many elements of the multiplayer action will feel familiar to fans of the Battlefield games.

 

Plant mushroom sentries, call in corn strikes or tunnel beneath the lawn for a sneak attack. The action is definitely weird, but that’s part of the series’ long-standing appeal.

 

Rated for players age 10 and above, and making relatively modest demands on hardware, Plants vs. Zombies Garden Warfare is sure to be a hit with a very large number of PC gamers. It’s one of three titles developed by EA in conjunction with our Gaming Evolved program for this season. Dragon Age Inquisition and Battlefield Hardline will follow.

 

Just in case the arrival of these games doesn’t excite you enough, take note that both will also support the Mantle API. Plants vs. Zombies Garden Warfare can be run with Mantle now; Sniper Elite III will have Mantle support activated in an upcoming update. The end result is that gamers using AMD technology can get the best possible gameplay experience with these titles, and Mantle will have its already-strong reputation bolstered by two new and popular games added to its portfolio.

 

In spite of the slowdown that usually happens this time of year, there couldn’t be a more exciting time for AMD Gaming Evolved and Never Settle Forever.

 

Happy hunting!

 

Jay Lebo is a Product Marketing Manager at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.
