

32 Posts authored by: rhallock Employee

Hi, everyone! About two weeks ago we started the first of many planned “Community Update” blogs about the AMD Ryzen™ ecosystem. In the initial update, we promised all sorts of goodies for our customers. Today we’re back to make good on that promise with some important updates on topics you proposed: performance tuning and BIOS updates.

 

Unleashing Ryzen in Ashes of the Singularity™

 

Enthusiasts aren’t strangers to the advanced game engine inside Ashes of the Singularity. Ashes distinguished itself early on as a visionary new breed of PC game that plainly proved the potential of low-overhead APIs, and it continues today as an often-updated game that can be punishing even at 1080p. As a bonus, the benchmark capabilities built into Ashes of the Singularity produce a refreshingly candid level of detail. It’s no surprise that the community has rallied around Ashes as a great game and a great test for new hardware.

 

Behind the scenes, AMD has enjoyed a great relationship with the teams at Stardock and Oxide Games. They were early supporters of the Mantle API project and have often collaborated with us on precision-tuned rendering paths for Radeon™ GPUs. This month, they were once again eager to help when we began our promised effort to work with game devs to extract the full performance of the AMD Ryzen™ processor.

 

After just a week or two of work, we’re pleased to report that a new build (v2.11.x) of Ashes of the Singularity is hitting Steam™ today with performance optimizations for the AMD Ryzen™ processor. Compared to version 2.10.25624 featured in the initial reviews for the AMD Ryzen 7 processors, this optimized build averaged a whopping 30% faster when we put it through our labs on the AMD Ryzen 7 1800X CPU.¹

 


System configuration: AMD Ryzen™ 7 1800X Processor, 2x8GB DDR4-2933 (15-17-17-35), GeForce GTX 1080 (378.92 driver), Gigabyte GA-AX370-Gaming5, Windows® 10 x64 build 1607, 1920x1080 resolution, high in-game quality preset.

 

As an additional layer of validation, we also tabulated some results for the CPU-Focused test (below). The CPU-focused test attempts to deemphasize the GPU and focus specifically on how well the processor is driving up game performance. A better result in this test positively correlates with the performance bottleneck being moved to the GPU where it belongs. Results for our optimizations were again notable, with the average performance of the AMD Ryzen™ 7 1800X jumping by 14.29%.

 

System configuration: AMD Ryzen™ 7 1800X Processor, 2x8GB DDR4-2933 (15-17-17-35), GeForce GTX 1080 (378.92 driver), Gigabyte GA-AX370-Gaming5, Windows® 10 x64 build 1607, 1920x1080 resolution, high in-game quality preset.

 

As a parting note on Ashes of the Singularity goodness, a major new update (v2.20.x) will soon be releasing with some great new features: game replays, mod support, three new maps, and a huge number of balance tweaks. The work AMD, Oxide, and Stardock have done for the AMD Ryzen™ processor will be carried forward, and you can learn more about the 2.20.x changes at the official Stardock forums.

 

Boosting minimum framerates in DOTA™ 2

 

Many gamers know that an intense battle in DOTA 2 can be surprisingly demanding, even on powerful hardware. But DOTA has an interesting twist: competitive gamers often tell us that the minimum framerate matters more than anything in life-or-death situations. Keeping that minimum framerate high and steady keeps the game smooth, minimizes input latency, and allows players to better stay abreast of every little change in the battle.

 

As part of our ongoing 1080p optimization efforts for the AMD Ryzen™ processor, we identified some fast changes that could be made within the code of DOTA to increase minimum framerates. In fact, those changes are already live on Steam as of the March 20 update!

 

We still wanted to show you the results, so we did a little A/B test with a high-intensity scene developed with the assistance of our friends in the Evil Geniuses eSports team. The results? 15% higher minimum framerates on the AMD Ryzen™ 7 1800X processor², which lowers input latency by around 1.7ms.
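The 1.7ms figure can be sanity-checked from the minimum framerates listed in footnote 2. The sketch below is only a back-of-the-envelope check: it assumes the latency saving is roughly the reduction in time spent per frame, which is a simplification, and the function name is ours.

```python
def frame_time_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Average minimum FPS values taken from footnote 2.
before_fps = 79.0  # before the March 20 update
after_fps = 91.0   # after the March 20 update

# Relative gain in minimum frame rate.
uplift_pct = (after_fps / before_fps - 1.0) * 100.0

# Assumption: input latency improves by about the per-frame time saved.
saved_ms = frame_time_ms(before_fps) - frame_time_ms(after_fps)

print(f"uplift: {uplift_pct:.1f}%")
print(f"frame time saved: {saved_ms:.2f} ms")
```

Both numbers land close to the 15% and 1.7ms quoted above.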

 

Not bad for some quick wrenching under the hood, and we’re continuing to explore additional optimization opportunities in this title.

 

System configuration: AMD Ryzen™ 7 1800X Processor, 2x8GB DDR4-2933 (15-17-17-35), GeForce GTX 1080 (378.92 driver), Gigabyte GA-AX370-Gaming5, Windows® 10 x64 build 1607, 1920x1080 resolution, tournament-optimized quality settings.

 

Let’s talk BIOS updates

 

Finally, we wanted to share with you our most recent work on the AMD Generic Encapsulated Software Architecture for AMD Ryzen™ processors. We call it the AGESA™ for short.

 

As a brief primer, the AGESA is responsible for initializing AMD x86-64 processors during boot time, acting as something of a “nucleus” for the BIOS updates you receive for your motherboard. Motherboard vendors take the baseline capabilities of our AGESA releases and build on that infrastructure to create the files you download and flash.

 

We will soon be distributing AGESA point release 1.0.0.4 to our motherboard partners. We expect BIOSes based on this AGESA to start hitting the public in early April, though specific dates will depend on the schedules and QA practices of your motherboard vendor.

 

BIOSes based on this new code will have four important improvements for you:

  1. We have reduced DRAM latency by approximately 6ns. This can result in higher performance for latency-sensitive applications.
  2. We resolved a condition where an unusual FMA3 code sequence could cause a system hang.
  3. We resolved the “overclock sleep bug” where an incorrect CPU frequency could be reported after resuming from S3 sleep.
  4. AMD Ryzen™ Master no longer requires the High-Precision Event Timer (HPET).

 

We will continue to update you on future AGESA releases when they’re complete, and we’re already working hard to bring you a May release that focuses on overclocked DDR4 memory.

 

Until next time

 

What are you interested in hearing more about in our next AMD Ryzen Community Update? Let us know on Twitter @AMDRyzen.

 


Robert Hallock is a technical marketing guy for AMD's CPU division. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

Footnotes:

  1. Testing conducted by AMD performance labs as of 3/27/2017. Baseline Ashes of the Singularity version (2.10.25624): 63.85 average FPS of all batches (avg FPS for normal, medium and large batches 68.62, 63.65 and 59.8 respectively). New version (2.11.x): 83.7 average FPS of all batches (avg FPS for normal, medium and large batches 92.25, 84.65 and 75.6 respectively). Total % increase in avg FPS for all batches: 31.1%. System configuration: AMD Ryzen 7 1800X, 2x8GB DDR4-2933 (15-17-17-35), GeForce GTX 1080 (378.92 driver), Gigabyte GA-AX370-Gaming5, Windows 10 x64 1607, 1920x1080 Resolution, HIGH image quality preset. RZN-27
  2. Testing conducted by AMD performance labs as of 3/27/2017. Pre-March 20 update: 79 average minimum FPS. Post-March 20 update: 91 average minimum FPS. Uplift: 15%. System configuration: AMD Ryzen 7 1800X, 2x8GB DDR4-2933 (15-17-17-35), GeForce GTX 1080 (378.92 driver), Gigabyte GA-AX370-Gaming5, Windows 10 x64 1607, 1920x1080 Resolution, HIGH image quality preset. RZN-28
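For readers who want to check the Ashes of the Singularity numbers in footnote 1, here is a small sketch of the percentage arithmetic. The helper name is ours, and the FPS values are copied directly from the footnote.

```python
def uplift_pct(baseline_fps: float, new_fps: float) -> float:
    """Percentage increase from a baseline frame rate to a new frame rate."""
    return (new_fps / baseline_fps - 1.0) * 100.0


# (baseline 2.10.25624, optimized 2.11.x) average FPS from footnote 1.
results = {
    "all batches": (63.85, 83.7),
    "normal batch": (68.62, 92.25),
    "medium batch": (63.65, 84.65),
    "large batch": (59.8, 75.6),
}

for name, (old, new) in results.items():
    print(f"{name}: {uplift_pct(old, new):.1f}%")
```

The all-batch averages are used as reported. They are not the simple mean of the three batch figures, so the sketch does not try to recompute them.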

The AMD Ryzen™ processor is a completely new and different platform from what gamers may be accustomed to, and established practices for configuring a system may prove incorrect or unreliable. We’ve assembled the following configuration steps to ensure users are extracting the best possible performance and reliability from their new PC.

 

Update Your Firmware

Ensure that you are using the latest UEFI ROM for your motherboard.

  1. The latest ROMs will support the Windows 10 tickless kernel for best application performance.
  2. Newer ROMs can improve the functionality/stability of your motherboard and its UEFI menu options.

 

Memory Matters

AMD Ryzen™ processors have an appetite for faster system RAM, but it’s important to ensure that you have a solid setup before proceeding.

 

  1. The AMD Ryzen™ processor does not offer memory dividers for DDR4-3000 or DDR4-3400. Users shooting for higher memory clocks should aim for 3200 or 3500 MT/s.
  2. Memory vendors have also begun to validate 32GB (4x8GB) kits at 3200 MT/s rates for select motherboards.
  3. Ensure that you are programming your BIOS with the recommended timings (CAS/tRCD/tRP/tRAS/tRC/CMD) and voltages specified on the DRAM packaging.
  4. To ensure reliable POST, the AMD Ryzen™ processor may fall back to a DIMM’s JEDEC SPD “safe” timings in the event an overclock proves unreliable. Most DIMMs are programmed to boot at DDR4-2133 unless otherwise instructed by the BIOS, so be sure your desired overclock is in place before performance testing. Use CPU-Z in Windows to confirm.
  5. For speed grades greater than DDR4-2667, please refer to a motherboard vendor’s memory QVL. Each motherboard vendor tests specific speeds, modules, and capacities for their motherboards, and can help you find a memory pairing that works well. It is important you stick to this list for the best and most reliable results.¹
  6. We have internally observed good results from 2933, 3200, and 3500 MT/s rates with 16GB kits based on Samsung “B-die” memory chips. Potential kits include:
    • Geil EVO X - GEX416GB3200C16DC [16-16-16-36 @ 1.35v]
    • G.Skill Trident Z - F4-3200C16D-16GTZR [16-18-18-36 @ 1.35v]
    • Corsair CMK16GX4M2B3200C16 VERSION 5.39 [16-18-18-36 @ 1.35v]
  7. Finally, as part of AMD’s ongoing development of the new AM4 platform, AMD will increase support for overclocked memory configurations with higher memory multipliers. We intend to issue updates to motherboard partners in May that will enable them, on whatever products they choose, to support speeds higher than the current DDR4-3200 limit without refclk adjustments. AMD Ryzen™ processors already deliver great performance in prosumer, workstation, and gaming workloads, and this update will permit even more value and performance for enthusiasts who choose to run overclocked memory.
  8. AMD’s officially-supported DRAM configurations are below for your reference:

    DDR4 Speed (MT/s)   Memory Ranks   DIMM Quantities
    2667                Single         2
    2400                Dual           2
    2133                Single         4
    1866                Dual           4
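As a quick illustration of the memory guidance above, the sketch below encodes the officially supported configurations and shows what to look for when confirming a memory speed in CPU-Z. The function and table names are ours. It relies on the fact that DDR memory transfers data twice per clock, so the DRAM frequency shown by a monitoring tool is half the MT/s rating.

```python
# Officially supported DDR4 speeds (MT/s) from the table above,
# keyed by (memory ranks, DIMM quantity).
SUPPORTED_MTS = {
    ("single", 2): 2667,
    ("dual", 2): 2400,
    ("single", 4): 2133,
    ("dual", 4): 1866,
}


def is_overclocked(ranks: str, dimms: int, speed_mts: int) -> bool:
    """True if the speed exceeds the officially supported rate for this layout."""
    return speed_mts > SUPPORTED_MTS[(ranks.lower(), dimms)]


def dram_clock_mhz(speed_mts: float) -> float:
    """DDR transfers twice per clock, so the real clock is half the MT/s rate."""
    return speed_mts / 2


print(is_overclocked("single", 2, 3200))  # 3200 MT/s is above the 2667 MT/s spec
print(dram_clock_mhz(3200))               # DRAM frequency to expect for DDR4-3200
print(dram_clock_mhz(2133))               # DRAM frequency at the DDR4-2133 fallback
```

If the DRAM frequency you see is near 1066MHz rather than 1600MHz, the system has most likely fallen back to DDR4-2133 and the overclock is not in place.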


Mind Your Power Plan

Make sure the Windows® 10 High Performance power plan is being used. The High Performance plan offers two key benefits:

 

  1. Core Parking OFF: Idle CPU cores are instantaneously available for thread scheduling. In contrast, the Balanced plan aggressively places idle CPU cores into low power states. This can cause additional latency when un-parking cores to accommodate varying loads.
  2. Fast frequency change: The AMD Ryzen™ processor can alter its voltage and frequency states in the 1ms intervals natively supported by the “Zen” architecture. In contrast, the Balanced plan may take longer for voltage and frequency changes due to software participation in power state changes.

 

In the near term, we recommend that games and other high-performance applications are complemented by the High Performance plan. By the first week of April, AMD intends to provide an update for AMD Ryzen™ processors that optimizes the power policy parameters of the Balanced plan to favor performance more consistent with the typical usage models of a desktop PC.
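If you would rather switch plans from a script than from the Control Panel, a minimal sketch follows. It assumes the stock Windows `powercfg` tool and the well-known GUID of the built-in High performance scheme. The helper names are ours, and the script does nothing on non-Windows systems.

```python
import platform
import subprocess
from typing import List

# Well-known GUID of the built-in Windows "High performance" power scheme.
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"


def set_active_command(scheme_guid: str) -> List[str]:
    """Build the powercfg command line that activates a power scheme."""
    return ["powercfg", "/setactive", scheme_guid]


def activate_high_performance() -> bool:
    """Activate the High performance plan; returns False when not on Windows."""
    if platform.system() != "Windows":
        return False
    subprocess.run(set_active_command(HIGH_PERFORMANCE_GUID), check=True)
    return True
```

Calling `activate_high_performance()` on a Windows 10 machine should have the same effect as picking the plan by hand. Running `powercfg /getactivescheme` afterwards is one way to confirm which plan is active.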

 

The Observer Effect

Ensure there are no background CPU temperature or frequency monitoring tools when performance is essential. Real-time performance measurement tools can have an observer effect that impacts performance, especially if the monitoring resolution (>1 sample/sec) is increased.

 

Overclocking!

Overclocking is a time-tested and beloved way to squeeze even more “free” performance out of a system. That’s why every AMD Ryzen™ processor is unlocked for overclocking.²

 

Consider the example of the AMD Ryzen™ 7 1700 processor. It has a base clock of 3.0GHz, a two-core boost clock of 3.7GHz, an all-cores boost clock of 3.1GHz, and a 2-core XFR clock of 3.75GHz. Many have reported all-core overclocks of around 3.9GHz, which is roughly 25% higher than the 3.1GHz all-cores boost clock the CPU delivers by default.

 

Putting It All Together

To test the performance impact of all of these various changes, we threw together a brand new Windows 10-based system with the following specifications:

 

  • AMD Ryzen™ 7 1800X (8C16T/3.6-4.0GHz)
  • 16GB G.Skill (2x8) DDR4-3200
    • Clocked to 2133MT/s: 15-15-15-35-1t
    • Clocked to 2933MT/s: 14-14-14-30-1t
  • ASUS Crosshair VI Hero (5704 BIOS)
  • 1x AMD Radeon™ RX 480 GPU (Radeon Software 17.2.1)
  • Windows 10 Anniversary Update (Build 14393.10)
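To put the two memory settings above in perspective, the sketch below converts CAS latency from clock cycles into nanoseconds. The helper name is ours and this is only the standard first-order estimate: it ignores the other timings and any fabric or controller effects.

```python
def cas_latency_ns(speed_mts: float, cas_cycles: int) -> float:
    """Approximate CAS latency in nanoseconds for a DDR data rate in MT/s."""
    clock_mhz = speed_mts / 2          # DDR: two transfers per clock cycle
    return cas_cycles / clock_mhz * 1000.0


# The two configurations used in the test system above.
print(f"DDR4-2133 CL15: {cas_latency_ns(2133, 15):.2f} ns")
print(f"DDR4-2933 CL14: {cas_latency_ns(2933, 14):.2f} ns")
```

The faster kit wins twice: each clock cycle is shorter, and the CAS count itself is lower.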

 

Throughout this process we also discovered that F1™ 2016 generates a CPU topology map (hardware_settings_config.xml) when the game is installed. This file tells the game how many cores and threads the system’s processor supports. This settings file is stored in the Steam™ Cloud and appears to get resynced on any PC that installs F1™ 2016 from the same Steam account. Therefore: if a user had a 4-core processor without SMT, then reused that same game install on a new AMD Ryzen™ PC, the game would re-sync with the cloud and believe the new system is also the same old quad core CPU.

 

Only a fresh install of the game allowed for a new topology map that better interpreted the architecture of our AMD Ryzen™ processor. Score one for clean computing! But it wasn’t a complete victory. We also discovered that the new and better topology map still viewed Ryzen™ as a 16-core processor, rather than an 8-core processor with 16 threads. Even so, performance was noticeably improved with the updated topology map, and performance went up from there as we threw additional changes into the system.

 

As an ultimate maneuver, we asked the question: “Can we edit this file?” The answer is yes! As a final step, we configured F1™ 2016 to use 8 physical CPU cores, rather than the 16 it was detecting by default. Performance went up again! After all was said and done, we gained a whopping 35.53% from our baseline configuration showing how a series of little changes can add up to something big.

 

The picture tells the story clear as day: configuration matters.

 

 



 

Footnotes:

1. Overclocking memory will void any applicable AMD product warranty, even if such overclocking is enabled via AMD hardware and/or software.  This may also void warranties offered by the system manufacturer or retailer or motherboard vendor.  Users assume all risks and liabilities that may arise out of overclocking memory, including, without limitation, failure of or damage to RAM/hardware, reduced system performance and/or data loss, corruption or vulnerability.  GD-112
2. AMD processors, including chipsets, CPUs, APUs and GPUs (collectively and individually "AMD processor"), are intended to be operated only within their associated specifications and factory settings. Operating your AMD processor outside of official AMD specifications or outside of factory settings, including but not limited to the conducting of overclocking using the Ryzen Master overclocking software, may damage your processor, affect the operation of your processor or the security features therein and/or lead to other problems, including but not limited to damage to your system components (including your motherboard and components thereon (e.g., memory)), system instabilities (e.g., data loss and corrupted images), reduction in system performance, shortened processor, system component and/or system life, and in extreme cases, total system failure. It is recommended that you save any important data before using the tool.  AMD does not provide support or service for issues or damages related to use of an AMD processor outside of official AMD specifications or outside of factory settings. You may also not receive support or service from your board or system manufacturer. Please make sure you have saved all important data before using this overclocking software.

It’s been about two weeks since we launched the new AMD Ryzen™ processor, and I’m just thrilled to see all the excitement and chatter surrounding our new chip. Seems like not a day goes by when I’m not being tweeted by someone doing a new build, often for the first time in many years. Reports from media and users have also been good:

 

  • “This CPU gives you something that we needed for a long time, which is a CPU that gives you a well-rounded experience.” –JayzTwoCents
  • Competitive performance at 1080p, with Tech Spot saying the “affordable Ryzen 7 1700” is an “awesome option” and a “safer bet long term.”
  • ExtremeTech showed strong performance for high-end GPUs like the GeForce GTX 1080 Ti, especially for gamers that understand how much value AMD Ryzen™ brings to the table
  • Many users are noting that the 8-core design of AMD Ryzen™ 7 processors enables “noticeably SMOOTHER” performance compared to their old platforms.

 

While these findings have been great to read, we are just getting started! The AMD Ryzen™ processor and AM4 Platform both have room to grow, and we wanted to take a few minutes to address some of the questions and comments being discussed across the web.

 

Thread Scheduling

We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen™ processor. Based on our findings, AMD believes that the Windows® 10 thread scheduler is operating properly for “Zen,” and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture.

 

As an extension of this investigation, we have also reviewed topology logs generated by the Sysinternals Coreinfo utility. We have determined that an outdated version of the application was responsible for originating the incorrect topology data that has been widely reported in the media. Coreinfo v3.31 (or later) will produce the correct results.

 

Finally, we have reviewed the limited available evidence concerning performance deltas between Windows® 7 and Windows® 10 on the AMD Ryzen™ CPU. We do not believe there is an issue with scheduling differences between the two versions of Windows.  Any differences in performance can be more likely attributed to software architecture differences between these OSes.

 

Going forward, our analysis highlights that there are many applications that already make good use of the cores and threads in Ryzen, and there are other applications that can better utilize the topology and capabilities of our new CPU with some targeted optimizations. These opportunities are already being actively worked via the AMD Ryzen™ dev kit program that has sampled 300+ systems worldwide.

 

Above all, we would like to thank the community for their efforts to understand the Ryzen processor and reporting their findings. The software/hardware relationship is a complex one, with additional layers of nuance when preexisting software is exposed to an all-new architecture. We are already finding many small changes that can improve the Ryzen performance in certain applications, and we are optimistic that these will result in beneficial optimizations for current and future applications.

 

Temperature Reporting

The primary temperature reporting sensor of the AMD Ryzen™ processor is a sensor called “T Control,” or tCTL for short. The tCTL sensor is derived from the junction (Tj) temperature—the interface point between the die and heatspreader—but it may be offset on certain CPU models so that all models on the AM4 Platform have the same maximum tCTL value. This approach ensures that all AMD Ryzen™ processors have a consistent fan policy.

 

Specifically, the AMD Ryzen™ 7 1700X and 1800X carry a +20°C offset between the reported tCTL temperature and the actual Tj temperature. In the short term, users of the AMD Ryzen™ 7 1700X and 1800X can simply subtract 20°C from the reported value to determine the true junction temperature of their processor. No arithmetic is required for the Ryzen 7 1700. Long term, we expect temperature monitoring software to better understand our tCTL offsets and report the junction temperature automatically.

 

The table below serves as an example of how the tCTL sensor can be interpreted in a hypothetical scenario where a Ryzen processor is operating at 38°C.

 

Product Name          True Junction Temp (Example)   tCTL Offset for Fan Policy   Temp Reported by tCTL
AMD Ryzen™ 7 1800X    38°C                           20°C                         58°C
AMD Ryzen™ 7 1700X    38°C                           20°C                         58°C
AMD Ryzen™ 7 1700     38°C                           0°C                          38°C
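Until monitoring tools apply the offset automatically, the subtraction is easy to script. The sketch below is a minimal illustration using only the three models and offsets from the table above. The dictionary and function names are ours.

```python
# tCTL offsets in °C for the models listed in the table above.
TCTL_OFFSET_C = {
    "Ryzen 7 1800X": 20.0,
    "Ryzen 7 1700X": 20.0,
    "Ryzen 7 1700": 0.0,
}


def junction_temp_c(model: str, tctl_c: float) -> float:
    """Convert a reported tCTL reading into the true junction temperature."""
    return tctl_c - TCTL_OFFSET_C[model]


for model, reported in [("Ryzen 7 1800X", 58.0), ("Ryzen 7 1700", 38.0)]:
    print(f"{model}: tCTL {reported}°C -> Tj {junction_temp_c(model, reported)}°C")
```

Both example readings resolve to the same 38°C junction temperature used in the table.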

 

Power Plans

Users may have heard that AMD recommends the High Performance power plan within Windows® 10 for the best performance on Ryzen, and indeed we do. We recommend this plan for two key reasons:

  1. Core Parking OFF: Idle CPU cores are instantaneously available for thread scheduling. In contrast, the Balanced plan aggressively places idle CPU cores into low power states. This can cause additional latency when un-parking cores to accommodate varying loads.
  2. Fast frequency change: The AMD Ryzen™ processor can alter its voltage and frequency states in the 1ms intervals natively supported by the “Zen” architecture. In contrast, the Balanced plan may take longer for voltage and frequency (V/f) changes due to software participation in power state changes.

In the near term, we recommend that games and other high-performance applications are complemented by the High Performance plan. By the first week of April, AMD intends to provide an update for AMD Ryzen™ processors that optimizes the power policy parameters of the Balanced plan to favor performance more consistent with the typical usage models of a desktop PC.

 

Simultaneous Multi-threading (SMT)

Finally, we have investigated reports of instances where SMT is producing reduced performance in a handful of games. Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT. We see this neutral/positive behavior in a wide range of titles, including: Arma® 3, Battlefield™ 1, Mafia™ III, Watch Dogs™ 2, Sid Meier’s Civilization® VI, For Honor™, Hitman™, Mirror’s Edge™ Catalyst and The Division™. Independent 3rd-party analyses have corroborated these findings.

 

For the remaining outliers, AMD again sees multiple opportunities within the codebases of specific applications to improve how this software addresses the “Zen” architecture. We have already identified some simple changes that can improve a game’s understanding of the "Zen" core/cache topology, and we intend to provide a status update to the community when they are ready.

 

Wrap-up

Overall, we are thrilled with the outpouring of support we’ve seen from AMD fans new and old. We love seeing your new builds, your benchmarks, your excitement, and your deep dives into the nuts and bolts of Ryzen. You are helping us make Ryzen™ even better by the day.  You should expect to hear from us regularly through this blog to answer new questions and give you updates on new improvements in the Ryzen ecosystem.

In the last five years, eSports has grown from a little-known niche corner of the gaming market to a global phenomenon on-track to reach $1 billion in revenue by 2019.

 

At the heart of this trend lies Twitch, which has helped feed the growth of eSports by serving as a cultural nexus for gamers (like me!) enjoying a community of like-minded people.

The simplicity and reach of Twitch’s platform has cultivated a new field of tools (like Radeon™ ReLive) that make it possible to broadcast game footage, audio, webcams, overlays, and other multimedia to legions of fans. In fact, gamers watched 4 billion man-hours of gameplay in 2015 alone!

 

But the simplicity of broadcasting to Twitch can come with some steep hardware requirements. According to Twitch customer support: “many broadcasters will find that they get a lot of ‘input lag’ when playing video games.”

 

“Some games are very CPU-intensive and require a strong computer to run. These games are tough on your processor, especially if you are running the game on the highest settings,” Twitch Support reads. “To make matters worse, streaming is an extremely CPU-intensive process. Combine these two together, and it is trouble. If, on top of that, you open a browser to read chat, another program to play music, and a third program to keep track of donations, you might find that your game lags more than you would like.”

 

The proposed solution is expensive: “Use two computers to split up the workload.” One system plays the game, and a second system with a capture card receives output from the GPU and serves as a dedicated broadcasting system to alleviate performance bottlenecks. Many streamers will be familiar with this.

 

Many broadcasters also say the rise of hardware-based video encoding has not done much to address the needs of streamers that expect the best quality for their viewers. Many streamers also agree that the tight 3500Kbps bitrate limits of Twitch, and the short render-to-broadcast window for a timely stream, put the GPU at a disadvantage. Users often report that fixed-function encoders in CPUs and GPUs need more bitrate to achieve the same quality as the CPU-based x264 encoder preconfigured on streaming packages like Open Broadcaster Software (OBS) and XSplit. Though fixed-function encoders are getting better all the time, and work wonders for recording gameplay to disk, streamers often still rely on processors to give the best result for their fans.

 

 

Ultimately, these perspectives highlight that the typical 4C4T or 4C8T processor simply doesn’t offer enough performance to keep up with the demands of simultaneous gaming and video encoding. For such enthusiastic gamers, the AMD Ryzen™ 7 1700 can be a welcome relief.

 

With eight physical cores and 16 threads, one system with this one consumer processor now has enough hardware to simultaneously dedicate a full 4C8T to both the encoding and gaming workloads. Paired with a sufficient quantity of RAM and a powerful graphics card, it is possible for just one system to broadcast a top-flight 1080p/60 FPS/3500Kbps stream for viewers with little compromise to the performance or input latency of the game.

 

Since no streamer would willingly give their viewers a stream that fails 18% of the time, the balanced design of the AMD Ryzen™ 7 1700 processor sets the standard for effortless single-system streaming.

 


 


 

OBS to Twitch Results: Tested using DOTA™ 2 as of 2/14/2017. OBS Target Settings: 1920x1080 source resolution, 1920x1080 broadcast resolution, 60 FPS broadcast frame rate, 3500Kbps VBR target bitrate, x264 encoder. “Encode Failure Rate” defined as percentage of video frames dropped by x264 encoder due to “CPU too slow” errors. System configs: AMD Reference Motherboard (AMD) and AMD Ryzen™ 7 1700, ASUS X99 STRIX motherboard and Core i7-6900K, 16GB DDR4-2400, GeForce Titan X, NVIDIA driver 21.21.13.7633, Windows 10 x64 RS1. Dropped frame count: 0/23000 (AMD), 4177/23000 (Intel). GD-111
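The 18% figure quoted earlier follows directly from the dropped-frame counts in the test notes above. A minimal sketch of that calculation, with a helper name of our own choosing:

```python
def encode_failure_rate_pct(dropped: int, total: int) -> float:
    """Share of frames dropped by the encoder, as a percentage."""
    return dropped / total * 100.0


# Dropped frame counts from the OBS to Twitch test notes above.
print(f"AMD Ryzen 7 1700: {encode_failure_rate_pct(0, 23000):.1f}%")
print(f"Core i7-6900K:    {encode_failure_rate_pct(4177, 23000):.1f}%")
```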

 

Twitch Unique Monthly Broadcaster Source(s): Twitch yearly retrospective (twitch.tv/year/{2012-2015} for 2012-2015; 2016 data source DMR Stats)


Use of third party marks / products is for informational purposes only and no endorsement of or by AMD is intended or implied.

You know that processors can have more cores. You know that processors can have faster cores. But what about smarter cores? That’s a new horizon we’re exploring with AMD SenseMI technology in the all-new AMD Ryzen™ processor!

 

AMD SenseMI wraps up five features that work in concert to enhance the performance of the AMD Ryzen CPU.¹ Some of the features optimize power and clockspeeds, while others bring important data into the processor or optimize processor pathways for new work. Altogether they make a rational, intelligent machine that’s constantly obsessing over how to optimize performance and power efficiency for you and your applications. Let’s take a look!

 

It All Starts With the Infinity Fabric

 

Today’s processors can often be called systems-on-chip (“SoC”), which means support for USB, PCI Express® and SATA are integrated directly into the CPU core. Getting these technologies to communicate with the CPU cores has historically been a time-consuming, expensive, or inefficient task. But the Ryzen SoC is a different beast thanks to the Infinity Fabric.

 

 


The Infinity Fabric is a common interface that allows us to quickly mesh these pieces together and get them “speaking the same language,” almost like snapping toy building blocks together. We also use the Infinity Fabric to establish fast communication between groups of CPU cores, as we do in the 8-core Ryzen processors that contain two groups of four cores.

 

Most importantly for AMD SenseMI, the Infinity Fabric gives us command and control powers to nearly all areas of the CPU. That’s crucial because the Ryzen processor has a networked “smart grid” of several hundred sensors, each accurate to 1 milliwatt, 1 milliamp, 1 millivolt and 1°C. These sophisticated sensors are what allow the Ryzen processor to dial in voltages, clockspeeds, and optimal datastreams. Having extensive insight into the readings of these sensors via the Infinity Fabric allows the processor to orchestrate them for best results.

 

Extended Frequency Range

 

Temperature is king when it comes to determining a processor’s maximum clockspeed, as cooler temps improve the efficiency and reliability of the tiny transistors that make up a processor. Other factors in the clockspeed include power draw from the CPU socket, what percentage of the CPU’s circuits are in use, and the distance to maximum thermal output (TDP). But temperature is the one factor that you can easily control with better CPU or chassis cooling.

 

Thanks to AMD SenseMI technology, the Extended Frequency Range (XFR) feature, available on select Ryzen processors, can measure the difference between the current CPU temperature and the operating temperatures we’ve designed the Ryzen processor to handle. If the current temp is sufficiently low, that extra thermal headroom can be converted into extra top-end frequency.

 

For example, the AMD Ryzen™ 7 1700X processor has a maximum clockspeed of 3.8GHz at 60°C, but XFR can automatically get the maximum frequency to 3.9GHz if the current temperature is lower than that. As you can see, select Ryzen processors are capable of giving a little more to users that build premium systems with robust system and CPU cooling. Pretty neat!
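As a rough mental model of the example above, XFR can be pictured as a temperature-gated ceiling. The sketch below is a deliberately simplified illustration, not the actual algorithm: the real feature weighs many sensor inputs, and the function name, parameter names, and single-threshold behavior are our own assumptions. Only the 3.8GHz, 3.9GHz, and 60°C figures come from the text.

```python
def max_frequency_ghz(
    temp_c: float,
    base_max_ghz: float = 3.8,       # Ryzen 7 1700X maximum without XFR headroom
    xfr_max_ghz: float = 3.9,        # Ryzen 7 1700X maximum with XFR headroom
    xfr_temp_limit_c: float = 60.0,  # temperature below which XFR can engage
) -> float:
    """Simplified model: grant the XFR ceiling only when the chip runs cool enough."""
    return xfr_max_ghz if temp_c < xfr_temp_limit_c else base_max_ghz


print(max_frequency_ghz(45.0))  # cool system: XFR ceiling applies
print(max_frequency_ghz(65.0))  # warm system: standard maximum applies
```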

 

Precision Boost

 

AMD SenseMI also comes in handy for boosting the CPU clockspeed with a feature we call “Precision Boost.” The same temperature/current/TDP analysis that governs XFR is once again in play to establish the boundaries of safe operation for a Ryzen processor. Like any other processor, we want to make sure that the Ryzen processor consumes only so much power, operates within an expected temperature range, and emits only so much heat (TDP). Overclocking can naturally expand or override these boundaries, but we’re talking out-of-the-box functionality in this case.

 

As long as a Ryzen processor isn’t bumping up against any of those boundaries, Precision Boost can raise the clockspeed in exacting 25MHz increments. Relative to past processors, these small increments allow the Ryzen CPU to get that much closer to an optimal frequency, taking all thermal and electrical boundaries into account. The 25MHz increments can also enable higher sustained frequencies by minimizing clockspeed reductions that occur when a reliability threshold is encountered.

 

Example A: A Ryzen processor is running a lightly-threaded workload using just a few CPU cores. Because the other CPU cores are dormant, or working on background tasks, there is significant thermal or electrical headroom for the processor to just go faster. The Ryzen processor can use Precision Boost to convert that headroom into additional clockspeed (e.g. 3.0GHz → 3.7GHz on the AMD Ryzen™ 7 1700X processor).

 

Example B: A Ryzen processor running at 3.8GHz could encounter a heavy workload that’s on a trajectory to use more power than the CPU socket is designed to provide. This is an ordinary and manageable event for processors, and perhaps a short dip to 3.775GHz would be sufficient to correct the trajectory back into expected levels. Precision Boost can make that possible, and the clockspeed could quickly be pushed back to 3.8GHz when the workload lightens. Other processors might have to drop to 3.7GHz, taking off another 75MHz of frequency that a Ryzen processor might not.
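
The arithmetic in Example B can be sketched in a few lines of Python. This is an illustration only: the helper name `throttle_to` and the idea of a single "sustainable" frequency are assumptions made for the sketch, not a description of how the processor's firmware actually decides.

```python
def throttle_to(sustainable_mhz, step_mhz):
    """Highest clock on a step_mhz grid that does not exceed sustainable_mhz."""
    return (sustainable_mhz // step_mhz) * step_mhz

# Hypothetical case: the workload can be sustained at 3780 MHz, not 3800 MHz.
sustainable = 3780

fine = throttle_to(sustainable, 25)     # 25 MHz steps, as with Precision Boost
coarse = throttle_to(sustainable, 100)  # 100 MHz steps, as assumed for other CPUs

print(fine, coarse, fine - coarse)
```

With a finer grid the clock lands at 3775MHz instead of 3700MHz, which is the 75MHz difference described in Example B.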

 

Pure Power

 

The exemplary power efficiency of the Ryzen processor comes from two key areas: 14nm FinFET manufacturing and low-power design methodologies. Pure Power orchestrates those methodologies, imbuing every Ryzen processor with the power to inspect and adjust its own electrical characteristics.

 

Pure Power is especially vital during manufacturing. When a Ryzen processor rolls off the assembly line, each chip is capable of looking into itself and analyzing the quality of its own silicon. The results of that analysis allow the processor to zero in on an idealized voltage vs. frequency curve for itself. That fine tuning allows the processor to get pretty close to the perfect voltage for a given frequency. A magic wand wouldn’t do much better!

 

During the design phase, this self-tuning opens the door for AMD to reduce or eliminate guardbands, which are “slack” built into the voltage or frequency targets that can compensate for moments when the processor’s automated routines can’t quite nail a specific value. This can happen for any number of reasons, including transient fluctuations in a power supply’s output, or sudden large jumps in CPU utilization. But Ryzen processors come off the line with precise knowledge of themselves, so reducing or eliminating these guardbands allows for higher overall clockspeeds and lower operating voltages for you.

 

And in day-to-day use, Pure Power is aggressively managing dynamic or “operational” power. Idle pieces of the Ryzen processor are downclocked or shut down to trim power, or to reallocate that power to areas of the processor that can productively use it. This technology is called “clock gating.”

 

As an example: We put the AMD Ryzen™ 7 1800X processor against the Core i7-6900K in the demanding POV-Ray test. This test measures the performance of a processor with raytracing, the most realistic form of 3D rendering. As you can see from our data below, the Ryzen 7 1800X enabled a better score and higher performance per watt.2

 

Processor            POV-Ray Score   Average System Wall Power   Performance per Watt (Higher is better)
AMD Ryzen 7 1800X    3266            157.45W                     20.74
Core i7-6900K        2964            153.59W                     19.29
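
For readers who want to check the last column, here is a minimal Python sketch of the calculation given in footnote 2 (score divided by average wall power). The function name `perf_per_watt` is made up for this example, and truncating to two decimals is an assumption chosen because it reproduces the published figures; ordinary rounding would show the Core i7-6900K at 19.30.

```python
import math

def perf_per_watt(score, watts):
    """Benchmark score divided by average wall power, truncated to 2 decimals."""
    return math.floor(score / watts * 100) / 100

# Scores and wall power taken from footnote 2.
ryzen = perf_per_watt(3266, 157.45)
intel = perf_per_watt(2964.12, 153.59)

print(ryzen, intel)
```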

 

Neural Net Prediction

 

Where Pure Power, XFR, and Precision Boost cooperate to control power/frequency characteristics of the Ryzen processor, Neural Net Prediction is responsible for anticipating optimal pathways in the processor for the programs you’re running.

 

Neural Net Prediction starts with a true artificial intelligence (AI), which uses a simplified approximation of the human brain (neural net), to learn how your programs behave. Applications, and the languages  used to write those applications, are human-created and have predictable patterns. Humans love patterns, and those patterns hidden in the applications can be learned!

 

The learned patterns form a behavioral history of an application, and that history lets the processor predict what a program is likely to do in the future. The Ryzen processor uses those predictions to pre-load certain capabilities—like storing to RAM, adding numbers, or comparing values together—so they’re ready to go before your application even makes a request. This saves processing time, and contributes to higher processor performance.

 

It’s important to know that the behavioral learning of Neural Net Prediction is temporary. The history is emptied when you launch a new application, or when the PC is reset or powered down. The applications you run re-train the neural net each time, and you might find that the second time you run a benchmark is a little faster than the first. That’s Neural Net Prediction at work!
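
This post doesn’t say what kind of network the processor uses or exactly what it predicts, so the following is only a rough illustration of "learning a pattern from recent history." It is a toy perceptron-style predictor of the kind described in textbooks on branch prediction. Every name and parameter here (`PerceptronPredictor`, `history_len`, `threshold`, `accuracy`) is an assumption for the sketch, not a description of the Ryzen design.

```python
class PerceptronPredictor:
    """Toy perceptron that predicts the next outcome (+1 or -1) from recent history."""

    def __init__(self, history_len=8, threshold=16):
        self.weights = [0] * (history_len + 1)  # index 0 is the bias weight
        self.history = [1] * history_len        # oldest outcome first
        self.threshold = threshold

    def _output(self):
        return self.weights[0] + sum(
            w * h for w, h in zip(self.weights[1:], self.history)
        )

    def predict(self):
        return 1 if self._output() >= 0 else -1

    def update(self, outcome):
        y = self._output()
        guess = 1 if y >= 0 else -1
        # Train on a misprediction, or while the output is not yet confident.
        if guess != outcome or abs(y) <= self.threshold:
            self.weights[0] += outcome
            for i, h in enumerate(self.history):
                self.weights[i + 1] += outcome * h
        self.history = self.history[1:] + [outcome]


def accuracy(pattern, repeats):
    """Fraction of correct predictions on each pass over a repeating pattern."""
    predictor = PerceptronPredictor()
    results = []
    for _ in range(repeats):
        correct = 0
        for outcome in pattern:
            correct += predictor.predict() == outcome
            predictor.update(outcome)
        results.append(correct / len(pattern))
    return results


runs = accuracy([1, 1, 1, -1], repeats=20)
print(runs[0], runs[-1])
```

On this repeating pattern the toy predictor gets 75% of its guesses right on the first pass and 100% once it has trained, which is the same "second run is a little faster" effect in miniature.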

 

Smart Prefetch

 

Before the Ryzen processor can start to run your applications, relevant data must be brought into the processor and stored in local cache. Cache is ultra-fast memory located right on the processor, and processors like Ryzen achieve peak performance when important data fits into that cache.

 

It’s worth highlighting that “data” typically means “code,” where entire sub-routines of a running program are stored in cache. This can reduce or eliminate the odds that the processor has to reach across the motherboard to retrieve data from your RAM. Although the RAM is only a few inches away from the processor to your eyes, that’s a very long way from the perspective of a processor, so cache is paramount for top performance.

 

But feeding the cache with data is only half the battle. Getting the right data is the other half of the equation, and that’s where Smart Prefetch shines. Smart Prefetch consists of sophisticated learning algorithms that intuit what data is most used and most relevant in your applications. Smart Prefetch can then prioritize the important data, or even predict the important data, so it’s ready to go before the application needs it.
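
The learning algorithms inside Smart Prefetch aren’t described in this post. To give a flavor of what "predicting the important data" can mean, here is a sketch of one of the simplest textbook approaches, a stride prefetcher: if the program keeps stepping through memory by the same amount, fetch the next step early. The class and method names are hypothetical.

```python
class StridePrefetcher:
    """Toy stride detector: after the same address delta is seen twice in a
    row, predict that the next access continues the pattern."""

    def __init__(self):
        self.last_addr = None
        self.last_stride = None

    def access(self, addr):
        """Record an access; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


pf = StridePrefetcher()
prefetches = [pf.access(a) for a in (0, 64, 128, 192, 1000, 1064)]
print(prefetches)
```

The detector starts issuing prefetches once the 64-byte stride repeats, and goes quiet again when the access pattern jumps somewhere else.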

 

Having the next important dataset queued for execution behind the current work helps ensure that the Ryzen processor always has a consistent flow of high-quality data. And with an atypically large 20MB combined cache, Ryzen 7 1800X, 1700X and 1700 processors are uniquely equipped to handle large datasets common in scientific or creative workloads.

 

Wrap-up

 

More cores and faster cores are well-trodden ground in the PC industry (though we dare say the Ryzen™ processor is the best blend yet!), but AMD is exploring a new horizon with smarter cores. Armed with sophisticated learning algorithms, neural networks, and uncanny powers of prediction, the Ryzen processor is an incredibly intelligent and rational agent that’s ready and waiting to zero in on the exact level of performance and power efficiency you and your applications deserve.

 

Robert Hallock is a technical marketing guy for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

 


1. Not all AMD Ryzen™ processors offer every feature of AMD SenseMI technology. For specific capabilities of different processor models, please visit www.amd.com. If your system is pre-built, contact your manufacturer for additional information.

 

 

2. Default POVRay rendering preset. AMD system configuration: Ryzen 7 1800X (8C16T/3.6-4.0GHz), 2x8GB DDR4-2400, AMD Customer Reference Motherboard, NVIDIA Titan X (Pascal), NVIDIA Driver 21.21.13.7633, Windows 10 x64. Intel system configuration: Core i7-6900K Extreme (8C16T/3.2-3.7GHz), 2x8GB DDR4-2400, ASUS STRIX X99 Gaming motherboard, NVIDIA Titan X (Pascal), NVIDIA Driver 21.21.13.7633, Windows 10 x64. Average wall power draw: 157.45W (AMD) vs. 153.59W (Intel). POVRay scores: 3266 (AMD) vs. 2964.12 (Intel). Performance/Watt (higher is better): 3266/157.45W=20.74 score/W (AMD) vs. 2964.12/153.59W=19.29 score/W (Intel).

Processors have one of the most important jobs in a gaming PC: getting requests from the game to the graphics card. Everything you see and do in your favorite game must first go through the CPU, and a CPU that keeps a hungry GPU fed with a constant stream of data is a delicious recipe for great performance. That relationship was a guiding light in the design of the AMD Ryzen™ processor. We built a high-throughput machine that’s great for hungry GPUs, and today I wanted to share some gaming data with you.

 

Figure A: System configuration AMD: Ryzen 7 1800X (8C16T, 3.6-4.0GHz), 16GB DDR4-2400, AMD reference motherboard, AMD Wraith Max cooler. System configuration Intel: Core i7-6900K (8C16T, 3.2-3.7GHz), 16GB DDR4-2400, Asus STRIX X99 Gaming, Intel BXTS13A cooler. Shared configuration: NVIDIA Titan X, 3840x2160 resolution, Samsung 960 PRO 512GB NVMe, graphics driver 21.21.13.7633. Game settings: Ashes of the Singularity (Crazy preset), Battlefield 4 (Ultra preset), DOOM (Ultra preset), GTAV (Default preset), Civilization VI (Ultra preset), Alien: Isolation (Ultra preset, standard SSAO).

 

At first blush, you can already see that performance of the flagship Ryzen 7 1800X processor makes it a great chip for gamers with high-end needs. Average framerates are >60 FPS for the titles we looked at today, and you can see that level of performance across a diverse set of graphics APIs: Vulkan®, DirectX® 12 and DirectX® 11. It’s clear that the 1800X is a processor that’s ready for APIs of today and tomorrow.

 

99th Percentile Frame Rates

You may not be familiar with 99th percentile frame rates (“99th%”), represented above with the dataset on the left half of the chart. This is a groundbreaking approach that objectively measures the smoothness of a game. It was pioneered by my friend and colleague Scott Wasson during his time as Editor-in-Chief and Owner of The Tech Report. His seminal work, “Inside the Second,” sought to explain why games with high framerates could still often feel choppy to users. He did so by asking the following question: how fast are frames being rendered 99% of the time, and how slow is that last 1%?

 

His research showed that a great many games reporting high average framerates were also frequently throwing many slow frames into the mix. That last 1% of all frames took much longer to render than average, and they happened often enough that the naked eye would perceive the game’s motion as choppy. The average FPS value was hiding problematic rendering! He also found that games with higher 99th% framerates just generally felt smoother to play. But you can cut the percentages any way you like, so he also found games that would look good 50% of the time—generating great average framerates—but run very slowly the other 50% of the time. These games felt awful to play, but nobody had objectively demonstrated why before Mr. Wasson’s work.

 

This is why 99th% frame rates are an essential piece of data in our gaming analysis. Higher 99th% values are simply a better measurement of a game’s true experience, because it looks past outliers that can contaminate—for good or bad—the average framerate. So, what about Ryzen? Looking great! The Ryzen™ 7 1800X is definitely a stellar chip in 99th% frame rates, especially in Battlefield™ 4 and DOOM™.
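
To make the difference between the two metrics concrete, here is a small Python sketch that computes both from a list of frame times. The frame-time lists are invented for illustration (they are not our test data), the function name `fps_summary` is hypothetical, and the nearest-rank percentile method is just one reasonable choice.

```python
def fps_summary(frame_times_ms):
    """Return (average FPS, 99th-percentile FPS) for frame times in milliseconds."""
    avg_fps = 1000 * len(frame_times_ms) / sum(frame_times_ms)
    ordered = sorted(frame_times_ms)
    # Nearest-rank percentile: the frame time that 99% of frames meet or beat.
    rank = (99 * len(ordered) + 99) // 100  # integer ceiling of 0.99 * n
    p99_fps = 1000 / ordered[rank - 1]
    return avg_fps, p99_fps


smooth = [12.5] * 100              # every frame takes 12.5 ms
choppy = [10.0] * 95 + [60.0] * 5  # mostly fast, with a few long stalls

print(fps_summary(smooth))
print(fps_summary(choppy))
```

Both made-up runs average 80 FPS, but the choppy one drops to roughly 16.7 FPS at the 99th percentile. That gap is exactly what the average hides.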

 

Incredible performance for your money

The sensitivity of 99th% frame rate also makes it a great ingredient to help measure the true value of a processor. We know value is important to PC gamers at any price; nobody wants to feel like they paid more than they had to.

 

To objectively measure “value,” we take the average of the 99th% FPS in the six games we just looked at, then plot that level of performance over the suggested retail price. This visualizes how much average performance you’re getting 99% of the time for your hard-earned cash. Dots towards the upper left of the chart represent a better value for you (more performance, less money). The value of the 1800X is simply extraordinary: it offers a super smooth 99th% experience at half the price.

 

Figure B: 99th Percentile Per Dollar is the mean of the 99th percentile frame rates of all tested titles in Figure A, on the same system(s) as Figure A. Core i7-6900K pricing ($1099 USD) obtained from Intel ARK as of 2/1/2017. AMD Ryzen™ 7 1800X pricing ($499 USD) is AMD SEP as of 2/1/2017.
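
The value metric behind Figure B can be sketched as follows. The prices are the ones quoted in the Figure B note; the per-game frame rates are placeholders, since the measured values live in the chart rather than in the text, and the function name `fps_per_dollar` is hypothetical.

```python
from statistics import mean

def fps_per_dollar(p99_fps_by_game, price_usd):
    """Mean 99th-percentile FPS across games, divided by the processor price."""
    return mean(p99_fps_by_game.values()) / price_usd


# Placeholder frame rates for illustration only; not AMD's measurements.
placeholder = {"Game A": 60.0, "Game B": 70.0, "Game C": 80.0}

ryzen = fps_per_dollar(placeholder, 499)    # AMD SEP as of 2/1/2017
intel = fps_per_dollar(placeholder, 1099)   # Intel ARK as of 2/1/2017

print(ryzen / intel)
```

Feeding both processors identical frame rates isolates the effect of price: the $499 part comes out roughly 2.2 times ahead on this metric before any performance difference is counted.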

 

Are you ready for Ryzen™?

The AMD Ryzen™ 7 1800X Processor and 80+ motherboards are available in the market right now! Gamers should consider a motherboard based on the AMD B350 chipset for single-GPU systems, or the AMD X370 chipset for dual-GPU systems. Pair that with a speedy NVMe SSD, plus 8-16GB of dual channel DDR4-2667, and you’re off to the races with a seriously powerful gaming rig.

 

And if you’ve already pressed the “order” button, let us know on Twitter @AMDRyzen! We’d love to see pictures of your new build when the parts arrive.

 

Robert Hallock is a technical marketing guy for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

Correction notice: The original version of this blog incorrectly indicated that Battlefield™ 4 was running in DirectX® 12 mode. This has been corrected to DirectX® 11.

Last month we showed you how AMD FX® customers are getting supercharged Battlefield® 1 performance with DirectX® 12. This month, the good news keeps rolling straight into virtual reality: the AMD FX® 4350, 6350, 8350, 8370, 9370 and 9590 are now approved processors for the Oculus Rift™. With the power of these AMD FX® processors and a GPU like the Radeon™ RX 470, an Oculus-approved experience is now incredibly accessible for millions of gamers.

 

A peek behind the curtains

 

Today’s certification is the exciting conclusion to the unveiling of Asynchronous Space Warp (ASW) in October. In a nutshell, ASW allows a PC to compare the differences between two rendered frames to quickly create a third frame with all of its scenery in the proper place. Vitally, the ASW frame is inferred from a comparison, rather than rendered in full, so it’s quick to make.

 

ASW is an important tool to address the often challenging issue of rapid head movement in virtual reality. Fast or erratic movement in a VR environment can be tough for any gaming PC to handle, and that can be compounded as the visual fidelity of the game increases: the margin for error just gets smaller and smaller. If the user is looking around while the PC is between complete frames, the inferred frames from asynchronous space warp are an excellent way to smooth over that margin to sustain a fluid VR environment.

 

It seems uncomplicated from the 10,000-foot view, but asynchronous space warp is compute-intensive—it’s truly predicting the future based on past frames! Accurate ASW requires agility from the underlying PC, and a robust flow of information from the processor to a GPU’s compute pipelines. To be clear, the stakes are pretty high: there is the viscerally unpleasant possibility that players could become sick if the system is not up to the task of enforcing smooth gameplay.

 

We take that responsibility very seriously, and so does our hardware. The Oculus “seal of approval” for these AMD FX processors confirms that our powerful multi-core chips are more than up to the task of delivering the smooth experience you deserve on a budget you’ll love.

Battlefield™ 1 has now been on the scene for a spell, and we hope y’all are having a blast storming the trenches with powerful Great War weapons like the mighty Kolibri. Between rounds, we’ve been crunching the numbers on the new DirectX® 12 renderer in Battlefield 1’s Frostbite Engine, and AMD FX users are in for a real treat: 30-46% higher framerates!1

 

Here it is, plain as day:

bf1_blog.png

But… how?

The secret lies in a DirectX® 12 feature called “multi-threaded command buffer recording,” which we covered in detail last year. The short version is pretty straightforward: MTCBR allows a game’s “to-do list” (its geometry, texture, physics, and other requests) to be interpreted and passed to the GPU by multiple CPU cores, rather than just one or two cores as in DirectX® 11.

 

Because the processor can tackle the to-do list more quickly with DirectX® 12, the flow of information into the graphics card can be accelerated, which helps rendering tasks spend less time waiting around for important bits to appear.
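
The shape of that idea can be sketched in Python. This is a structural illustration only: `record_commands` and `build_frame` are hypothetical stand-ins, not DirectX® 12 calls, and Python threads won’t show the real speedup a game engine gets from native code on multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor

def record_commands(chunk):
    """Stand-in for translating one slice of the frame's work into GPU commands."""
    return [f"draw({item})" for item in chunk]

def build_frame(work_items, workers):
    """Split the to-do list across workers, record in parallel, submit in order."""
    chunks = [work_items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        command_lists = list(pool.map(record_commands, chunks))
    # Submission stays ordered: the recorded lists are handed over one by one.
    return [cmd for cmds in command_lists for cmd in cmds]


frame = build_frame(["terrain", "units", "shadows", "particles"], workers=2)
print(frame)
```

Each worker records its own command list independently, and only the final hand-off to the GPU is serialized.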

 

In software as in real life: having more hands for a complex job just gets things done a little (or a lot) more quickly. See you on the Battlefield!

 

Robert Hallock is an evangelist for CPU/APU technologies and IP at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

FOOTNOTES:

1. Testing conducted by AMD Performance Labs as of 19 October, 2016 on the AMD FX 8370, FX 8350, FX 8300, FX 6350 and FX 6300. Test system: Radeon™ RX 480 GPU, 8GB DDR3-1866, 512GB SanDisk X300 SSD, Windows 10 Pro x64, Radeon™ Software 16.9.2, 1920x1080 resolution, Ultra in-game preset. Average framerates DirectX® 11 vs. 12: AMD FX-8370 (66.9 vs. 86.9), FX-8350 (61.58 vs. 84.89), FX-8300 (58.76 vs. 80.6), FX-6350 (60.03 vs. 80.48), FX-6300 (52.38 vs. 76.24).  PC manufacturers may vary configurations, yielding different results. Results may vary with future drivers. DTV-84
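
The 30-46% range quoted at the top of this post can be reproduced from the averages in footnote 1 with a few lines of Python. The variable names are arbitrary; the numbers are copied from the footnote.

```python
# Average FPS from footnote 1: (DirectX 11, DirectX 12)
results = {
    "FX-8370": (66.9, 86.9),
    "FX-8350": (61.58, 84.89),
    "FX-8300": (58.76, 80.6),
    "FX-6350": (60.03, 80.48),
    "FX-6300": (52.38, 76.24),
}

# Percentage gain of DirectX 12 over DirectX 11 for each processor.
uplift = {cpu: (dx12 / dx11 - 1) * 100 for cpu, (dx11, dx12) in results.items()}

for cpu, pct in uplift.items():
    print(f"{cpu}: {pct:.1f}%")
```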

We’ve been watching and listening when you’ve been asking about the status of a UEFI VBIOS for certain Radeon™ GPUs. Those of you who know what that is are likely quite interested in reading the rest of this blog, and you’ll be pleased to know that we have a solution for you.

 

A little background on UEFI

Unified Extensible Firmware Interface (UEFI) is a relatively new standard for motherboard firmware that replaces the classic BIOS firmware standard. UEFI offers neat features like smart hardware monitoring, full color and high-resolution GUIs, PCIe® SSD booting, mouse and flash drive support and more. UEFI is also an essential player in the chain of custody driving the SecureBoot and fast boot features in Windows® 8, 8.1 and 10. Other devices essential to the PC boot process, like GPUs, can also have a firmware that is compliant with the requirements of UEFI.

 

If every device on the system has UEFI-compliant firmware, then a UEFI motherboard can disable a feature called Compatibility Support Module (CSM) to get the fastest possible boot times in Windows.

 

In the race to obtain faster boot times, GPUs are in an interesting position:

  1. Loading a GPU with a UEFI-compliant firmware renders it incompatible with motherboards that still run BIOS firmware. These motherboards will never boot in this configuration.
  2. Loading a GPU with a “legacy” BIOS-compliant firmware maximizes compatibility, ensuring motherboards with a BIOS firmware can boot the GPU. However, UEFI motherboards must enable CSM to interpret and run the “legacy” GPU BIOS—boot times are slowed as a result.
  3. Loading a GPU with a “hybrid” firmware that contains both UEFI and BIOS-compliant firmware works just fine for UEFI motherboards, but some older motherboards with BIOS firmware cannot read the newer hybrid GPU firmware and do not boot.

 

Despite the drawbacks, it seems clear that option #2 is the best way forward to ship a GPU that works with everyone’s hardware. Options #1 and #3 would result in GPUs that simply don’t boot for millions of customers that have otherwise perfectly fine motherboards configured with BIOS firmware.
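
The three options boil down to a small compatibility matrix, sketched here as a Python function. The function name and the string labels are invented for this example; the outcomes simply restate the list above.

```python
def boot_outcome(gpu_firmware, board_firmware):
    """gpu_firmware: 'uefi', 'legacy' or 'hybrid'; board_firmware: 'uefi' or 'bios'."""
    if board_firmware == "uefi":
        if gpu_firmware == "legacy":
            # Option 2: the board must keep CSM enabled to run the legacy GPU BIOS.
            return "boots, CSM required (slower)"
        # Options 1 and 3: UEFI-capable GPU firmware on a UEFI board.
        return "boots, CSM can be disabled (fastest)"
    # Motherboards that still run BIOS firmware
    if gpu_firmware == "legacy":
        return "boots"
    if gpu_firmware == "hybrid":
        return "may not boot on some older boards"
    return "does not boot"


for gpu in ("uefi", "legacy", "hybrid"):
    for board in ("uefi", "bios"):
        print(f"{gpu} GPU firmware on {board} board: {boot_outcome(gpu, board)}")
```

Only the "legacy" row boots on every board, which is why option #2 was the safe default.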

 

The Rise of UEFI

In recent months, new chipsets and I/O standards (e.g. M.2 or USB 3.1 Gen 2) have driven a wave of new motherboards overwhelmingly based on UEFI firmware. These exciting features have understandably driven a broad-based upgrade cycle that has flushed older motherboards with BIOS firmware out of the market. The appetite for UEFI-compliant GPUs has grown.

 

We anticipated this trend! Since the advent of the Radeon™ R9 300 and Fury Series GPUs, our board manufacturing partners (“AIBs”) have had access to source code suitable for building customized UEFI-based firmwares. Many AIBs have already transitioned to UEFI by including this code in their custom firmware images, or have implemented solutions like “dual-BIOS” switches to work around the potential issues with BIOS-based motherboards. Today, it’s quite easy to find a UEFI-compliant Radeon™ R9 300 or Fury Series GPU that enables a pure EFI boot environment and the fastest boot modes.

 

UEFI GPU Firmware Upgrade

We have been tracking the chatter from a small and passionate group of users with Radeon™ R9 Fury X or R9 Nano GPUs that shipped with BIOS-compliant firmware for compatibility reasons. These users tell us they would prefer UEFI-compliant firmware. We hear you loud and clear, and we want you to know that we’re able to assist on these specific products because they track rather closely to our original hardware/firmware designs.

 

As a result, today we are releasing AMD-built UEFI-compliant GPU firmware for the Radeon™ R9 Fury X and R9 Nano GPUs. These firmware images can be flashed to any Radeon™ R9 Fury X and R9 Nano GPU, respectively, to enable UEFI compliance and a pure EFI boot environment.

 

Download images:

1. Radeon™ R9 Fury X GPU firmware image

2. Radeon™ R9 Nano GPU firmware image

 

We appreciate all of your passionate feedback on this topic, and we hope you enjoy quicker and more secure boot times!

PC gamers that want to game on the go have always faced some tough choices when buying a notebook. Do we buy a gaming notebook that’s great to game on, but tough to carry? Or an ultrathin that’s easy to carry, but tough to game on? Some of us just buy two notebooks. Some of us buy a gaming notebook, wishing it were lighter every time they carry it. Some just buy the ultrathin, acknowledging that comfortable portability is probably more important than gaming over the long run. Every choice has drawbacks.

 

 

dilemma.png

 

Many gamers—myself included!—have dreamed of buying the best of both worlds with a lightweight notebook or 2-in-1 that also supports a powerful external graphics card. The notebook or 2-in-1 could be conveniently lightweight for work, relaxing on the couch, or travel. But, when needed, the PC could also tap into serious framerates and image quality with a powerful external GPU that’s not far from carrying an average gaming notebook. The point is: you choose.

 

A system compatible with AMD XConnect™ technology could offer exactly that.1

 

AMD XConnect™ technology is a new feature debuting in today’s Radeon Software 16.2.2 (or later) graphics driver that makes it easier than ever to connect and use an external Radeon™ graphics card in Windows® 10. External GPU enclosures configured with select Radeon™ R9 GPUs can easily connect to a compatible notebook or 2-in-1¹ over Thunderbolt™ 3. Best of all, a PC configured with AMD XConnect™ technology and external Radeon™ graphics can be connected or disconnected at any time, similar to a USB flash drive, which is a first for external GPUs.

 

xconnect.png


And it happens that there’s already one company out there that’s incorporating all of these pieces into an amazing package, which brings me to…

 

AMD XConnect™ In Action: Razer Blade Stealth & Razer Core

 

core.jpg

 

The Razer Blade Stealth with Thunderbolt™ 3 is an exciting new notebook that’s also the first to be compatible with AMD XConnect™ technology. The Razer Core, meanwhile, is an optional external graphics enclosure that connects to the Blade Stealth with Thunderbolt™ 3. Gamers are in for some pretty exciting features/convenience if the Core is configured with a Radeon™ R9 300 Series GPU:

  • Plug in, game on: There’s no need to reboot the PC to connect or disconnect the Razer Core thanks to AMD XConnect™ technology.
  • Flexible displays: Our driver gives you the flexibility to choose between gaming on the Blade Stealth’s display, or standalone monitors of your choice.
  • Upgradeable: We plan to continue testing and adding Radeon™ GPUs to the AMD XConnect™ support list, giving you the power to upgrade beyond the Radeon™ R9 300/Fury Series when the time is right for you.

perf.png

 

 

A Three-Party Collaboration

 

parties.png


The intersection of AMD XConnect™, the Razer Blade Stealth/Core, and Thunderbolt™ 3 is not a coincidence. AMD, Razer, and the Intel Thunderbolt™ group have been working for many months to architect a comprehensive hardware/software solution that brings plug’n’play external graphics to life over Thunderbolt™ 3. The first external graphics solution that “works like it should!”

 

It came from a simple place: we collectively shared a dream that external GPUs were an important step forward for the PC industry, but were adamant that three things were “must haves” for external graphics to finally be a serious option for gamers:

 

  1. The external GPUs had to have a graphics driver with all the right bits for simple plug’n’play use. With AMD XConnect™ technology, Radeon™ R9 300 and Fury Series GPUs now support this in Windows® 10.
  2. The external GPUs had to connect to a system with standardized connectors/cables and enough bandwidth to feed the appetite of a high-end GPU. Thunderbolt™ 3 does that very well.
  3. And the external chassis had to be upgradeable, so users could prolong the life of their system and buy into a performance level that’s right for their needs. The Razer Core supports that with gusto: up to 375W, dual slot, 12.2” PCB. You could easily fit a Radeon™ R9 Nano or 390X GPU in there!2

 

And so our joint project began with regular engineering and marketing meetings to design, build and test: drivers, enclosures, cabling, BIOSes, and so much more. After months of work and hundreds of man hours, here we are!

 

The Future of AMD XConnect™ technology

 

Future external GPU solutions from other companies may come in many shapes and sizes. Some may be very compact with integrated mobile Radeon™ GPUs. Other vendors might allow you to buy empty user-upgradeable enclosures that accept desktop Radeon™ GPUs of varying lengths. We foresee that there will be choice, and the choice will be yours.

 

To keep it easy, we will be maintaining a list of systems, system requirements, GPUs and enclosures that are compatible with AMD XConnect™ on www.amd.com/xconnect.

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 


FOOTNOTES:
1. Not all notebooks or 2-in-1s feature AMD XConnect™ technology, and not all external graphics (eGFX) enclosures are pre-configured with an AMD Radeon™ graphics card and/or feature user upgradability. Base system’s software package and BIOS must be configured to support AMD XConnect™ technology. System must have Thunderbolt™ 3 connection. Check with your manufacturer for full specifications and capabilities and visit www.amd.com/xconnect for a list of compatible devices. GD-86   

2. GPU upgrade must be supported by the system and enclosure OEM. New GPU must be supported by AMD XConnect™ technology. Visit your product’s support documentation for additional information. GD-87

 

ATTRIBUTIONS:
THUNDERBOLT AND THE THUNDERBOLT LOGO ARE TRADEMARKS OF INTEL CORPORATION IN THE U.S. AND/OR OTHER COUNTRIES. RAZER, RAZER DISTRESSED LOGO, TRIPLE-HEADED SNAKE LOGO ARE ALL TRADEMARKS OR REGISTERED TRADEMARKS OF RAZER INC. IN THE UNITED STATES AND/OR OTHER COUNTRIES. USB TYPE-C™ AND USB-C™ ARE TRADEMARKS OF USB IMPLEMENTERS FORUM. FALLOUT, FALLOUT: NEW VEGAS, FALLOUT SHELTER, VAULT BOY AND RELATED LOGOS ARE TRADEMARKS OR REGISTERED TRADEMARKS OF BETHESDA SOFTWORKS LLC IN THE U.S. AND/OR OTHER COUNTRIES.

Last week Ashes of the Singularity™ was updated with comprehensive support for DirectX® 12 Asynchronous Compute. This momentous occasion not only demonstrated how fast Radeon™ GPUs are in DirectX® 12 games, but how much “free” performance can be gained with our exclusive support for asynchronous compute.

 

A Brief Primer on Async Compute

Important in-game effects like shadowing, lighting, artificial intelligence, physics and lens effects often require multiple stages of computation before determining what is rendered onto the screen by a GPU’s graphics hardware.

 

In the past, these steps had to happen sequentially. Step by step, the graphics card would follow the API’s process of rendering something from start to finish, and any delay in an early stage would send a ripple of delays through future stages. These delays in the pipeline are called “bubbles,” and they represent a brief moment in time when some hardware in the GPU is paused to wait for instructions.

 

thread.PNG
A visual representation of DirectX® 11 threading: graphics, memory and compute operations are serialized into one long production line that is prone to delays.

 

Pipeline bubbles happen all the time on every graphics card. No game can perfectly utilize all the performance or hardware a GPU has to offer, and no game can consistently avoid creating bubbles when the user abruptly decides to do something different in the game world.

 

What sets Radeon™ GPUs apart from their competitors, however, is the Graphics Core Next architecture’s ability to pull in useful compute work from the game engine to fill these bubbles. For example: if there’s a rendering bubble while rendering complex lighting, Radeon™ GPUs can fill in the blank with computing the behavior of AI instead. Radeon™ graphics cards don’t need to follow the step-by-step process of the past or of their competitors, and can do this work together, or concurrently, to keep things moving.

 

threading.PNG
A visual representation of DirectX® 12 asynchronous compute: graphics, memory and compute operations decoupled into independent queues of work that can run in parallel.

 

Filling these bubbles improves GPU utilization, input latency, efficiency and performance for the user by minimizing or eliminating the ripple of delays that could stall other graphics cards. Only Radeon™ graphics currently support this crucial capability in DirectX® 12 and VR.
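
A toy timing model shows why filling bubbles helps. It is deliberately simplistic: the stage timings are hypothetical, the function names are made up, and it assumes compute work can be sliced to fit any idle gap, which real workloads can’t always do.

```python
def frame_time_serial(graphics, compute_ms):
    """graphics: list of (busy_ms, bubble_ms) stages. Compute runs afterwards."""
    return sum(busy + bubble for busy, bubble in graphics) + compute_ms

def frame_time_async(graphics, compute_ms):
    """Compute work soaks up the bubbles first; only the leftover adds time."""
    bubbles = sum(bubble for _, bubble in graphics)
    leftover = max(0.0, compute_ms - bubbles)
    return sum(busy + bubble for busy, bubble in graphics) + leftover


stages = [(4.0, 1.0), (5.0, 2.0)]  # hypothetical: 3 ms of bubbles in total

print(frame_time_serial(stages, 2.5))
print(frame_time_async(stages, 2.5))
```

In this made-up frame, 2.5ms of compute fits entirely inside 3ms of idle time, so the frame finishes in 12ms instead of 14.5ms.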

 

Ashes of the Singularity™: Async Compute in Action

chart.png

AMD Internal testing. System config: Core i7-5960X, Gigabyte X99-UD4, 16GB DDR4-2666 Radeon™ Software 15.301.160205a, NVIDIA 361.75 WHQL, Windows® 10 x64.

 

Here we see that the Radeon™ R9 Fury X GPU is far and away the fastest DirectX® 12-ready GPU in this test. Moreover, we see such powerful DirectX® 12 performance from the GCN architecture that a $400 Radeon™ R9 390X GPU ties it up with the $650 GeForce GTX 980 Ti.1 Up and down the product portfolios we tested, Radeon™ GPUs not only win against their equivalent competitors, but often punch well above their pricepoints.

 

You don’t have to take our word for it. Tom’s Hardware recently explored the performance implications of DirectX® 12 Asynchronous Compute, and independently verified the commanding performance wins handed down by Radeon™ graphics.

 

“AMD is the clear winner with its current graphics cards. Real parallelization and asynchronous task execution are just better than splitting up the tasks via a software-based solution,” author Igor Wallossek wrote.

 

Other interesting data emerged from the THG analysis, summarized briefly:

  • The Radeon™ R9 Fury X gets 12% faster at 4K with DirectX® 12 Asynchronous Compute. The GeForce 980 Ti gets 5.6% slower when attempting to use this powerful DirectX® 12 feature.
  • DirectX® 12 CPU overhead with the Radeon™ R9 Fury X GPU is an average of 13% lower than the GeForce 980 Ti.
  • The Radeon™ R9 Fury X GPU is a crushing 98% more efficient than the GeForce 980 Ti at offloading work from the CPU to alleviate CPU performance bottlenecks. At 1440p, for example, THG found that the Fury X spent just 1.6% of the time waiting on the processor, whereas the 980 Ti struggled 82.1% of the time.
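
For readers who want to check the arithmetic, the 98% figure appears to follow from the two 1440p wait times quoted above. This is an assumption about how the number was derived, not something the THG analysis states; a quick sketch in Python:

```python
# Share of time spent waiting on the CPU at 1440p, as quoted from THG above.
fury_x_wait = 1.6       # percent
gtx_980_ti_wait = 82.1  # percent

# Assumed derivation: relative reduction in wait time versus the 980 Ti.
reduction = (gtx_980_ti_wait - fury_x_wait) / gtx_980_ti_wait
print(f"{reduction:.1%}")  # 98.1%
```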

 

Of asynchronous compute, Wallossek later concludes: “This is a pretty benchmark that serves up interesting results and compels us to wonder what's coming to PC gaming in the near future? One thing we can say is that AMD wins this round. Its R&D team, which implemented functionality that nobody really paid attention to until now, should be commended.”

 

We couldn't have said it better ourselves.

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 


Footnotes:

1. Prices in $USD based on Newegg.com as of February 29, 2016. Happy leap day!

Today is an exciting day for PC gaming enthusiasts: the Khronos Group has announced immediate public release of the open standard Vulkan™ 1.0 graphics API! To mark the occasion, we’ve posted a Radeon Software beta for Vulkan. This graphics driver is primarily intended to enable a wider audience of game developers to work with Vulkan on Radeon™ graphics.

 

What is Vulkan?

From the consortium that brought you OpenGL, Vulkan is a new graphics API for developers who want or need deeper hardware control. Designed with “low-overhead” capabilities, Vulkan gives devs total control over the performance, efficiency, and capabilities of Radeon™ GPUs and multi-core CPUs.

 

Compared to OpenGL, Vulkan substantially reduces “API overhead,” which is background work a CPU must do to interpret what a game is asking of the hardware. Reducing this overhead gives hardware much more time to spend on delivering meaningful features, performance and image quality. Vulkan also exposes GPU hardware features not ordinarily accessible through OpenGL.

 

Vulkan inherits these capabilities from AMD’s Mantle graphics API. Mantle was the first of its kind: the first low-overhead PC graphics API, the first to grant unprecedented access to PC GPU resources, and the first to offer absolute control of those resources. Most importantly for gamers, Mantle got the industry thinking about how much additional GPU performance could be unlocked with a low-overhead graphics API.

 

Though the Mantle API was tailored for AMD hardware, Mantle was also designed with just enough hardware abstraction to accommodate almost any modern graphics architecture.  That architecture proved useful when we contributed the source code and API specification of Mantle to serve as the foundation of Vulkan in May of 2015.

 

Since that time, Vulkan has been forged under the stewardship of a comprehensive industry alliance that spans the hardware development, game development and content creation industries. Many new and significant capabilities have been added, such as support and performance optimizations for Android® smartphones and tablets, or cross-OS support for Windows® 7, Windows® 8.1, Windows® 10, and Linux®.

 

What our driver supports

AMD has been participating in Vulkan’s development since its inception and providing builds of our Vulkan-enabled driver to game developers for many months. As we transition into the public phase, our initial driver release enables Vulkan support for select Radeon™ GPUs on Windows® 7, Windows® 8.1, and Windows® 10. An upcoming release of the amdgpu Linux driver will also feature Vulkan support.

 

Please note that this initial Windows driver is not packaged with DirectX® driver components, so it is not a suitable replacement for your everyday graphics driver.

 

Our Vulkan driver supports the following AMD APUs and Radeon™ GPUs1 based on the Graphics Core Next architecture:

 

What are some of the Radeon™ graphics features Vulkan exposes?

Only Radeon™ GPUs built on the GCN Architecture currently have access to a powerful capability known as asynchronous compute, which allows the graphics card to process 3D geometry and compute workloads in parallel. As an example, this would be useful when a game needs to calculate complex lighting and render characters at the same time. As these tasks do not have to run serially on a Radeon™ GPU, this can save time and improve overall framerates. Game developers designing Vulkan applications can now leverage this unique hardware feature across all recent versions of Windows and Linux.

 

Capture.PNG

Another new feature that Radeon™ GPUs support with Vulkan is multi-threaded command buffers. Games with multi-threaded command buffers can dispatch chunks of work to the GPU from all available CPU cores. This can keep the GPU occupied with meaningful work more frequently, leading to improved framerates and image quality. Vulkan brings this performance advantage to recent versions of Windows and Linux.
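
The shape of multi-threaded command buffer recording can be sketched in a few lines of Python. This is not Vulkan code: `record_command_buffer` and `build_frame` are hypothetical names, the "commands" are plain strings, and Python threads only illustrate the structure rather than the real performance benefit.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_buffer(chunk):
    """Hypothetical recorder: turn a chunk of scene objects into draw commands."""
    return [f"draw({obj})" for obj in chunk]


def build_frame(objects, workers=4):
    # Split the scene into one chunk per worker thread.
    chunks = [objects[i::workers] for i in range(workers)]
    # Each worker records its own command buffer independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        buffers = list(pool.map(record_command_buffer, chunks))
    # The recorded buffers are then handed to the GPU queue one after another.
    return [cmd for buf in buffers for cmd in buf]


frame = build_frame([f"mesh{i}" for i in range(8)])
print(len(frame))  # 8
```

The key design point is that recording happens in parallel while submission stays ordered, so no worker has to wait on another to describe its share of the scene.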

 

Finally, Vulkan has formal support for API extensions. API extensions allow AMD to design new hardware capabilities into future Radeon™ GPUs, then immediately expose those capabilities with a software plugin that interfaces with Vulkan in a compliant way.

 

The road ahead

As we move deeper into 2016, stay tuned to the GPUOpen website, the AMD Developer portal, and our activities at Game Developer Conference 2016. We promise to bring you a whole lot more on the exciting power and potential of the Vulkan API on Radeon™ graphics!

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

Footnote:

1. These products are based on a published Khronos specification but have not yet passed the Khronos Conformance Test Process. A fully conformant implementation of the Vulkan API will be included in a forthcoming Radeon Software release.

Radeon™ R9 Series GPUs have rapidly proven to be the definitive choice for gaming with the advent of DirectX® 12. If blistering performance isn’t enough, AMD Radeon™ R9 Series GPUs also support an incredible selection of technologies to push seriously beautiful pixels. When you need all of that in the ultimate graphics card money can buy, look no further than our fastest and most advanced DirectX® 12-ready GPU: the Radeon™ R9 Fury X. And starting today, we’ve teamed up with Dell to make it a new option on the otherworldly Alienware™ Area-51™!

 

 

 

SUPREME DIRECTX® 12 PERFORMANCE

The Radeon™ R9 Fury X GPU is built on AMD’s advanced Graphics Core Next (GCN) architecture. GCN is the scalable blueprint for all of AMD’s current GPUs, but the Radeon R9 Fury X is the best and brightest. Graphics Core Next is currently the only architecture in the world that supports DirectX® 12 asynchronous shading.

 

Asynchronous shading allows GPUs to process lighting, physics, or reflections at the same time they’re also rendering 3D objects like characters, cars or buildings. Game developers say that this can boost overall performance by up to 30%, and only Radeon™ GPUs like the R9 Fury X can do it!

 

As a recent example, Fable Legends™ from Lionhead Studios is the latest DirectX® 12-based game. The Radeon™ R9 Fury X GPU with asynchronous shading slashed the processing time of this game’s complex lighting and reflections by 50% and 69%, respectively, in AMD internal testing.

fable_image.png

The Radeon™ R9 Fury X GPU is also a powerhouse in Ashes of the Singularity™, the world’s first public DirectX® 12 game, and an incredibly demanding title at that. Nevertheless, our exclusive support for asynchronous shading played a crucial role in delivering performance gains of more than 25% when enabling DirectX® 12.

Capture.PNG
AMD Internal Testing. System configuration: AMD Radeon™ R9 Fury X, Core i7-5960X, 16GB DDR4-2666, Windows® 10 Professional x64, AMD Catalyst™ driver 15.20.1061.

 

PUSHING MORE PIXELS

Everyone’s talking about 4K resolution, and it’s easy to see why: 4K monitors pack four times as many pixels onto the screen as the normal “1080p” monitor most people have on their desk. Gaming in 4K just looks sharper, clearer and more detailed than gaming at lesser resolutions. Seeing is believing!

 

But the Radeon™ R9 Fury X GPU has many ways to bring a high-resolution gaming experience to your desk.

 

  • 4K-Ready: The Alienware™ Area-51™ gives you the option of configuring your new rig with a 4K monitor, and the Radeon™ R9 Fury X GPU is more than capable of delivering an outstanding 4K/high settings experience in the latest games like Star Wars™: Battlefront™!
  • 12K Gaming: 4K not enough? How about 12? For the truly insane, the Radeon™ R9 Fury X GPU supports multiple 4K monitors with AMD Eyefinity multi-display technology for gloriously panoramic 12K gaming.
  • Virtual Super Resolution: It’s okay if you choose a smaller Dell UltraSharp display, because the Radeon™ R9 Fury X GPU can still render all of your games at glorious quality rivalling 4K resolution with a technology we call Virtual Super Resolution (VSR). VSR renders any game at up to 4K resolution, then downsizes those images on the fly to fit the monitor you have for a beautifully smooth supersample anti-aliasing effect.
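
The numbers behind these bullets are easy to verify with a short Python sketch. Two assumptions are baked in: "12K" is read as three 4K panels side by side, and the downsizing step is modeled as a plain 2x2 box filter, which is a stand-in for illustration rather than the filter VSR actually uses.

```python
FULL_HD = (1920, 1080)
UHD_4K = (3840, 2160)


def pixel_count(resolution):
    width, height = resolution
    return width * height


# 4K carries four times the pixels of 1080p.
ratio = pixel_count(UHD_4K) / pixel_count(FULL_HD)

# Assumed "12K" layout: three 4K monitors placed side by side.
triple_width = 3 * UHD_4K[0]


def downsample_2x(image):
    """Average each 2x2 block of a grayscale image given as a list of rows."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


# Hypothetical 4x4 "high resolution" image reduced to 2x2.
hi_res = [[0, 255, 0, 0],
          [255, 0, 0, 0],
          [10, 10, 20, 20],
          [10, 10, 20, 20]]
low_res = downsample_2x(hi_res)

print(ratio)         # 4.0
print(triple_width)  # 11520
print(low_res)       # [[127.5, 0.0], [10.0, 20.0]]
```

The harsh checkerboard in the top-left block averages out to a mid-gray, which is the smoothing effect that supersampling provides along jagged edges.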

 

Capture2.PNG
AMD Internal Testing. System configuration: Radeon™ R9 Fury X GPU, Core i7-5960X, 16GB DDR4-2666, Windows® 10 Professional x64, AMD Catalyst™ 15.20.1061.

 

READY FOR VIRTUAL REALITY WITH AMD LIQUIDVR™ TECHNOLOGY

The era of serious virtual reality is upon us with engrossing games like EVE: Valkyrie and crazy fun games like Keep Talking and Nobody Explodes. Thanks to AMD LiquidVR™ technology, the Radeon™ R9 Fury X GPU in the Alienware™ Area-51™ is ready to roll. And because a picture is worth 1000 words, we have a handy infographic ready to explain all the reasons why AMD LiquidVR™ is critical for a superb VR experience.

 

vr.jpg

NOW AVAILABLE FROM DELL

Whether you care about DirectX® 12 performance, gigantic resolutions, or virtual reality, the Radeon™ R9 Fury X GPU can do it all at serious speed. Interested? Configure your own Alienware™ Area-51™ with AMD’s mightiest GPU today, then sit back and rest easy knowing that some of the world’s most veteran PC builders are expertly handling every little detail. All you’ll have to do is unbox, plug in, and game on!

Benjamin Franklin once said that there were only two certain things in life: death and taxes. Given his era, I suppose we can forgive him for not knowing about the third thing: Radeon™ graphics crushin’ it in Star Wars™ Battlefront™.

Yes, my friends, it wasn’t that long ago when the Internet exploded with joy as the Star Wars™ Battlefront™ trailer hit at E3 2013. Over the past 18 months, gamers and Star Wars fans have (im)patiently waited for the day they could finally visit worlds like Sullust. But here at AMD, we had a different job during that time: we worked shoulder-to-shoulder with our friends at EA and DICE to ensure that the Battlefront experience is unrivalled when you sit down this week to play on a Radeon™ GPU.

image016.jpg
Radeon™ graphics exclusively powered the PC reveal of Star Wars™ Battlefront™ at San Diego Comic Con 2015.

 

RADEON™ GRAPHICS PERFORMANCE IN STAR WARS™ BATTLEFRONT™

Over the past two weeks we’ve been doing the shakedown cruise on that collaboration, and the results couldn’t be any clearer: if you want the highest framerates in Star Wars™ Battlefront™, you want a Radeon™ GPU.1

perf_chart.PNG

And if you’re the sort of person that doesn’t have an AMD FreeSync™-enabled monitor, then you might want to sustain even higher framerates. The below table shows the combinations of GPUs, resolutions and in-game quality presets that can keep average performance around 60 FPS.

60fps_chart.png

WHY IS STAR WARS™ BATTLEFRONT™ SO GORGEOUS?

Many gamers have wondered how Star Wars™ Battlefront™ runs so well all the way up to 4K, especially considering how beautiful the graphics really are. “Overwhelming effort to optimize the PC experience” is the simplest explanation, but there are three specific technologies that play the largest role in the final product.

 

PHYSICALLY BASED RENDERING

Physically based rendering is a term that encompasses all the important aspects of correctly modeling and simulating how light interacts with the surfaces and materials seen in the game.

 

In order to achieve such a high level of realism, artists from DICE travelled to many real-world locations that best approximate the planets in the Star Wars galaxy. Surfaces from those locations were photoscanned to accurately gather the exact diffuse and reflective properties of materials like basalt and snow. This real-world data is fed straight into the Frostbite™ Engine and lit according to real-world parameters for light sources.

image_1.img.jpg
DICE Senior Level Artist Pontus Ryman traversing the barren landscapes of Iceland for inspiration on Sullust.

 

For example, the sun produces about 1.6 billion candela per square meter of luminance (or “brightness”) at noon, while a TV might produce around 400 candela per square meter—quite the difference!  But up until now, this enormous difference hasn’t been correctly represented in most games. The lighting calculations in Star Wars™ Battlefront™ are significantly more involved than previous lighting models, but the simulations are free of fudge factors and approximations, as everything is accurately based on the real world. The Frostbite™ renderer uses a powerful combination of complex compute shaders and pixel shaders to achieve this.
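
To put that difference in numbers, here is a quick calculation using the two luminance figures above. Expressing the gap in photographic stops (doublings of brightness) is an added convenience, not something taken from the Frostbite™ renderer.

```python
import math

sun_luminance = 1.6e9  # candela per square meter at noon, as quoted above
tv_luminance = 400.0   # candela per square meter, as quoted above

ratio = sun_luminance / tv_luminance
stops = math.log2(ratio)  # number of doublings between the two values

print(f"{ratio:,.0f}x brighter")  # 4,000,000x brighter
print(f"{stops:.1f} stops")       # 21.9 stops
```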


DISPLACEMENT MAPPED TESSELLATED TERRAIN

The terrains in Star Wars™ Battlefront™ are highly detailed, using a combination of high resolution textures and geometry that is hardware tessellated and then displacement-mapped. The degree of tessellation executed by a Radeon™ GPU is intelligently determined based on the roughness of the terrain and the distance to the camera. This adaptive detail scaling helps keep the cost of the scene within a sensible performance budget while still delivering spectacular visuals.
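
The general idea of scaling tessellation by roughness and camera distance can be sketched as follows. The formula, the `near` distance and the maximum factor of 64 are all assumptions made for illustration; the actual heuristic used by the game is not described here.

```python
def tessellation_factor(roughness, distance, max_factor=64.0, near=10.0):
    """Hypothetical heuristic for picking a tessellation factor.

    roughness: 0.0 (flat) to 1.0 (very rough terrain)
    distance:  distance from the camera in world units
    """
    # Full detail within `near` units, falling off with distance beyond that.
    distance_scale = near / max(distance, near)
    factor = max_factor * roughness * distance_scale
    # Never drop below 1 (no subdivision) or exceed the maximum.
    return max(1.0, min(max_factor, factor))


print(tessellation_factor(1.0, 5.0))    # rough terrain up close: 64.0
print(tessellation_factor(0.0, 50.0))   # flat terrain: 1.0
```

Flat or distant patches fall back to a factor of 1, which is where the performance budget is saved.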

new-trailer-for-star-wars-battlefront-survival-mode-on-tatooine-e3-2015.jpg

GPU COMPUTE AMBIENT OCCLUSION

Screen space ambient occlusion is a technique that analyzes the scene for areas that should receive less ambient light. It searches for areas where there are corners, cracks and crevices in the geometry and effectively dials back the quantity of light being projected into that space. The ambient occlusion technique used in previous Frostbite™ games has now been improved, optimized and moved from the graphics pipeline to the compute pipeline, which will run particularly well on the powerful compute hardware of the GCN architecture.
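
A heavily simplified version of the screen space idea can be written against a small depth buffer. This sketch is not the Frostbite™ implementation: it runs on the CPU, samples a fixed square neighborhood, and the `radius` and `bias` parameters are invented for the example.

```python
def ambient_occlusion(depth, radius=1, bias=0.05):
    """Return an ambient light factor per pixel (1.0 = fully lit).

    depth is a list of rows; larger values are farther from the camera.
    """
    height, width = len(depth), len(depth[0])
    ao = [[1.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            occluders = samples = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                        samples += 1
                        # A neighbor noticeably closer to the camera blocks light.
                        if depth[ny][nx] < depth[y][x] - bias:
                            occluders += 1
            if samples:
                ao[y][x] = 1.0 - occluders / samples
    return ao


# Hypothetical 3x3 depth buffer: the center pixel sits at the bottom of a crevice.
depth = [[1.0, 1.0, 1.0],
         [1.0, 2.0, 1.0],
         [1.0, 1.0, 1.0]]
ao = ambient_occlusion(depth)
print(ao[1][1])  # 0.0 (fully occluded)
print(ao[0][0])  # 1.0 (fully lit)
```

Every pixel is computed independently of the others, which is one reason this kind of work maps well onto compute hardware.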

Star-Wars-Battlefront-Shows-AT-ST-in-Action-on-Endor-480716-2.jpg

SEE YOU ON THE BATTLEFRONT

Over the past 18 months, I’ve had the privilege to play one small role amongst thousands at EA, DICE, LucasArts, Disney and AMD—and so many more—working to usher in a new era of Star Wars on the PC. As both a lifelong Star Wars fan and a PC gamer, it’s been the opportunity of a lifetime.

origin_image.png

Now comes the best part of all: staying up way later than I should playing Star Wars™ Battlefront™ in 4K on my Radeon™ R9 Fury X GPU from the comfort of my own gaming rig!

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 

FOOTNOTES:

1. Testing conducted by AMD performance labs as of November 11, 2015 on Radeon™ R9 Fury X GPU vs. GTX 980 Ti vs. Radeon™ R9 390X GPU at 4K resolution with average FPS scores of 57.2 vs. 52.03 vs. 46.3. PC manufacturers may vary configurations yielding different results. System configuration: Core i7-5960X, Gigabyte X99-UD4, 16GB DDR4-2666, Windows 10 Pro x64, AMD Catalyst™ 15.11.1, ForceWare 358.91. Endor Survival stage. [GRDT-93]

It may surprise you to learn…

 

DirectX® 12 is the very first version of the DirectX® API that has specific features, techniques and tools to support multi-GPU (mGPU) gaming. If you are indeed surprised, follow us as we take a trip through the complicated world of mGPU in PC gaming and how DirectX® 12 turns some classic wisdom on its head.

 

MULTI-GPU TODAY

Modern multi-GPU gaming has been possible since DirectX® 9, and has certainly grown in popularity during the long-lived DirectX® 11 era. Even so, many PC games hit the market with no specific support for multi-GPU systems. These games might exhibit no performance benefits from extra GPUs or, perhaps, even lower performance. Oh no!

 

Our AMD Gaming Evolved program helps solve for these cases by partnering with major developers to add mGPU support to games and engines—with resounding success! For other applications not participating in the AMD Gaming Evolved program, AMD has talented software engineers that can still add AMD CrossFire™ support through a driver update.1

 

All of this flows from the fact that DirectX® 11 doesn’t explicitly support multiple GPUs. Certainly the API does not prevent multi-GPU configurations, but it contains few tools or features to enable it with gusto. As a result, most games have used a classic “workaround” known as Alternate Frame Rendering (AFR).

 

HOW AFR WORKS

Graphics cards essentially operate with a series of buffers, where the results of rendering work are contained until called upon for display on-screen. With AFR mGPU, each graphics card buffers completed frames into a queue, and the GPUs take turns placing an image on screen.

 

AFR is hugely popular for the framerate gains it provides, as more frames can be made available every second if new ones are always being readied up behind the one being seen by a user.

 

afr.png

But AFR is not without its costs, as all this buffering of frames into long queues can increase the time between mouse movement and that movement being reflected on screen. Most gamers call this “mouse lag.”
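
The trade-off between framerate and mouse lag can be illustrated with an idealized model. It assumes the GPUs overlap perfectly and that each queued frame adds one display interval of delay; both are simplifications, and the 40 ms frame time is a made-up figure.

```python
def afr_schedule(num_frames, num_gpus):
    """GPUs take turns: frame 0 on GPU 0, frame 1 on GPU 1, and so on."""
    return [frame % num_gpus for frame in range(num_frames)]


def afr_estimate(frame_time_ms, num_gpus, queued_frames=0):
    """Idealized framerate and input latency for AFR.

    frame_time_ms: time one GPU needs to render one frame
    queued_frames: finished frames waiting ahead of the newest one
    """
    fps = 1000.0 * num_gpus / frame_time_ms
    display_interval_ms = 1000.0 / fps
    # Each frame still takes a full frame_time_ms to render, and every
    # frame queued ahead of it adds one more display interval of delay.
    latency_ms = frame_time_ms + queued_frames * display_interval_ms
    return fps, latency_ms


print(afr_schedule(6, 2))          # [0, 1, 0, 1, 0, 1]
print(afr_estimate(40.0, 1))       # (25.0, 40.0)
print(afr_estimate(40.0, 2, 1))    # (50.0, 60.0)
```

In this model the second GPU doubles the framerate, but the time from input to display does not shrink, and it grows as more frames are queued.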

 

Secondly, DirectX® 11 AFR works best on multiple GPUs of approximately the same performance. DirectX® 11 frequently cannot provide tangible performance benefits on “asymmetric configurations”, or multi-GPU pairings where one GPU is much more powerful than the other. The slower device just can’t complete its frames in time to provide meaningful performance uplifts for a user.

 

Thirdly, the modest GPU multi-threading in DirectX® 11 makes it difficult to fully utilize multiple GPUs, as it’s tough to break up big graphics jobs into smaller pieces.

 

INTRODUCING EXPLICIT MULTI-ADAPTER

DirectX® 12 addresses these challenges by incorporating multi-GPU support directly into the DirectX® specification for the first time with a feature called “explicit multi-adapter.” Explicit multi-adapter empowers game developers with precise control over the workloads of their engine, and direct control over the resources offered by each GPU in a system. How can that be used in games? Let’s take a look at a few of the options.

 

SPLIT-FRAME RENDERING

New DirectX® 12 multi-GPU rendering modes like “split-frame rendering” (SFR) can break each frame of a game into multiple smaller tiles, and assign one tile to each GPU in the system. These tiles are rendered in parallel by the GPUs and combined into a completed scene for the user. Parallel use of GPUs reduces render latency to improve FPS and VR responsiveness.
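
As a rough sketch of the tiling step, the Python below divides a frame into one horizontal strip per GPU. Using horizontal strips of near-equal height is an assumption for simplicity; a real engine could choose any tile layout and would likely balance tiles by rendering cost rather than by area.

```python
def split_frame(width, height, num_gpus):
    """Return one (x, y, width, height) horizontal strip per GPU."""
    base, extra = divmod(height, num_gpus)
    tiles, y = [], 0
    for gpu in range(num_gpus):
        # Spread any leftover rows across the first few GPUs.
        strip_height = base + (1 if gpu < extra else 0)
        tiles.append((0, y, width, strip_height))
        y += strip_height
    return tiles


tiles = split_frame(1920, 1080, 2)
print(tiles)  # [(0, 0, 1920, 540), (0, 540, 1920, 540)]
```

Because both strips belong to the same frame, the GPUs work on what the user is about to see at the same moment instead of queuing future frames.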

 

sfr.png

Some have described SFR as “two GPUs behaving like one much more powerful GPU.” That’s pretty exciting!

 

Trivia: The benefits of SFR have already been explored and documented with AMD’s Mantle in Firaxis Games’ Sid Meier’s Civilization®: Beyond Earth™.

 

ASYMMETRIC MULTI-GPU

DirectX® 12 offers native support for asymmetric multi-GPU, which we touched on in the “how AFR works” section. One example: a PC with an AMD APU and a high-performance discrete AMD Radeon™ GPU. This is not dissimilar from AMD Radeon™ Dual Graphics technology, but on an even more versatile scale!2

 

With asymmetric rendering in DirectX® 12, an engine can assign appropriately-sized workloads to each GPU in a system. Whereas an APU’s graphics chip might be idle in a DirectX® 11 game after the addition of a discrete GPU, that graphics silicon can now be used as a 3D co-processor responsible for smaller rendering tasks like physics or lighting. The larger GPU can handle the heavy lifting tasks like 3D geometry, and the entire scene can be composited for the user at higher overall performance.
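
One simple way to picture "appropriately-sized workloads" is a greedy scheduler that hands each task to whichever GPU would finish it soonest. Everything in this sketch is hypothetical: the task costs, the relative GPU speeds and the greedy strategy itself are stand-ins, not a description of how any particular engine divides its work.

```python
def assign_work(tasks, gpus):
    """Greedily assign tasks (heaviest first) to the GPU that finishes soonest.

    tasks: {task name: cost in arbitrary work units}
    gpus:  {gpu name: throughput in work units per millisecond}
    """
    finish = {name: 0.0 for name in gpus}
    plan = {name: [] for name in gpus}
    for task, cost in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        best = min(gpus, key=lambda g: finish[g] + cost / gpus[g])
        finish[best] += cost / gpus[best]
        plan[best].append(task)
    # The frame is done when the slowest GPU finishes its share.
    return plan, max(finish.values())


tasks = {"geometry": 60, "lighting": 15, "physics": 10}  # made-up costs
gpus = {"discrete": 8.0, "apu": 2.0}                     # made-up speeds

plan, frame_time = assign_work(tasks, gpus)
print(plan)        # {'discrete': ['geometry', 'physics'], 'apu': ['lighting']}
print(frame_time)  # 8.75
```

With these numbers the discrete GPU alone would need 85 / 8 = 10.625 ms, so putting the APU to work on the lighting pass shortens the frame to 8.75 ms.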

 

asymm.png

4+4=8?

In the world of DirectX® 9 and 11, gamers are accustomed to a dual-GPU system only offering one GPU’s worth of RAM. This, too, is a drawback of AFR, which requires that each GPU contain an identical copy of a game’s data set to ensure synchronization and prevent scene corruption.

 

But DirectX® 12 once again turns conventional wisdom on its head. It’s not an absolute requirement that AFR be used, so it’s not a requirement that each GPU maintain an identical copy of a game’s data. This opens the door to larger game workloads and data sets that are divisible across GPUs, allowing for multiple GPUs to combine their memory into a single larger pool. This could certainly improve the texture fidelity of future games!
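
The "4+4=8?" question comes down to whether data is mirrored or divided. The sketch below shows the two extremes; the pooled figure is an idealized upper bound, since an engine would likely still need to duplicate some resources on every GPU.

```python
def usable_memory_gb(per_gpu_gb, mirrored):
    """Memory available for unique game data across all GPUs.

    mirrored=True:  every GPU holds an identical copy (classic AFR).
    mirrored=False: data is divided across GPUs (idealized pooling).
    """
    return min(per_gpu_gb) if mirrored else sum(per_gpu_gb)


print(usable_memory_gb([4, 4], mirrored=True))   # 4
print(usable_memory_gb([4, 4], mirrored=False))  # 8
```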

 

mem.png

 

WRAP-UP

A little realism is important, and it’s worth pointing out that developers must choose to adopt these features for their next-generation PC games. Not every feature will be used simultaneously, or immediately in the lifetime of DirectX® 12. Certainly DirectX® 11 still has a long life ahead of it with developers that don’t need or want the supreme control of 12.

 

Even with these things in mind, I’m excited about the future of PC gaming because developers already have expressed interest in explicit multi-adapter’s benefits—that’s why the feature made it into the API! So with time, demand from gamers, and a little help from AMD, we can make high-end PC gaming more powerful and versatile than ever before.

 

And that, my friends, is worth celebrating!

 

Robert Hallock is the Head of Global Technical Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

 


FOOTNOTES

1. AMD CrossFire™ technology requires an AMD CrossFire-ready motherboard, a compatible AMD CrossFire™ bridge interconnect (for each additional graphics card) and may require a specialized power supply.

2. AMD Radeon™ Dual Graphics requires one of select AMD A-Series APUs plus one of select AMD Radeon™ discrete graphics cards and is available on Windows® 7, 8, 8.1 and 10 OSs. Linux OS supports manual switching which requires restart of X-Server to engage and/or disengage the discrete graphics processor for dual graphics capabilities. With AMD Radeon™ Dual Graphics, full enablement of all discrete graphics video and display features may not be supported on all systems and may depend on the master device to which the display is connected. Check with your component or system manufacturer for specific mode capabilities and supported technologies.