
Confirmed: Navi to be Graphics Core Next (GCN) Based (WCCFTech)

Talk about some bad news... Now those rumors of a 300 W TDP for the top Navi card, despite being a 7nm part, do seem more credible.

https://wccftech.com/amds-navi-gpus-confirmed-to-retain-gcn-design/

14 Replies

Wow! Got nothing else to say.

ajlueke
Grandmaster

I've actually never heard anything to the contrary.  I was always under the impression that Navi was the final GCN GPU, followed by Arcturus, which would be a new architecture.

I believe I had seen speculation to that effect in the past as well. Not sure, however, that it was ever put in official specs. Either way, GCN has had some long legs!


I had heard it was to be a new architecture, since Vega was GCN based. The leapfrogging-design-teams idea was that while the Vega team worked on GCN, the Navi team would work on the new architecture and create the midrange cards; then, once Vega was finished, that team would move on to Navi's successor to create the high-end variants.

But who knows; with Raja out of the picture, the new guys may have found new life in GCN, akin to the VLIW5-to-VLIW4 change.

I think there is some opportunity there.  GCN was AMD's first compute architecture, designed to help AMD crack into the burgeoning markets of professional rendering cards, scientific compute cards, and machine learning and other AI applications.  It has succeeded in those markets, but gaming has always been a secondary consideration (or in the case of the Radeon VII, not a consideration at all), and AMD's performance in that arena has slowly eroded as a result.

Navi, then, will be an interesting product, as it was built with gaming in mind.  And GCN does perform very well within its power envelope.  If you look at the Fury Nano vs. the Fury X, they had vastly different power and heat requirements for only a 10% increase in performance.  Vega 64 vs. Vega 64 Liquid is a similar story.  The question will be where that sweet spot falls for Navi at 7nm, with the chip geared specifically towards gaming.  Like any GCN product, once you hit the wall, power use and thermals will skyrocket for minimal performance gains.

I don't think hitting RTX 2070/GTX 1080 levels is out of the question for Navi.  And if they price it in the $300 range and down, it'll be a win.


Something I just thought of today, AMD has confirmed Navi will have ray tracing support, so it seems to me that it'd be much easier to design a new GPU architecture than to attempt to splice on ray tracing shaders, since I don't believe AMD would make Navi a multi-chip design with a ray tracing coprocessor.


Do you have a link to the confirmation of Navi on PC with ray tracing? I have seen articles stating that the PC products would not get it in 2019 but that the console chips will. That would be good news if they do.


Not for Navi 10, which lends credence to the idea that Navi will be GCN based. Articles do say Navi 20 will have ray tracing, but it makes little sense, after all this development time, to restrict ray tracing to yet another GCN-based Navi product if Navi 10 will have RTX 2070-level power. It would also be a daft business decision not to have ray tracing: if Navi 10 does have RTX 2070 levels of performance, nVidia will certainly slash prices to roughly equal levels, and with ray tracing ability their cards would have higher appeal for futureproofing.

There aren't really different shaders for ray tracing vs. not.  NVidia simply uses their RTX drivers to route DXR calls from DX12 over the RT cores.  That way, none of the work is being done by the SMs.  NVidia needs to do it this way because they make purpose-specific dies.  Their gaming GPUs are loaded with FP32 units and some mixed-precision FP32/FP16 via the tensor cores.  The professional units have a larger set of double-precision cores for scientific applications.

AMD doesn't have the resources to generate different GPU dies, so effectively all the GPUs are the same, regardless of whether it is a gaming card or a pro card.  That is why Vega, for example, isn't as efficient as Pascal or Turing.  There is a large amount of die space taken up by FP64 units no game will ever use, but the GPU still has to drive them.  You have wasted space that adds to your power budget and contributes nothing to performance.  But it does save you money by only producing a single die.

And that is where the benefit comes in for ray tracing.  You could create drivers to route DXR ray-tracing requests over the unused double-precision units already present on your GPU die.  Those would then effectively become the ray-tracing co-processor.  It probably wouldn't be quite as effective as the NVidia solution, as running all the ray-tracing vector math at double precision is unnecessary and wastes clock cycles.

By keeping the hardware activation in the driver, tied to a DXR request, you could avoid having full-rate FP64 performance enabled on your consumer cards and cannibalizing your Instinct sales.
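
To make that concrete, here is a minimal CUDA sketch (my own illustration, not AMD driver code or any shipping DXR path): the same ray-sphere hit test written once for the single-precision units and once for the double-precision units. The vector math DXR needs runs fine on either; the FP64 version just spends more cycles for precision no image will ever show.

```cuda
// Hypothetical illustration: identical ray-sphere intersection at FP32 and FP64.
// Assumes the ray direction is normalized so the quadratic's 'a' term is 1.
#include <cstdio>
#include <cuda_runtime.h>

__device__ bool hit_sphere_f32(float3 o, float3 d, float3 c, float r, float* t)
{
    float3 oc  = make_float3(o.x - c.x, o.y - c.y, o.z - c.z);
    float b    = oc.x * d.x + oc.y * d.y + oc.z * d.z;
    float cc   = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r;
    float disc = b * b - cc;
    if (disc < 0.0f) return false;
    *t = -b - sqrtf(disc);          // nearest hit along the ray
    return *t > 0.0f;
}

__device__ bool hit_sphere_f64(double3 o, double3 d, double3 c, double r, double* t)
{
    // Same math routed over the double-precision units: the same answer for
    // rendering purposes, just more cycles per instruction.
    double3 oc  = make_double3(o.x - c.x, o.y - c.y, o.z - c.z);
    double b    = oc.x * d.x + oc.y * d.y + oc.z * d.z;
    double cc   = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r;
    double disc = b * b - cc;
    if (disc < 0.0) return false;
    *t = -b - sqrt(disc);
    return *t > 0.0;
}

__global__ void compare_precisions(int* agree)
{
    float tf; double td;
    bool hf = hit_sphere_f32(make_float3(0, 0, 0), make_float3(0, 0, 1),
                             make_float3(0, 0, 5), 1.0f, &tf);
    bool hd = hit_sphere_f64(make_double3(0, 0, 0), make_double3(0, 0, 1),
                             make_double3(0, 0, 5), 1.0, &td);
    *agree = (hf == hd);            // both report a hit at t ~= 4
}

int main()
{
    int *d_agree, agree = 0;
    cudaMalloc(&d_agree, sizeof(int));
    compare_precisions<<<1, 1>>>(d_agree);
    cudaMemcpy(&agree, d_agree, sizeof(int), cudaMemcpyDeviceToHost);
    printf("FP32 and FP64 hit tests agree: %d\n", agree);
    cudaFree(d_agree);
    return 0;
}
```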

Thanks for that explanation. I had wondered and asked that same question back when the RTX series came out: why couldn't you just have the drivers use the unused compute power the AMD cards already have? Sounds like you could!


You certainly could.  But generalized compute cores won't be as efficient at that task as the RT cores.  The problem with dedicated hardware is always scalability.  How many RT cores do you add to a GPU?  If a game doesn't use any DXR extensions they are dead weight, and you would be better served just putting more general-purpose cores on the die.

That was the same issue 10 years ago with pixel and vertex shaders.  Both are faster at their respective tasks than general-purpose shaders, but you could hit bottlenecks depending on the software being used.  Some games were more pixel intensive, some more vertex intensive.  So it was better overall to move to generalized shaders to avoid bottlenecks, despite the fact that the shaders were less efficient at the specific tasks.

The same will likely be true for DXR.  If DXR sees implementation in some engines, then the RT cores will be great.  But they will increasingly weigh down games that don't support them.  In a few generations there will likely be enough generalized compute resources to run DXR without specialized hardware.

But the real problem AMD has isn't the RT cores vs. generalized compute, it's the tensor cores.  NVidia uses the tensor cores (FP32 units that can also run two FP16 instructions) to denoise their ray-traced image.  Essentially, NVidia doesn't have the resources to run hundreds of rays in real time.  So they run only a handful, and use algorithm-based denoising to make the image usable.  The denoising algorithm runs over the mixed-precision tensor cores at half precision (FP16); that's what makes this whole thing work.  FP16 instructions can be executed fast enough to denoise the image in real time.

So the key to making this work is less about computing a few rays and more about using algorithm-based cleanup to make the image that results from just a few rays usable.  AMD can replicate this, as they also have FP16 instruction support embedded in their FP32 units, in the form of "Rapid Packed Math".  However, RPM is only supported by Vega, so a denoising approach similar to NVidia's would only work on Vega.  And really, that leaves a lot of AMD users out in the cold.
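
As a toy sketch of what that half-rate path looks like (my own example, not NVidia's or AMD's actual denoiser), here is a temporal-accumulation pass that blends a noisy frame into a history buffer using packed FP16, so every intrinsic handles two pixels per instruction, which is exactly the two-for-one throughput Rapid Packed Math exposes:

```cuda
// Toy denoising step (assumed example): blend a noisy frame into an
// accumulated history buffer using packed FP16, two pixels per instruction.
// Needs a GPU with native FP16 math; build with e.g. nvcc -arch=sm_60.
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

__global__ void temporal_accumulate_fp16(const __half2* noisy,
                                         const __half2* history,
                                         __half2* out, int n_pairs)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_pairs) return;

    // out = 0.2 * noisy + 0.8 * history; every intrinsic below operates on a
    // pair of FP16 pixels at once, which is where the 2x throughput comes from.
    const __half2 w_new = __float2half2_rn(0.2f);
    const __half2 w_old = __float2half2_rn(0.8f);
    out[i] = __hfma2(w_new, noisy[i], __hmul2(w_old, history[i]));
}

int main()
{
    const int n_pairs = 1 << 20;                     // 2M pixels, packed in pairs
    const size_t bytes = n_pairs * sizeof(__half2);
    __half2 *d_noisy, *d_history, *d_out;
    cudaMalloc(&d_noisy, bytes);
    cudaMalloc(&d_history, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_noisy, 0, bytes);
    cudaMemset(d_history, 0, bytes);

    temporal_accumulate_fp16<<<(n_pairs + 255) / 256, 256>>>(d_noisy, d_history,
                                                             d_out, n_pairs);
    cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(d_noisy); cudaFree(d_history); cudaFree(d_out);
    return 0;
}
```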

Now Navi will likely sport the same architectural improvements as Vega, and with it being present in Sony's and Microsoft's latest consoles, it may be worthwhile for AMD to develop a software solution to handle DXR.  Developers would have support on the new consoles and on two AMD GPU generations, while RTX would continue to work as well.


Great info! You sure understand this stuff in a way I never will. I remember reading, when the RTX cards came out, speculation that they made dedicated RT and tensor cores so as not to cannibalize their own compute card market.


That could be why the ray tracing demo Crytek ran on Vega 56 performed so well. The way I picture ray tracing right now is akin to nVidia's PhysX. When they acquired the technology from Ageia, it was coded in x87. As the x87 path ran like garbage on CPUs and PhysX was proprietary to nVidia, they had no reason to recode it, so running PhysX well on AMD and Intel hardware was effectively impossible. The same thing may be happening with ray tracing: it's still an evolving technology, so as it stands it's insanely inefficient. Although we are still seeing noticeable improvements with new drivers and game patches, nVidia has zero interest in pushing ray tracing to be efficient, as they want to make boatloads from overpriced RTX cards, but game developers have an extreme interest in getting ray tracing to work as efficiently as possible, since midrange and mainstream cards dominate the market.

https://www.realworldtech.com/physx87/

It is definitely evolving.  Right now, the NVidia hardware approach is built around Microsoft's DXR.  But who's to say that DXR is the best way to do ray tracing?  I'm sure Crytek's in-engine approach utilized denoising just like NVidia does for DXR on their RTX GPUs.  That works well on Vega because, as noted before, Vega's FP32 units can also execute two FP16 instructions.  Vega still doesn't have anywhere near the compute power to cast hundreds of rays in real time, so casting a few over Vega's generalized compute engine and denoising the image is probably the approach here as well.

This is probably the approach AMD will use with Navi, and why they stated they wanted to wait to roll out ray tracing until it was feasible for the entire product stack.  Navi will almost certainly support Rapid Packed Math and replace Polaris in the low to mid tier.  Then AMD can implement ray tracing across the entire product stack.  But we may see bigger gains on the software side of things with Navi that will definitely affect ray tracing in the future.

Developers are used to writing all their code at full precision, even where the image quality doesn't improve.  That is largely because vertex shaders in the old days required full precision for their computations.  When generalized shaders became a thing, the shaders were all FP32 so they could run both pixel and vertex commands.  Now, with mixed-precision shaders, a shader can run two half-precision instructions or one full-precision instruction.  If you have computations in your game that don't require full precision, you can run them at half and they execute twice as fast.  Developers do have to code instructions specifically as FP16 so the hardware executes them that way, and right now they have no reason to do that because hardware with mixed-precision shaders is extremely limited.
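
To illustrate that opt-in (a hypothetical snippet, not taken from any real engine), here is the same light-falloff computation written once in plain FP32 and once with explicit half types; the compiler won't demote the first for you, and only the second can take advantage of double-rate FP16 hardware:

```cuda
// Hypothetical example of the FP16 opt-in: identical math at full and half
// precision. Only the explicitly typed half version can benefit from
// double-rate FP16 hardware; FP32-coded shaders stay FP32.
// Requires a half-capable GPU; build with e.g. nvcc -arch=sm_60.
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

__device__ float attenuation_fp32(float dist)
{
    // 1 / (1 + 0.1*d + 0.01*d^2), a classic point-light falloff shape
    return 1.0f / (1.0f + 0.1f * dist + 0.01f * dist * dist);
}

__device__ __half attenuation_fp16(__half dist)
{
    // Same expression, but every operation is a half-precision instruction.
    const __half one = __float2half(1.0f);
    const __half k1  = __float2half(0.1f);
    const __half k2  = __float2half(0.01f);
    __half quad = __hfma(k2, __hmul(dist, dist), __hmul(k1, dist));
    return __hdiv(one, __hadd(one, quad));
}

__global__ void eval(float* out_f32, __half* out_f16, float d)
{
    *out_f32 = attenuation_fp32(d);
    *out_f16 = attenuation_fp16(__float2half(d));
}

int main()
{
    float *d_f; __half *d_h;
    cudaMalloc(&d_f, sizeof(float));
    cudaMalloc(&d_h, sizeof(__half));
    eval<<<1, 1>>>(d_f, d_h, 10.0f);

    float f; __half h;
    cudaMemcpy(&f, d_f, sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(&h, d_h, sizeof(__half), cudaMemcpyDeviceToHost);
    printf("fp32 = %f, fp16 = %f\n", f, __half2float(h));
    cudaFree(d_f); cudaFree(d_h);
    return 0;
}
```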

Navi can change all that.  If it is in both new consoles, developers can utilize FP16 where it makes sense and see huge gains in fps.  That is always something developers are looking to do on consoles, so they will likely bake it into their engines.  That will benefit anyone on PC with a Vega or later GPU as well.  NVidia has a harder time there, because the only cores that support mixed precision are the tensor cores on the RTX series.  Older NVidia generations will have to run all instructions at FP32 and their performance will lag, while the RTX series could update drivers to route those calculations over the tensor cores.  But with a limited number of tensor cores, the more you use in engine, the fewer are available to denoise ray-traced images.  If mixed-precision calculations in engine become popular, NVidia will release a new generation of GPUs where the majority, if not all, of the CUDA cores are mixed precision.  That will run great, and probably sell for a boatload, but it will again leave buyers of their previous hardware in the dust.