News AMD Big Navi and RDNA 2 GPUs: Release Date, Specs, Everything We Know

Unless my math is wrong: 80 Compute Units * 96 Raster Operations * 1600 MHz clock = 12.28 TFLOPS of single-precision floating point (FP32).

Not bad AMD. Not bad. Let's see what that translates to in the real world, though with the advances of DX12 and now Vulkan being implemented, I expect AMD to be on a more level playing field with high-end Nvidia. I might be inclined to head back to team Red, especially if the price is right.
 
Your math is wrong. 🙂

FLOPS is simply FP operations per second. It's calculated as a "best-case" figure, so FMA instructions (fused multiply add) count as two operations, and each GPU core in AMD and Nvidia GPUs can do one FMA per clock (peak theoretical performance). So FLOPS ends up being:
GPU cores * 2 * clock

For the tables:
80 CUs * 64 cores/CU * 2 * clock (1600 MHz) = 16,384 GFLOPS.
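To sanity-check, here's that formula as a quick Python sketch (the clock and core counts are the rumored/reference figures quoted in this thread, so treat the outputs as theoretical peaks, not measurements):

```python
def peak_gflops(cus, cores_per_cu, clock_mhz, ops_per_clock=2):
    """Peak theoretical FP32 throughput: cores * 2 ops (one FMA/clock) * clock."""
    return cus * cores_per_cu * ops_per_clock * clock_mhz / 1000  # MHz -> GFLOPS

print(peak_gflops(80, 64, 1600))  # 16384.0 -- the rumored 80-CU Navi part
print(peak_gflops(64, 64, 1546))  # 12664.832 -- RX Vega 64 at its 1546 MHz boost
```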

ROPs and TMUs and some other functional elements of GPUs might do work that sort of looks like an FP operation, but they're not programmable or accessible in the same way as the GPU cores, and so any instructions run on the ROPs or TMUs generally aren't counted as part of the FP32 performance.
 

Ah yes, I knew I was forgetting Texture Mapping Units. Thank you for the correction. I am assuming you meant 16.3 TFLOPS vice GigaFLOPS. I knew what you were trying to convey. Either way, that's some pretty impressive theoretical compute performance. Excited to see how that translates to real-world performance versus some pointless synthetic benchmark.
 
Speaking of FLOPS, we also should note that AMD gutted most of what was left of GCN, especially the parts that helped compute. I fully expect the same number of FLOPS from this architecture to translate into more FPS, since they are no longer making a general gaming-and-compute GPU but a dedicated gaming GPU.
 
Well, 16384 GFLOPS is the same as 16.384 TFLOPS if you want to do it that way. I prefer the slightly higher precision of GFLOPS instead of rounding to the nearest 0.1 TFLOPS, but it would be 16.4 TFLOPS if you want to go that route.
 
I'm not sure that's completely accurate. If you are writing highly optimized compute code (not gaming or general code), you should be able to get relatively close to the theoretical compute performance. Or at least, both GCN and Navi should end up with a relatively similar percentage of the theoretical compute. Which means:

RX 5700 XT = 9,654 GFLOPS
RX Vega 64 = 12,665 GFLOPS
Radeon VII = 13,824 GFLOPS

For gaming code that uses a more general approach, the new dual-CU workgroup processor design and the change from 1 SIMD16 (4-cycle latency) to 2 SIMD32 (1-cycle latency) clearly help, as RX 5700 XT easily outperforms Vega 64 in every test I've seen. But with the right computational workload, Vega 64 should still be up to 30% faster than the 5700 XT. Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games.
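For what it's worth, those percentages fall straight out of the theoretical GFLOPS figures listed above (real workloads will land lower, but the ratios should hold for well-optimized compute):

```python
# Theoretical FP32 throughput in GFLOPS, from the figures quoted above
navi21 = 80 * 64 * 2 * 1600 / 1000  # rumored Navi 21: 16384
vega64 = 12665                      # RX Vega 64
navi10 = 9654                       # RX 5700 XT

print(f"Navi 21 vs Vega 64:  +{(navi21 / vega64 - 1) * 100:.1f}%")  # +29.4%
print(f"Navi 21 vs 5700 XT: +{(navi21 / navi10 - 1) * 100:.1f}%")   # +69.7%
```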
 


"Navi 21 with 80 CUs meanwhile would be at least 30% faster than Vega 64 in pure compute, and probably a lot more than that in games. "

That was my point ^. We will see more FPS in games than the FLOPS numbers are telling us. It's not a case of "FLOPS are 30% higher, so we can expect that much more gaming performance"; it won't be linear this go-around.
 
I agree with that part, though it wasn't clear from your original post that you were saying that. Specifically, the bit about "AMD gutted most of GCN that was left especially the parts that helped compute" isn't really accurate. AMD didn't "gut" anything -- it added hardware and reorganized things to make better use of the hardware. And ultimately, that leads to better performance in nearly all workloads.

Interesting thought:
If AMD really does an 80 CU Navi 2x part, at close to the specs I listed, performance should be roughly 60% higher than RX 5700 XT. Considering the RTX 2080 Ti is only about 30% faster than RX 5700 XT, that would actually be a monstrously powerful GPU. I suspect it will be a datacenter part first, if it exists, and maybe AMD will finally get a chance to make a Titan killer. Except Nvidia can probably get a 40-50% boost to performance over Turing by moving to 7nm and adding more cores, so I guess we wait and see.
 
Looking at the numbers, AMD could get an RX 5700 XT-performance part into a 150W envelope if their performance-per-watt numbers can be believed. Having a 1440p GPU in the power envelope of a GTX 1660 would be a killer product.
 
Yeah, I honestly think that's about where Navi 2x will bottom out. I mean, maybe something like RX 5600 XT with lower power plus ray tracing, but I don't think AMD will push Navi 2x much lower than that -- not with the ray tracing stuff enabled. Because Navi 1x is already in that space.

OTOH, I'm curious to see how much actually changes outside of ray tracing support. A 50% improvement in performance per watt on the same lithography is huge. It also suggests AMD missed some low hanging fruit with Navi 1x. I'm not sure I can come up with a single example in recent years of an architecture alone improving PPW by 50%. Pascal did it, but it was also a 28nm -> 16nm shrink. Maxwell maybe? GTX 970 was about as fast as GTX 780 Ti at launch IIRC, and used 37% less power. Maxwell also ditched a lot of compute stuff to improve gaming performance, which then got walked back in various ways in Pascal and Turing.

One concern is if AMD includes ray tracing performance as part of the improved performance per watt, though. I'd hate to see it end up as something like: "AMD tested 20 games and measured performance and power use. With our internal DirectX Raytracing enabled drivers running on RX 5700 XT, we're seeing Navi 2x overall performance improve by 50%. Five of the games tested support DXR." I really hope that the DXR support is 'extra' and not part of the PPW improvements!
 
High-end current-gen Nvidia - which will be replaced with next-gen Nvidia about the same time that "Big Navi" is released... Always shooting for 3rd place...

I guess reading slides is too hard - on the new consoles, the full-scene ray tracing will be done in the cloud, not on the hardware... Would imagine "Big Navi" will do much of the same.

Take the "leak" and divide by half - and that's what you will actually get.
 
In a world where Nvidia isn't spending money on R&D and isn't releasing their next gen, then yes, this should beat Nvidia...

3 problems with that: Nvidia is spending money on R&D and IS releasing their next gen (about the same time as this vaporware "Big Navi" is supposed to be released), and all that is known about "Big Navi" is what AMD has told you - so if past is prologue, take the performance and divide by half... and that's what you will actually get - most likely a match for the 2080 Ti but, once again, aiming for where the competitors WERE and not where they are going to BE.
 
Too bad the night janitor is out with COVID-19 and is unable to make improvements to his driver code. A company that CANNOT get drivers right can have the bestest most excellent thing ever - but if you can't make use of it due to the poor drivers, then it's worthless regardless of the price point. Not to mention the new Nvidia coming around the same time.

I am sure that this card will be of the same excellent quality that AMD is known for - and will perform at or above the levels presented in the marketing release - since they ALWAYS are under-promising and over-delivering... I mean, with both Nvidia and Intel being destroyed, AMD will have yet another victory to add to the 50 years of unbridled success and nothing but winning... /s
 
That's because AMD dragged their feet with this. They should have launched Navi much sooner,

and never released the RX 590 and Radeon VII GPUs.

The RX 5700 XT and RX 5700 were cards meant to combat the RTX 2060 and RTX 2070.

AMD still hasn't offered anything to compete with the RTX 2080, now RTX 2080 Super, and RTX 2080 Ti.
 
I prefer the slightly higher precision of GFLOPS instead of rounding to the nearest 0.1 TFLOPS
There is no point in worrying about a rounding error that falls within die-to-die boost behavior variance, especially when that rounding error is on a purely theoretical absolute best case that will never come close to being achieved in anything resembling a real workload.
 
The problem isn't so much with chips that run at 10+ TFLOPS, but with the stuff that's down in the low to mid single digits. Anyway, TFLOPS and GFLOPS are interchangeable, as long as you put them in the right units for comparison. Plus, the math is very easy to see/explain with GFLOPS (16384 = 5120 * 2 * 1600) and less so with TFLOPS (16.4 = 5120 * 2 * 1.6). Potato, potahto.
 
That would have been lovely, except part of getting ready for Navi was probably the work on Vega 20. Certainly Navi 2x was nowhere near ready to go last year. AMD likely chose to limit die size (Navi 1x) to get a better handle on the new architecture and 7nm. Maybe Vega 20 helped them realize making larger chips at the time was going to be difficult. But really, Polaris 30 and Vega 20 were both stopgap solutions just to pass time while finishing up Navi 1x.
 
Unless the console prices are going way above what everyone thinks they'll go ($500/550 for PS5 and $600/650 for XSX), they'll literally be equivalent to a mid/high-tier PC, with their GPUs basically equivalent to $300-500 variants of current-day GPUs and more powerful than most PCs at this point.

Should be interesting how this plays out in terms of pricing, since if this is the case, the consoles will be A LOT more price efficient than PCs versus just a bit more price efficient.
 
Although so many people hope for AMD to beat Nvidia's 3080 Ti with Big Navi, it is probably best for us all if they stay slightly below that in performance terms and drive the price down. As for them dragging their feet, maybe, but they did need the time to prepare the console chips, and it seems they stumbled upon a eureka moment with perf per watt in the process, so it might turn out just right. Very nice article Jarred, thank you. Although material about Nvidia's next gen is scarce, I would still very much like to read about it.
 
I'm working on that as well. Details are perhaps even more limited than what we've heard about Big Navi, but rumors abound!
 