News Doom Eternal Runs Faster on Old AMD GPUs Than Comparable Nvidia GPUs

Nvidia really depends on per-game driver optimizations. It sets all the ratios between components to squeeze every last drop out of the GPU.
AMD has general profiles that are X:Y ratios, done.
Nvidia has per-game profiles (the ones you remember from driver updates saying "Tomb Raider supported") that set those values to 0.89:1.2 because that game needs more Y and less X.
The AMD GPU was a similar performer, but with Nvidia's optimizations, Nvidia was ahead by that ~20% margin.
But they cannot support all the games for all the GPUs, so the old cards now lack the optimization table.
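To picture the difference being described here, a purely hypothetical sketch in Python (the profile names, fields, and numbers are invented; real driver profiles are proprietary): AMD ships one generic ratio, while Nvidia ships per-title overrides that older cards eventually stop receiving.

```python
# Hypothetical per-game tuning table, in the spirit of the "0.89:1.2"
# example above; all names and values are invented for illustration.
GAME_PROFILES = {
    "tomb_raider":  {"x_to_y_ratio": 0.89 / 1.2},  # this title wants more Y, less X
    "doom_eternal": {"x_to_y_ratio": 1.1},
}
GENERIC_PROFILE = {"x_to_y_ratio": 1.0}  # the one-size-fits-all, AMD-style profile

def profile_for(game: str, still_supported: bool) -> dict:
    """Cards dropped from per-game support fall back to the generic profile."""
    if still_supported and game in GAME_PROFILES:
        return GAME_PROFILES[game]
    return GENERIC_PROFILE

print(profile_for("tomb_raider", still_supported=True))   # tuned ratio
print(profile_for("tomb_raider", still_supported=False))  # old card: generic fallback
```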
 

gg83

Distinguished
Nvidia really depends on per-game driver optimizations. It sets all the ratios between components to squeeze every last drop out of the GPU.
AMD has general profiles that are X:Y ratios, done.
Nvidia has per-game profiles (the ones you remember from driver updates saying "Tomb Raider supported") that set those values to 0.89:1.2 because that game needs more Y and less X.
The AMD GPU was a similar performer, but with Nvidia's optimizations, Nvidia was ahead by that ~20% margin.
But they cannot support all the games for all the GPUs, so the old cards now lack the optimization table.
So does AMD save time and money this way? Or is it hard to tell?
 

salgado18

Distinguished
Nvidia really depends on per-game driver optimizations. It sets all the ratios between components to squeeze every last drop out of the GPU.
AMD has general profiles that are X:Y ratios, done.
Nvidia has per-game profiles (the ones you remember from driver updates saying "Tomb Raider supported") that set those values to 0.89:1.2 because that game needs more Y and less X.
The AMD GPU was a similar performer, but with Nvidia's optimizations, Nvidia was ahead by that ~20% margin.
But they cannot support all the games for all the GPUs, so the old cards now lack the optimization table.
What do you mean by X:Y ratio?
 

jgraham11

Distinguished
Fine Wine technology! AMD has been a great, forward-looking company. Notice that more of their technology is released as open source, meaning others can build on it and make it better: Vulkan, the new adaptive sharpening filters, adaptive sync, etc.

I've always appreciated AMD's tendency to put better memory on their cards than Nvidia does, and they've done it for years!
 
"We would say there's no excuse for Nvidia to leave its older cards out given that they're a much bigger company than AMD. " There is a term for this, planned obsolescence. NVidia forces you to replace your GPU far more often since they stop optimizing games for older tech. This is a common practice for the biggest companies in a sector, see Apple.
 

larkspur

Distinguished
This is a quote from Tom's performance review of Doom: Eternal:
Doom Eternal needs 2942 MiB of VRAM to do 1080p with its low preset. It will still run on a 2GB card like the GTX 1050, even though it doesn't have the requisite 3GB of VRAM, but you can't even try bumping most settings higher. If you want to run at 1080p medium, you'll need a card with 4GB or more VRAM (3502 MiB to be precise), while 1080p high also comes in just under the 4GB barrier at 4078 MiB. Ultra needs 5230 MiB at 1080p, 5437 MiB at 1440p and 6025 MiB at 4K—so at least 6GB of VRAM. Nightmare pushes just beyond 6GB, to 6254 MiB, and ultra nightmare needs 6766 MiB at 1080p—an 8GB GPU will suffice in either case, at resolutions up to 4K.

Since Doom: Eternal has proven to need plenty of VRAM, I would think the old GTX 780 with its paltry 3GB is handicapped. The R9 290 has 4GB and is clearly performing better in this game. Driver optimizations play a role, but less than 4GB of VRAM in today's demanding games can really be a handicap.
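To make those preset numbers easier to compare, here's a minimal sketch in Python (the per-preset figures are taken from the quoted review; the card list is just illustrative):

```python
# VRAM needed at 1080p per preset, in MiB (figures from the quoted Doom Eternal review)
PRESET_VRAM_MIB = {
    "low": 2942,
    "medium": 3502,
    "high": 4078,
    "ultra": 5230,
    "nightmare": 6254,
    "ultra nightmare": 6766,
}

def fits(card_vram_gib: float, preset: str) -> bool:
    """True if the card's VRAM covers the preset's requirement at 1080p."""
    return card_vram_gib * 1024 >= PRESET_VRAM_MIB[preset]

# Illustrative cards from this thread
for card, vram in [("GTX 780", 3), ("R9 290", 4), ("GTX 1060", 6)]:
    ok = [p for p in PRESET_VRAM_MIB if fits(vram, p)]
    print(f"{card} ({vram} GB): presets that fit -> {ok}")
```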
 
This is a quote from Tom's performance review of Doom: Eternal:

Since Doom: Eternal has proven to need plenty of VRAM, I would think the old GTX 780 with its paltry 3GB is handicapped. The R9 290 has 4GB and is clearly performing better in this game. Driver optimizations play a role, but less than 4GB of VRAM in today's demanding games can really be a handicap.
Using Tom's own benchmarks, at 1080p low the lowly 2GB 1050 pulls 54.6 FPS average / 40.8 FPS 99th percentile. That is already higher than the 780, which according to this article: "Whereas the GTX 780 is able to put down an average of 45 FPS, the R9 290 pushes out 116 FPS on average." The optimization factor is still visible in Tom's own review between the R9 390 and the GTX 1060 6GB. In most games at 1080p the 1060 6GB outperforms the 390 (https://www.anandtech.com/bench/product/2303?vs=2301); however, there are some where the 390 is tied or a little ahead. We see in the GPU shootout that the 390 and 1060 are tied at 1080p (except ultra, for some reason), but at 1440p and 2160p the 390 is 10+% faster.
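For scale, the quoted averages work out as follows (a quick sketch; the FPS figures are the ones cited above):

```python
# Average FPS figures cited above (1080p low for the 1050; article figures for the others)
fps = {"GTX 1050 2GB": 54.6, "GTX 780 3GB": 45.0, "R9 290 4GB": 116.0}

baseline = fps["GTX 780 3GB"]
for card, avg in fps.items():
    print(f"{card}: {avg / baseline:.2f}x the GTX 780")
# The R9 290 lands at ~2.6x the 780, a gap too large for VRAM capacity alone to explain
```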
 

King_V

Illustrious
Ambassador
This is a quote from Tom's performance review of Doom: Eternal:

Since Doom: Eternal has proven to need plenty of VRAM, I would think the old GTX 780 with its paltry 3GB is handicapped. The R9 290 has 4GB and is clearly performing better in this game. Driver optimizations play a role, but less than 4GB of VRAM in today's demanding games can really be a handicap.

True - but is the 3GB handicap so bad that it would be the primary cause of bringing the framerates to LESS than half of what the R9 290 manages?
 

larkspur

Distinguished
True - but is the 3GB handicap so bad that it would be the primary cause of bringing the framerates to LESS than half of what the R9 290 manages?
Without a doubt, driver optimizations are a major factor. That is how they actually get the game to work decently on the 1050 2GB. I just wanted to point out that Nvidia has traditionally released cards with less VRAM than their AMD competition, and this also causes them to reach obsolescence at an earlier date. The fact that various iterations of GCN have been used since the HD 7xxx days all the way through Vega also helps AMD provide longer driver support... we're talking 8 years of GCN iterations!
 
Without a doubt, driver optimizations are a major factor. That is how they actually get the game to work decently on the 1050 2GB. I just wanted to point out that Nvidia has traditionally released cards with less VRAM than their AMD competition, and this also causes them to reach obsolescence at an earlier date. The fact that various iterations of GCN have been used since the HD 7xxx days all the way through Vega also helps AMD provide longer driver support... we're talking 8 years of GCN iterations!
8 years of GCN is nothing. CUDA was first used by nVidia in 2007 with the G80 GPUs and is still used by all GeForce, Tesla, and Quadro GPUs.
 

TJ Hooker

Titan
Ambassador
This is a quote from Tom's performance review of Doom: Eternal:

Since Doom: Eternal has proven to need plenty of VRAM, I would think the old GTX 780 with its paltry 3GB is handicapped. The R9 290 has 4GB and is clearly performing better in this game. Driver optimizations play a role, but less than 4GB of VRAM in today's demanding games can really be a handicap.
Even old HD 7000 cards with 2 GB of VRAM are outperforming the 780 (Ti) in Doom Eternal. Doesn't look like VRAM is the issue here.
 

salgado18

Distinguished
8 years of GCN is nothing. CUDA was first used by nVidia in 2007 with the G80 GPUs and is still used by all GeForce, Tesla, and Quadro GPUs.
8 years of GCN paying off. I know if I buy an RX 5700 XT, for example, I'm good for a very long time. I know it's not GCN, but it's AMD. ;)

What's the point of Nvidia's architecture being old, if old cards don't work well?
 
CUDA is an API. Doesn't really say anything about the underlying GPU architecture.
Nvidia also calls the cores in its GPUs CUDA cores. Starting in 2007 with the G80, they were called stream processors, and they were very different from previous generations of GPUs. Beginning with the GTX 580 at the latest, in 2010, Nvidia started calling them CUDA cores: "Fundamentally GF110 is the same architecture as GF100, especially when it comes to compute. 512 CUDA Cores are divided up among 4 GPCs" (https://www.anandtech.com/show/4008/nvidias-geforce-gtx-580/2). Essentially these are very similar to the stream processors of the G80 era. While yes, CUDA is an API, it is also what Nvidia has called its GPU cores since before GCN existed.
 

TJ Hooker

Titan
Ambassador
Nvidia also calls the cores in its GPUs CUDA cores. Starting in 2007 with the G80, they were called stream processors, and they were very different from previous generations of GPUs. Beginning with the GTX 580 at the latest, in 2010, Nvidia started calling them CUDA cores: "Fundamentally GF110 is the same architecture as GF100, especially when it comes to compute. 512 CUDA Cores are divided up among 4 GPCs" (https://www.anandtech.com/show/4008/nvidias-geforce-gtx-580/2). Essentially these are very similar to the stream processors of the G80 era. While yes, CUDA is an API, it is also what Nvidia has called its GPU cores since before GCN existed.
"CUDA core" is just a generic term Nvidia has used for their GPU cores since they moved to unified shaders in 2006. This is equivalent to AMD starting to call their cores "stream processors" in 2007. This doesn't mean that both companies have been using the same GPU micro-architecture for 13/14 years. Unless you think Terascale, GCN, and RDNA are all the same architecture just because they all use a unified shader architecture.
 
  • Like
Reactions: alextheblue
"CUDA core" is just a generic term Nvidia has used for their GPU cores since they moved to unified shaders in 2006. This is equivalent to AMD starting to call their cores "stream processors" in 2007. This doesn't mean that both companies have been using the same GPU micro-architecture for 13/14 years. Unless you think Terascale, GCN, and RDNA are all the same architecture just because they all use a unified shader architecture.
Terascale isn't even close to GCN or RDNA: Terascale is VLIW-based, whereas GCN and RDNA are SIMD-based. All of the Nvidia architectures since at least Fermi are as similar in design to one another as the members of the GCN family are.
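For anyone unfamiliar with the distinction, a rough conceptual sketch (illustrative Python, not how either GPU actually dispatches work): VLIW depends on the compiler packing independent operations into one wide instruction word, while SIMD runs one operation across many data lanes.

```python
# VLIW: the compiler must find independent ops to fill each slot of a
# wide instruction word; unfilled slots are wasted throughput.
vliw_instruction = [
    ("mul", "r0", "r1", "r2"),  # slot 0
    ("add", "r3", "r4", "r5"),  # slot 1
    ("nop",),                   # slot 2 - no independent op found
    ("nop",),                   # slot 3 - wasted
]

# SIMD: one instruction, many lanes; the hardware keeps lanes busy as
# long as there are enough threads (no compiler packing needed).
lanes_a = [1.0, 2.0, 3.0, 4.0]
lanes_b = [5.0, 6.0, 7.0, 8.0]
lanes_out = [a * b for a, b in zip(lanes_a, lanes_b)]  # one "mul" across all lanes
print(lanes_out)
```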
 
We would say there's no excuse for Nvidia to leave its older cards out given that they're a much bigger company than AMD.
Part of this might be that the GCN architecture was simply better suited for newer games than what Nvidia's cards were using at the time. We even see this with more recent cards, to some degree. It may be partly due to continued driver support, but the design of the hardware likely plays a role as well.

Nvidia may optimize their architecture for optimal performance in current titles at the time of a card's release, getting them more performance and efficiency out of a given level of hardware in the short term, whereas AMD might not optimize their architecture quite as much for what's out there now, instead brute-forcing their way to a similar level of performance while providing more hardware than is necessary in some areas. Then, down the line, when that hardware is actually utilized by games (for example, previously untapped compute performance), their cards may have a tendency to pull ahead.

Of course, just because that applied to GCN, doesn't necessarily mean it will apply as much to their newer architectures, like Navi. If the newer architecture becomes better optimized for current titles, it could be that they're leaving less untapped performance on the table.
 
The Nvidia optimized-settings profile for Eternal on my GTX 960 4GB / FX 8320 @ 4.3 GHz is something like low at 1920x1200. The game itself defaulted to high on everything, and it holds 55-65 FPS without issue with everything on high. Not using any fancy screen, just a Dell 24-inch UltraSharp, so only 60 Hz anyway.
 
I don't think Nvidia's driver optimization is that insanely complicated, but it doesn't matter; Nvidia never shares what it does at the driver level to optimize games.

The reason Nvidia GPUs have lower VRAM capacity than AMD's is Nvidia's far superior memory compression technology. This is the main reason 1060 6GB cards perform similarly or equal to 580 8GB cards. (But their memory compression doesn't save them all the time; some games don't benefit as much.)
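As a rough illustration of the idea (a toy sketch of delta-based compression in Python; real GPU delta color compression works on pixel tiles in fixed-function hardware, and it primarily saves bandwidth rather than raw capacity):

```python
def delta_compress(tile):
    """Store the first pixel plus per-pixel deltas; small deltas fit in fewer bits."""
    base = tile[0]
    deltas = [p - base for p in tile[1:]]
    return base, deltas

# A smooth gradient tile compresses well: the deltas are tiny
tile = [100, 101, 102, 103, 104, 105, 106, 107]
base, deltas = delta_compress(tile)

bits_raw = 8 * len(tile)            # 8 bits per raw pixel value
bits_packed = 8 + 3 * len(deltas)   # base at 8 bits; each delta here fits in 3 bits
print(f"raw: {bits_raw} bits, delta-packed: {bits_packed} bits")
```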

The biggest problem with Nvidia's older architectures, namely Kepler 1.0 & 2.0 (ESPECIALLY), Maxwell 1.0 & 2.0, and Pascal, is that they were built with DX11 in mind. Pascal was more of a 50/50 split (DX11/DX12), but the previous architectures were heavily biased towards DX11, because that's all that was used for the majority of titles.
The only exception now is Turing, which is finally Nvidia's first architecture that is truly DX12/Vulkan optimized and not prioritized towards DX11.
(This is easy to identify: take an RTX 2060 Super vs a GTX 1080. In DX11 titles they are equal for the most part, but in DX12/Vulkan titles the 2060 Super almost always has at least a 5-10% lead.)

AMD has always had a major advantage over Nvidia when it comes to newer bare-metal APIs. Remember Mantle? Mantle was the father/prototype of the Vulkan API. That was about 5 years ago, and even back then AMD was optimizing its GCN architectures for the Mantle low-level API. Even though Mantle failed, the architecture's bias towards Mantle paid off tremendously once Vulkan was founded on the Mantle API. And DX12 seems to show the same behavior due to its similar closer-to-the-hardware characteristics.

Going back to DOOM Eternal, I think the problem right now is the devs. They need to fix all the performance bugs before we can get a good outline of actual performance.
 
The reason Nvidia GPUs have lower VRAM capacity than AMD's is Nvidia's far superior memory compression technology. This is the main reason 1060 6GB cards perform similarly or equal to 580 8GB cards. (But their memory compression doesn't save them all the time; some games don't benefit as much.)
In this game, the 1060 6GB performs more like an RX 570. : P


In any case, memory compression isn't making any significant difference in nearly all current games with these cards. Very few titles currently benefit from having access to more than 6GB of VRAM, especially at the resolutions these cards typically run at. So comparing a 6GB card against an 8GB model, the VRAM deficiency isn't going to be all that meaningful at this time, nor are any potential benefits or limitations of memory compression. Maybe when comparing 4GB cards, or perhaps in games coming out a couple of years from now, but otherwise not so much right now.
 
The reason Nvidia GPUs have lower VRAM capacity than AMD's is Nvidia's far superior memory compression technology. This is the main reason 1060 6GB cards perform similarly or equal to 580 8GB cards. (But their memory compression doesn't save them all the time; some games don't benefit as much.)
Unless you are in a situation where VRAM capacity is a limiting factor, having more VRAM on one GPU vs another doesn't matter. By your logic, back when 1GB of VRAM was a lot, the 1GB HD 5670 would have been faster than the 512MB HD 4870 all the time, since they have near-identical memory compression. In the GTX 1060 6GB vs RX 580 comparison, the GTX 1060 was usually faster than the 580 when it was released (benchmark). Now the 580 has aged better than the 1060 and is typically faster (benchmark). However, VRAM capacity has nothing to do with that, since even in Doom Eternal you only need 6GB+ at 4K, and we already know the 580 and 1060 aren't 4K GPUs.

The biggest problem with Nvidia's older architectures, namely Kepler 1.0 & 2.0 (ESPECIALLY), Maxwell 1.0 & 2.0, and Pascal, is that they were built with DX11 in mind. Pascal was more of a 50/50 split (DX11/DX12), but the previous architectures were heavily biased towards DX11, because that's all that was used for the majority of titles. The only exception now is Turing, which is finally Nvidia's first architecture that is truly DX12/Vulkan optimized and not prioritized towards DX11. (This is easy to identify: take an RTX 2060 Super vs a GTX 1080. In DX11 titles they are equal for the most part, but in DX12/Vulkan titles the 2060 Super almost always has at least a 5-10% lead.)
This is at least mostly correct information. The pre-Turing micro-architectures were not designed fully with DX12 in mind. Pascal was far better than Maxwell at DX12 and Vulkan. That is seen in the fact that the GTX 980 was faster than the 1060 at the 1060's launch, but now that more games use Vulkan and DX12, the 1060 is generally faster than the 980. Your statement about the 1080 and 2060 Super is pretty wrong, though. Looking at the 2060 Super launch review, you find these results at 1440p: 2060 Super Launch Benchmarks

1.) Battlefield V (DX12) - 2060 Super +7%
2.) Destiny 2 (DX11) - equal
3.) Far Cry 5 (DX11) - 2060 Super +5%
4.) Final Fantasy XV (DX11) - 2060 Super +12%
5.) Forza Horizon 4 (DX12) - 1080 +10%
6.) Metro Exodus (DX12) - 2060 Super +20%
7.) Shadow of the Tomb Raider (DX12) - 2060 Super +15%
8.) Strange Brigade (Vulkan) - 2060 Super +30%
9.) Tom Clancy's The Division (DX12) - 2060 Super +25%
10.) Tom Clancy's Ghost Recon (DX11) - 2060 Super +8%
11.) The Witcher 3 (DX11) - 2060 Super +15%
12.) Wolfenstein II: The New Colossus (Vulkan) - 2060 Super +25%

In DX12 or Vulkan games the 2060 Super is ahead by about 16% on average. The only game it lost to the 1080 was a DX12 title as well.
In DX11 the 2060 Super is ahead by about 8% on average. DX11 also has the only game in which they tied.

The 2060 Super has 2176 CUDA cores, whereas the GTX 1080 has 2560 Pascal CUDA cores. Clock speed also favors the 1080 over the 2060 Super. The 2060 Super does have more memory bandwidth, at 448 GB/s vs 320 GB/s for the 1080. Based on this information we can see that Turing is a superior architecture compared to Pascal, and it helps not just in DX12/Vulkan but also in DX11.
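A quick sketch of that arithmetic (Python; the boost clocks are the reference specs, which I'm assuming here since the post doesn't list them):

```python
# Launch-review deltas quoted above: +N means the 2060 Super led by N%
dx12_vulkan = [7, -10, 20, 15, 30, 25, 25]  # BFV, FH4, Metro, SotTR, Strange Brigade, Division, Wolfenstein II
dx11 = [0, 5, 12, 8, 15]                    # Destiny 2, FC5, FFXV, Ghost Recon, Witcher 3

print(f"DX12/Vulkan average lead: {sum(dx12_vulkan) / len(dx12_vulkan):.0f}%")  # ~16%
print(f"DX11 average lead: {sum(dx11) / len(dx11):.0f}%")                       # ~8%

# Rough FP32 throughput: 2 FLOPs per CUDA core per clock, reference boost clocks assumed
tflops_2060s = 2 * 2176 * 1.650e9 / 1e12  # ~7.2 TFLOPS
tflops_1080 = 2 * 2560 * 1.733e9 / 1e12   # ~8.9 TFLOPS
print(f"2060 Super {tflops_2060s:.1f} vs 1080 {tflops_1080:.1f} TFLOPS")
# The 2060 Super wins on average despite ~19% less raw FP32 throughput,
# which is the per-clock architectural gain being described.
```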
 
Unless you are in a situation where VRAM capacity is a limiting factor, having more VRAM on one GPU vs another doesn't matter. [...]

This is at least mostly correct information. [...] Based on this information we can see that Turing is a superior architecture compared to Pascal, and it helps not just in DX12/Vulkan but also in DX11.

Thank you for correcting me. I guessed the extra performance was purely due to the APIs. That's cool that Turing is that much faster in any API.
 
What do you mean by X:Y ratio?
It's about CUDA-to-memory clock ratios, shared-to-isolated cache ratios, and so on. The magic of CUDA cores is that you can allocate more of them to rasterization than to shading, etc. If you move those a bit left or right, you can utilize the card at 99.8%, whereas on AMD you utilize the shaders at 100% and the other components at 70-80%, because each game doesn't use the same ratios between components in the same way. Play a bit with voltages, clocks, and which units get more cache, and that is what they gained.
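As a toy model of that utilization argument (a hypothetical Python sketch; the numbers are invented): if a card's fixed resource split doesn't match a game's demand mix, the over-provisioned unit idles.

```python
# Hypothetical sketch of how a tuned ratio lifts overall utilization.
def utilization(card_ratio: float, game_ratio: float) -> float:
    """If the card's resource split doesn't match the game's demand mix,
    the over-provisioned unit idles; utilization is capped by the mismatch."""
    return min(card_ratio / game_ratio, game_ratio / card_ratio)

game_demand = 1.2  # this title wants 1.2x more of unit Y than unit X
generic = utilization(1.0, game_demand)  # one-size-fits-all profile
tuned = utilization(1.2, game_demand)    # per-game tuned profile
print(f"generic profile: {generic:.0%}, tuned profile: {tuned:.0%}")
# -> generic ~83%, tuned 100%: the "99.8% vs 70-80%" idea described above
```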
 

TJ Hooker

Titan
Ambassador
Terascale isn't even close to GCN or RDNA: Terascale is VLIW-based, whereas GCN and RDNA are SIMD-based. All of the Nvidia architectures since at least Fermi are as similar in design to one another as the members of the GCN family are.
Ok, now you seem to be arguing that a new ISA is what differentiates a new architecture. Does that mean that both Intel and AMD haven't released a new CPU architecture in ~40 years?

Do you have a source for your last sentence?
 