News: Nvidia's RTX 40-Series Laptops Hint at Future RTX 4060 and 4050 Desktop Specs


korekan

Commendable
Jan 15, 2021
This probably proves that from the 30xx to the 40xx series, the performance increase comes only from consuming more power, not from better efficiency or new technology.
 

jp7189

Distinguished
Feb 21, 2012
Bingo! And even scaling with resolution does not heavily impact memory requirements. Textures are textures: you will be loading the same textures regardless of render resolution at the same game settings (and if you scale back texture resolution at higher render resolutions - if that's even an option the game exposes - you reduce the memory footprint). The same goes for geometry, so it's the buffers that scale with render resolution. Take 1080p to 4K for example: buffer size quadruples, but even if we assume a good 10 buffers (for depth and normals and Z and diffuse and whatever other buffers your render pipeline involves) at 32bpp each, that goes from ~79MB at 1080p to ~316MB at 4K. Not a huge impact on total vRAM usage.

The vast majority of vRAM is not taken up by buffers or active-use textures and geometry, but by opportunistically cached textures and geometry from the rest of the level that is crammed into any spare vRAM and overwritten (with zero performance impact) if/when actual live data needs that space. That opportunistically cached data may never make its way on screen before being overwritten, but any good engine should be trying to cache it anyway when the PCIe bus is not otherwise occupied and there is spare vRAM, because there is zero penalty from doing so and it may have a small chance of avoiding a cache miss and memory or drive read later. As DirectStorage moves from something individual developers implement to a commonly available API, even that will become less of a necessity as access overheads from out-of-vRAM data are reduced.

When you see a game 'use' large quantities of vRAM, the amount used is almost always what the game has reached by running out of data to cache for the level/chunk loaded, not the amount of data it actually needs for rendering.
I disagree. Asset pop-in is real and it's annoying. In open-world games, more VRAM equals more render distance. Low VRAM means distant assets can't be loaded and ready. I'm sure we've all seen games that render at a great frame rate when the view is static, but spin the camera suddenly and it turns into a stuttering mess of fuzzy/missing textures and assets while you wait for limited VRAM to be swapped around. That's an example of enough GPU but not enough VRAM, and it was often the case for me on a 3080 10GB. Even though I had enough GPU, I had to turn down options to fit within the VRAM. I moved up to a 3090 24GB - barely more GPU performance, but much smoother in some games.
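For reference, the buffer arithmetic in the quoted post is easy to reproduce. A minimal sketch, assuming (as the quote does) 10 full-resolution render targets at 32 bits per pixel - these are illustrative assumptions, not measured figures:

Code:
# Reproduces the quoted post's buffer-size estimate.
# Assumptions: 10 full-resolution render targets, 32 bits (4 bytes) per pixel.

def buffer_mib(width, height, num_buffers=10, bytes_per_pixel=4):
    """Total size in MiB of num_buffers full-resolution render targets."""
    return width * height * bytes_per_pixel * num_buffers / 2**20

print(f"1080p: {buffer_mib(1920, 1080):.0f} MiB")  # ~79
print(f"4K:    {buffer_mib(3840, 2160):.0f} MiB")  # ~316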
 

InvalidError

Titan
Moderator
I wouldn't put it past Nvidia to put 12GB on a 128-bit bus and just run uneven channels. We've seen that before.
There isn't really anything wrong with that as long as the drivers can shuffle stuff between memory channels to keep the net load balanced. Out of 12GB worth of stuff in VRAM, I'm pretty sure the bulk of it would be perfectly fine at half-bandwidth.

Given how Nvidia (and AMD) charge exorbitant amounts for increasingly less silicon though, I don't expect Nvidia to bother being so generous as to give people extra memory.

BTW, we have 24Gbit packages now, so a model with 4x3GB would also be possible.
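To put rough numbers on the even vs. uneven options - back-of-envelope arithmetic only, not a claim about any actual SKU:

Code:
# Illustrative arithmetic only, not a leaked spec. A 128-bit bus is four
# 32-bit channels, each normally driving one GDDR package (two in clamshell).

channels = 128 // 32  # 4

# Even option with the 24Gbit (3GB) packages mentioned above:
even_total_gb = channels * 3              # 4 x 3GB = 12GB, all at full bus width

# Uneven option with common 16Gbit (2GB) packages, doubled up on two channels:
per_channel_gb = [2, 2, 2 + 2, 2 + 2]                 # 12GB total
full_speed_gb = channels * min(per_channel_gb)        # 8GB striped across all 4 channels
half_speed_gb = sum(per_channel_gb) - full_speed_gb   # 4GB confined to 2 of the 4 channels

print(even_total_gb, sum(per_channel_gb), full_speed_gb, half_speed_gb)  # 12 12 8 4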
 
Bingo! And even scaling with resolution does not heavily impact memory requirements. Textures are textures: you will be loading the same textures regardless of render resolution at the same game settings (and if you scale back texture resolution at higher render resolutions - if that's even an option the game exposes - you reduce the memory footprint). The same goes for geometry, so it's the buffers that scale with render resolution. Take 1080p to 4K for example: buffer size quadruples, but even if we assume a good 10 buffers (for depth and normals and Z and diffuse and whatever other buffers your render pipeline involves) at 32bpp each, that goes from ~79MB at 1080p to ~316MB at 4K. Not a huge impact on total vRAM usage.

The vast majority of vRAM is not taken up by buffers or active-use textures and geometry, but by opportunistically cached textures and geometry from the rest of the level that is crammed into any spare vRAM and overwritten (with zero performance impact) if/when actual live data needs that space. That opportunistically cached data may never make its way on screen before being overwritten, but any good engine should be trying to cache it anyway when the PCIe bus is not otherwise occupied and there is spare vRAM, because there is zero penalty from doing so and it may have a small chance of avoiding a cache miss and memory or drive read later. As DirectStorage moves from something individual developers implement to a commonly available API, even that will become less of a necessity as access overheads from out-of-vRAM data are reduced.

When you see a game 'use' large quantities of vRAM, the amount used is almost always what the game has reached by running out of data to cache for the level/chunk loaded, not the amount of data it actually needs for rendering.
I'm not basing the VRAM statements on what utilities claim is used, but on actual real-world testing. Most games at 1080p are fine, a few at 1440p can exceed 8GB of actual use, and a growing number are exceeding 8GB at 4K. From what I've seen, due to the number of buffers used, a game that uses perhaps 4GB of VRAM at 1080p and max settings will need just over 6GB of VRAM at 4K (Red Dead Redemption 2 and several other games that show approximate memory use follow this pattern). And a game that needs 6GB of VRAM at 1080p will need just over 8GB at 4K.

While modern games can opportunistically cache data into VRAM before it's needed, all you have to do is look at performance comparisons between cards with 8GB, 12GB, and 16GB at different settings to determine if the game is truly using more than 8GB. If the 16GB and 8GB cards perform roughly the same at 1080p but the 16GB card is twice as fast at 4K, that's a good indication the game is using more than 8GB VRAM.
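One way to square those observations with the "buffers are small" math quoted above is to split VRAM into a resolution-independent part and a resolution-dependent part and solve from the two data points. A rough sketch using the 4GB/6GB figures cited above (an illustrative simplification, not measured data):

Code:
# Back-of-envelope split of the 4GB-at-1080p / 6GB-at-4K example above.
# Simple model (illustrative only): vram = fixed + per_pixel * pixel_count.

pixels_1080p = 1920 * 1080
pixels_4k = 3840 * 2160                    # exactly 4x the pixels

vram_1080p_gb, vram_4k_gb = 4.0, 6.0       # the figures cited above

per_pixel_gb = (vram_4k_gb - vram_1080p_gb) / (pixels_4k - pixels_1080p)
fixed_gb = vram_1080p_gb - per_pixel_gb * pixels_1080p

print(f"resolution-independent: ~{fixed_gb:.2f} GB")                            # ~3.33
print(f"resolution-dependent at 1080p: ~{per_pixel_gb * pixels_1080p:.2f} GB")  # ~0.67
print(f"resolution-dependent at 4K:    ~{per_pixel_gb * pixels_4k:.2f} GB")     # ~2.67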

The reality is that many games now have 4K textures and even 8K textures. The 8K mipmaps almost never get used, but even storing the 4K and lower mipmaps can use a lot of VRAM. For example, ONE 4K texture in VRAM needs:
4K x 4K: 64MiB
2K x 2K: 16MiB
1K x 1K: 4MiB
512 x 512: 1MiB
256 x 256: 256KiB
128 x 128: 64KiB
64 x 64: 16KiB

That's ~85MiB for a single 4K texture. Games can have literally hundreds and even thousands of textures, though not all of them are used in every scene. If you look at a game's install size and subtract any video and audio files, a 100GiB game often ends up with 50–75GiB of texture data as the primary reason for the large install size. And if you have an HD texture pack, you can get upward of 50GiB of additional storage space used.
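For reference, a tiny sketch that reproduces the ~85MiB figure, assuming uncompressed 32bpp texels as in the table above (block-compressed formats like BC1/BC7 would cut this by roughly 4-8x):

Code:
# Sums the mip chain listed above, assuming 4 bytes per texel (32bpp, uncompressed).

def mip_chain_mib(top_size, bytes_per_texel=4, min_size=64):
    total = 0
    size = top_size
    while size >= min_size:
        total += size * size * bytes_per_texel
        size //= 2
    return total / 2**20

print(f"{mip_chain_mib(4096):.1f} MiB")  # ~85.3 MiB for 4K x 4K down to 64 x 64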
 

Thunder64

Distinguished
Mar 8, 2016
Three years down the road, all the AAA games will still be console ports from the current generation, so what powers 70 native fps today will probably still power 60 native fps. My GTX 980 lasted through a whole console generation, or four PC GPU generations. When was the last AAA game that required the best and latest GPU just to run? My GPU upgrade decisions going forward will be purely AI compute and CG rendering dependent.

Uhh, no? We've heard the "all PC games will be console ports" line before. It has yet to happen. But more importantly, I don't think any AAA game has ever required the "best and latest GPU just to run". That'd be a great way to shoot oneself in the foot and kill sales. A developer could never do that, as they wouldn't make money.
 

Joseph_138

Distinguished
How a mobile GPU performs isn't necessarily an indicator of how the desktop part will perform. The thermal requirements of a mobile GPU are very different from those of a desktop GPU, because there's only so much cooling you can fit into a laptop. Clock speeds, and sometimes even render unit counts, are reduced from their desktop equivalents, or other compromises are made, to make the chip run cooler, even though the name of the part may be the same.