News Nvidia Ampere Architecture Deep Dive: Everything We Know

Chung Leong

Reputable
Dec 6, 2019
493
193
4,860
An overlooked item in Nvidia's presentation is the support for loading compressed assets directly from NVMe SSD to VRAM. That puts the PC on even ground with the next-gen consoles. No more loading corridor/elevator rides! It's going to suck for people who bought RTX 20xx cards. In future cross-platform titles they're going to run into loading screens all the time.
 
Reactions: domih

setx

Distinguished
Dec 10, 2014
225
149
18,760
DLSS is one big scam. If you compute the game at 1080p and draw it on a 4K monitor, is the game running at 1080p or 4K? According to anyone sane it's 1080p; according to DLSS it's 4K (because you get 4K pixels in the end!)
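To put rough numbers on that argument, here is a quick pixel-count sketch (it assumes the common case of a 1080p internal render upscaled to a 2160p output; actual DLSS quality modes use a range of internal resolutions):

```python
# Rough pixel-count comparison behind the "is it really 4K?" argument.
# Assumes a 1080p internal render and a 4K (2160p) output; real DLSS
# quality modes use other internal resolutions, this is just the ratio.
internal = 1920 * 1080  # pixels actually shaded per frame
output = 3840 * 2160    # pixels presented on the 4K display

print(f"internal render: {internal:,} px")           # 2,073,600 px
print(f"4K output:       {output:,} px")             # 8,294,400 px
print(f"shaded fraction: {internal / output:.0%}")   # 25%
```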

An overlooked item in Nvidia's presentation is the support for loading compressed assets directly from NVMe SSD to VRAM. That puts the PC on even ground with the next-gen consoles.
There is nothing remotely close to Nvidia GPUs in the current and next-gen consoles, so the API and usage technique for this feature might be different. The performance gain is also questionable: you can already DMA SSD->RAM and then DMA RAM->VRAM, and on PCIe 4.0 both operations are quite fast.
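For a rough sense of scale, here is a back-of-the-envelope sketch of the staged path being described; the ~7 GB/s NVMe and ~25 GB/s effective x16 link figures and the 2 GB asset size are assumptions, not measurements:

```python
# Back-of-the-envelope timing for staging a compressed asset through system
# RAM versus a hypothetical direct SSD->VRAM path. Link speeds are rough
# assumptions: ~7 GB/s for a PCIe 4.0 x4 NVMe drive, ~25 GB/s effective for
# the GPU's PCIe 4.0 x16 link.
ASSET_GB = 2.0        # example asset size, arbitrary
NVME_GBPS = 7.0       # SSD -> RAM (DMA over PCIe 4.0 x4)
GPU_LINK_GBPS = 25.0  # RAM -> VRAM (DMA over PCIe 4.0 x16)

staged = ASSET_GB / NVME_GBPS + ASSET_GB / GPU_LINK_GBPS  # two sequential hops
direct = ASSET_GB / NVME_GBPS                             # one hop, SSD-limited

print(f"staged SSD->RAM->VRAM: {staged * 1000:.0f} ms")   # ~366 ms
print(f"direct SSD->VRAM:      {direct * 1000:.0f} ms")   # ~286 ms
# If the two hops are pipelined, the staged path is limited mostly by the
# SSD anyway -- which is essentially the point being made above; the bigger
# practical wins are GPU-side decompression and fewer CPU-managed copies.
```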
 
DLSS is one big scam. If you compute the game at 1080p and draw it on a 4K monitor, is the game running at 1080p or 4K? According to anyone sane it's 1080p; according to DLSS it's 4K (because you get 4K pixels in the end!)


There is nothing remotely close to Nvidia GPUs in the current and next-gen consoles, so the API and usage technique for this feature might be different. The performance gain is also questionable: you can already DMA SSD->RAM and then DMA RAM->VRAM, and on PCIe 4.0 both operations are quite fast.
Techspot had an article about DLSS a few months ago.


DLSS 1.0 was indeed "far from perfect", but DLSS 2.0 is much improved.
 

vinay2070

Distinguished
Nov 27, 2011
255
58
18,870
When can we expect benchmarks of the 3080 and 3090? Also, can you please include a few 3440x1440 benchmarks for reference? And PCIe 3.0 vs 4.0 scaling? Thanks.
 

nofanneeded

Respectable
Sep 29, 2019
1,541
251
2,090
The good news about the 3090 and 3080 is the new heat-pipe cooler with two opposing fans. In the past they only talked about a vapor chamber, not heat pipes. Now, for the first time, I might get the reference card.

Moreover, the squeezed PCB is a crazy thing. Just think of an RTX 3080/3090 with a PCB only 170mm long, water cooled.
Great for an SFF PC.
 
Quality of it is a completely separate matter and doesn't change the fact that "DLSS 4K" is not 4K, while sounding close enough to fool most non-tech-savvy people.
Technically DLSS 4K isn't rendering at 4K. But it still doesn't take away the fact that we can get image quality that's close enough to native resolution without having to pay the price for the native rendering.

But if we're going to talk about DLSS 4K not being "true 4K rendering", then what about VRS (variable rate shading) that's being pushed around? Is native 4K rendering with VRS "true 4K rendering"? And if not, should we also call VRS a scam?
 

Sketro

Honorable
Jan 16, 2014
2
0
10,510
An overlooked item in Nvidia's presentation is the support for loading compressed assets directly from NVMe SSD to VRAM. That puts the PC on even ground with the next-gen consoles. No more loading corridor/elevator rides! It's going to suck for people who bought RTX 20xx cards. In future cross-platform titles they're going to run into loading screens all the time.

Yeah, sure, if you have PCIe 4.0 is what I heard. Without it, what's the bandwidth like? Hopefully some benchmarks soon.
 

domih

Reputable
Jan 31, 2020
187
170
4,760
RTX IO is not mentioned in the article but deserves a closer look.

Nvidia acquired Mellanox in 2019 and, as expected, the new Nvidia graphics cards include SerDes and Mellanox networking technologies to offload storage and networking from the CPU and handle it directly on the GPU. Think of your GPU being able to do 100, 200... Gbps for storage or networking (GPU memory direct to GPU memory, no CPU IP stack involved).

The future applications are tremendous: clusters of GPUs that need minimal CPU resources while processing giant AI tasks (or other intense computing) inside rack cabinets full of GPU cards.

Again, no wonder Nvidia wants to buy ARM: it could then sell complete processing solutions (e.g. giant AI solutions with horizontal scaling) without spending a kopek on Intel or AMD x86 CPUs, with the OS and orchestration running on ARM-based CPUs.

Offloading processing from the CPU has been a constant of this industry since the early beginnings, DMA being the initial impulse.
 

Chung Leong

Reputable
Dec 6, 2019
493
193
4,860
The performance gain is also questionable: you can already DMA SSD->RAM and then DMA RAM->VRAM, and on PCIe 4.0 both operations are quite fast.

In the middle of a game, the GPU would be near 100% load. I don't think it can process the CPU's DMA request in a timely fashion. So we want the DMA controller on the NVMe device to handle the whole operation. That's my understanding.
 

setx

Distinguished
Dec 10, 2014
225
149
18,760
Technically DLSS 4K isn't rendering at 4K. But it still doesn't take away the fact that we can get image quality that's close enough to native resolution without having to pay the price for the native rendering.
Oh, we have plenty of such things that started as "close enough to" and went completely stupid after a while. Just look at how CPU process nodes are named.
Also, "close enough" is purely subjective and never works for certain things (like the readability of small text).

But if we're going to talk about DLSS 4K not being "true 4K rendering", then what about VRS (variable rate shading) that's being pushed around? Is native 4K rendering with VRS "true 4K rendering"? And if not, should we also call VRS a scam?
With VRS at least the most important portion of the image is rendered at that resolution, while with DLSS nothing is. Pure scam.

In the middle of a game, the GPU would be near 100% load. I don't think it can process the CPU's DMA request in a timely fashion. So we want the DMA controller on the NVMe device to handle the whole operation. That's my understanding.
Your understanding is completely wrong: DMA was made exactly so that data transfers don't take any CPU time. Without DMA, any HDD operation would bring one of your CPU's cores to 100%. The "100% GPU load" you see is 100% of the compute cores, not 100% of all the different parts of the GPU (including the video decode/encode engine, ...)
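To illustrate that point about the copy/DMA engines being separate from the compute cores, here is a minimal sketch; PyTorch, the buffer size, and the matrix size are assumptions made purely for illustration:

```python
# Minimal sketch (PyTorch assumed): a host->device copy from pinned memory
# with non_blocking=True is serviced by the GPU's DMA/copy engines and can
# overlap with compute running on another stream -- i.e. "100% GPU load" on
# the compute cores doesn't mean the transfer hardware is busy.
import torch

assert torch.cuda.is_available()

compute_stream = torch.cuda.Stream()
copy_stream = torch.cuda.Stream()

host_buf = torch.empty(64_000_000, pin_memory=True)  # pinned so DMA can be used
a = torch.randn(4096, 4096, device="cuda")

with torch.cuda.stream(compute_stream):
    for _ in range(50):
        _ = a @ a                                     # keep the compute cores busy

with torch.cuda.stream(copy_stream):
    dev_buf = host_buf.to("cuda", non_blocking=True)  # async DMA copy on its own stream

torch.cuda.synchronize()                              # wait for both streams to finish
print("copy and compute both completed")
```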
 
Quality of it is a completely separate matter and doesn't change the fact that "DLSS 4K" is not 4K, while sounding close enough to fool most non-tech-savvy people.
Calling it DLSS 4K isn't a deceptive marketing term unless you advertise the performance with DLSS enabled, without mentioning that DLSS is being used, as being better than a competitor that doesn't have DLSS.

The above assumes you could objectively tell the difference by comparing screenshots and seeing that DLSS 4K has inferior quality to a native 4K image.

Having said that, if Nvidia gets DLSS to a point where you could not tell the difference in quality between a 1440p and a 4K screenshot, even when zoomed in, then is it really fooling non-tech-savvy people, or really anyone, if you used DLSS 4K and 4K interchangeably?

Side note: If you randomly added DLSS to the beginning of something most people would ask what that meant.
Aquafina DLSS Pure Water (I'd ask a lot of questions before drinking lol)
Chevrolet DLSS Camaro
DLSS BLT
 
Side note: If you randomly added DLSS to the beginning of something most people would ask what that meant.
Aquafina DLSS Pure Water (I'd ask a lot of questions before drinking lol)
Chevrolet DLSS Camaro
DLSS BLT
I'm going to order a DLSS BLT for lunch today!
 
Reactions: hotaru.hino

spongiemaster

Admirable
Dec 12, 2019
2,276
1,280
7,560
I have to say I'm a bit disappointed in regards to memory. 8GB has been pretty much the mainstream standard for the last two generations and worked pretty well from FHD to QHD. I was hoping this gen would finally achieve mainstream 4K, but I fear 8GB/10GB will be lacking for that.
Improved compression and NVCache should reduce the memory demands vs previous generations.
 
I have to say I'm a bit disappointed in regards to memory. 8GB has been pretty much the mainstream standard for the last two generations and worked pretty well from FHD to QHD. I was hoping this gen would finally achieve mainstream 4K, but I fear 8GB/10GB will be lacking for that.
I've been hoping for a while to see someone really dig into how VRAM is actually used in games. Some games, like FFXV and Call of Duty (there's an option for it), will happily eat all the VRAM, but somehow performance is stable. I'm also not really convinced that the VRAM counters in games like GTAV or Resident Evil 2 are accurate per se. I did a test in GTAV where it claimed it would use something like 1GB of VRAM, but by the end of a benchmark run it was using 1.7GB. I also suspect that a good chunk of VRAM is just swap space for whatever textures are needed at the moment. Once the game no longer needs them, the textures get dumped for something else.
 

setx

Distinguished
Dec 10, 2014
225
149
18,760
Having said that, if Nvidia gets DLSS to a point where you could not tell the difference in quality between a 1440p and a 4K screenshot, even when zoomed in , then is it really fooling not tech-savvy people or really anyone if you used DLSS 4K and 4K interchangeably?
If you could not tell the difference, that means only one thing: the source also isn't at a 4K level of detail. This has been very well known in image and video processing for many years.

Calling it DLSS 4K isn't a deceptive marketing term unless you advertise the performance with DLSS enabled without mentioning DLSS being used as being better than the competitor that does not have DLSS.
Yes, it is your typical deceptive marketing: enough to fool many people but not enough to get them into trouble with the courts.
 
I've been hoping for a while to see someone really dig into how VRAM is actually used in games. Some games, like FFXV and Call of Duty (there's an option for it), will happily eat all the VRAM, but somehow performance is stable. I'm also not really convinced that the VRAM counters in games like GTAV or Resident Evil 2 are accurate per se. I did a test in GTAV where it claimed it would use something like 1GB of VRAM, but by the end of a benchmark run it was using 1.7GB. I also suspect that a good chunk of VRAM is just swap space for whatever textures are needed at the moment. Once the game no longer needs them, the textures get dumped for something else.
Honestly, the best way to look at VRAM requirements is to test performance. If you see a massive drop off in performance past a certain point while keeping resolution the same, you've probably exceeded the VRAM of your GPU. CoD definitely allocates as much VRAM as your GPU has available, but it doesn't need all that VRAM. Utilities like MSI Afterburner and GPU-Z also show allocated VRAM rather than what is actually needed / used. There aren't many games that will really use more than 8GB VRAM effectively right now at 4K, without mods, but it can be done. I'm pretty sure MS Flight Simulator makes decent use of GPUs with more VRAM for example.
 
CoD definitely allocates as much VRAM as your GPU has available, but it doesn't need all that VRAM. Utilities like MSI Afterburner and GPU-Z also show allocated VRAM rather than what is actually needed / used.
I think this point needs to be driven home. Currently, VRAM counters (I think even Task Manager's, even though it uses a different measuring point) only report allocated VRAM. But that doesn't necessarily mean it's VRAM that's needed for the given frame the game is trying to render. I've read that a similar thing happens with Windows (and possibly any modern OS) system memory management, so it wouldn't surprise me if the same thing happens with 3D apps and video cards.
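For anyone curious what those counters are actually reading, here is a minimal sketch using the NVML Python bindings (the pynvml package is an assumption); nvmlDeviceGetMemoryInfo reports the VRAM the driver has allocated, which is the same "allocated, not necessarily needed" figure being discussed:

```python
# Minimal sketch using NVIDIA's NVML Python bindings (pynvml assumed).
# nvmlDeviceGetMemoryInfo reports VRAM the driver has *allocated* on the
# device -- the same kind of number Afterburner/GPU-Z show -- not the
# per-frame working set a game actually touches.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU
info = pynvml.nvmlDeviceGetMemoryInfo(handle)

print(f"total VRAM:     {info.total / 2**30:.1f} GiB")
print(f"allocated/used: {info.used / 2**30:.1f} GiB")
print(f"free:           {info.free / 2**30:.1f} GiB")

pynvml.nvmlShutdown()
```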