News Nvidia Reveals DLSS 3.5: AI-Powered Ray Reconstruction

Those aren't "cheats". Everything in the graphics pipeline is full of tricks to optimize performance, and using AI is no different. It's math. It's not a moral question.
Polygon rasterization is inherently a cheat. You're using polygons to approximate the geometry of a real object, and then painting them on the screen to approximate the way light would project the image onto a camera's image sensor. Polygon rasterization doesn't even pretend to be physically accurate, and it deviates to such a degree that you can't honestly characterize it as merely an "optimization".

Games were always pixelated, and you could also claim that those are "cheats".
Eh, depends on how you look at it, but if we're treating the image plane as a "virtual camera", then I see nothing wrong with quantizing it in the spatial domain.

The original Tomb Raider had triangles which switched perspectives as the camera moved, jerking from one position to another, because the arithmetic used didn't have enough precision, but that's what allowed the game to run.
That & your Quake example are merely optimizing a cheat. You're missing the forest for the trees here.

My point was that graphics is inherently a cheat, so it strikes me as a bit silly for people to take such a principled stand that some pixels are "real" and others aren't. All that ultimately matters is how fast it is vs. how good it looks.

Whether to use DLSS is simply a value judgement. Would you rather game at higher resolution & framerates, or would you rather have "higher fidelity" pixels at a lower resolution & framerate? At some level DLSS just needs to look "good enough", and that's basically just the point at which it wins out over the alternative.

In a sense, what I'm actually arguing is that the whole "moral" stance some people seem to take on DLSS is drawing an artificial distinction. I think people should simply be pragmatic and use it if they think it looks good enough. If not, don't.

Based on your Tomb Raider comment, I hope we agree that this quest for the "true image" is artificial and ultimately self-defeating.
 
"For Cyberpunk 2077, running in RT Overdrive, there are several improvements. First, there's again an improvement in quality with less splotchiness. There are also clearer reflections, for example on the top of the vehicle, or in the puddles on the street. "

I find this interesting. Based on the materials being reflected, I would expect the reflection in the water to look diffuse rather than sharp, because the water appears to be moving, not static. Unless it's completely still, it shouldn't look that sharp.

Same with the top of the car. If anything, DLSS is distorting how it would look in real life.
 
Based on the materials being reflected, I would expect the reflection in the water to look diffuse rather than sharp, because the water appears to be moving, not static. Unless it's completely still, it shouldn't look that sharp.
I'd guess that's probably on the game for not bothering to model ripples in the puddle. With images this sharp, they should probably consider things like that.

However, on a still night, it's not uncommon to see sharp reflections in puddles.

Same with the top of the car. If anything, DLSS is distorting how it would look in real life.
It's not hard to find cars that glossy. It's just a question of what the game was going for.
 
  • Like
Reactions: JarredWaltonGPU
So is graphics.

I find it funny that I don't recall seeing virtually any outrage over techniques like VRS, even though it's also a smart interpolation technique. I know frame generation is more controversial, but if you're just talking about smart upsampling/reconstruction, the distinction would seem to be fairly arbitrary.

I'd rather take ray tracing with DLSS 3.5 than traditional polygon-driven rendering. Ray tracing is a lot more faithful to the underlying physics, even with DLSS in the loop.

Oh, and one more thing... your brain is also doing neural reconstruction and interpolation. So, there's that.
Out of those two (FG and RR, I mean), I'm glad my Ampere card gets RR rather than FG. Not that I find FG bad per se, but I do prefer quality.
 
  • Like
Reactions: bit_user
Out of those two (FG and RR, I mean), I'm glad my Ampere card gets RR rather than FG. Not that I find FG bad per se, but I do prefer quality.
To be honest, I've never tried Frame Generation (I don't have an RTX 4000 card), so I can't say how bad the artifacts are. I expect it probably depends a lot on the game. I do know that VR uses similar techniques to good effect, but that requires a more intrusive implementation than FG might use.
 
To be honest, I've never tried Frame Generation (I don't have an RTX 4000 card), so I can't say how bad the artifacts are. I expect it probably depends a lot on the game. I do know that VR uses similar techniques to good effect, but that requires a more intrusive implementation than FG might use.
I generally don't notice the artifacts while gaming with FG on. I do (depending on the game) notice the higher latency, though. It depends on what game you're playing and how quickly you move the camera around. I'm far less concerned about the potential for artifacts than I am with the changes to latency and feel.
 
  • Like
Reactions: bit_user
I generally don't notice the artifacts while gaming with FG on. I do (depending on the game) notice the higher latency, though. It depends on what game you're playing and how quickly you move the camera around. I'm far less concerned about the potential for artifacts than I am with the changes to latency and feel.
In my opinion, the right way to use Frame Generation is how VR does it. It should only kick in when the framerate dips below a certain threshold, and never increase latency.
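Something like this toy loop is what I mean, in the spirit of VR reprojection: only synthesize a frame when the real one misses the refresh deadline. All of the numbers and names here are made up for illustration; this is not how DLSS FG works today.

```python
# Toy simulation of "only synthesize when the real frame misses the deadline".
# Invented numbers, purely illustrative of the scheduling policy.
import random

REFRESH_HZ = 90
DEADLINE = 1.0 / REFRESH_HZ                      # seconds per display refresh

def simulate(refreshes=450, mean_render=0.009, jitter=0.004, seed=1):
    rng = random.Random(seed)
    real = synthesized = 0
    for _ in range(refreshes):
        render_time = mean_render + rng.uniform(-jitter, jitter)
        if render_time <= DEADLINE:
            real += 1                            # real frame made the deadline
        else:
            synthesized += 1                     # reproject the last frame instead of stalling
    return real, synthesized

print(simulate())                                # (real frames shown, synthesized frames shown)
```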
 
To be honest, I've never tried Frame Generation (I don't have an RTX 4000 card), so I can't say how bad the artifacts are. I expect it probably depends a lot on the game. I do know that VR uses similar techniques to good effect, but that requires a more intrusive implementation than FG might use.
I think for FG to produce a really high-quality image, it needs frames that are really close together. 30 fps + FG will look like crap; 60 fps + FG will probably look great.
I'm 36 though, and I'm starting to appreciate slower-paced games more and more, so quality matters more to me. Not that Doom Eternal terrifies me with its pace, but I can't relax while playing it. I don't need games to give me a rush anymore.
 
I think for FG to produce a really high-quality image, it needs frames that are really close together. 30 fps + FG will look like crap; 60 fps + FG will probably look great.
I'm 36 though, and I'm starting to appreciate slower-paced games more and more, so quality matters more to me. Not that Doom Eternal terrifies me with its pace, but I can't relax while playing it. I don't need games to give me a rush anymore.
This is true. FG has less of a bad feel to it at higher FPS. Which is also sort of dumb because the benefits at a high framerate become less noticeable. If you're playing a demanding game (like Hogwarts Legacy for example) on a lower end RTX 40-series card and at 4K, the "benefits" are definitely not there. Things actually start to break if the base framerate isn't at least 20 fps, and preferably 30 FPS or more.
 
  • Like
Reactions: bit_user
My only concern with stuff like DLSS is when doing comparisons, like benchmarks of company A's product at 2160p vs. company B's product at "DLSS virtual 2160p but really 1080p". Comparisons should always be like for like, with vendor-proprietary stuff done on the side as a bonus. DLSS is essentially just another upscaler, a good one, but still an upscaler.

And FG is terrible: it just increases latency while tricking FPS counters into thinking more frames are being rendered. It only works at high FPS, and then you don't need it anyway.
 
Aren't they rendering more frames, though?

No, they are using AI to do a form of interframe blending. They render two frames, then use an algorithm to generate a frame of what they think might come next while they are rendering the third frame. At high frame rates they can do that approximation every other frame. In both situations, that fake frame followed by the real frame is what causes the input lag, along with artifacts if the algorithm guesses wrong.

The usual process of rendering a frame is called rasterization (ray tracing is the other way). It's where you make a 2D (raster) image representation of 3D (vector) space using math. Since we want more than bare polygons, we also use math to twist and rotate textures to determine what color each rasterized pixel should be. Since we want lighting and shadows, we then analyze point light sources and do more math to calculate the shading effect they have on all those nearby pixels. There are other things that happen after this, but it should be obvious that each frame takes a lot of math to process. The speed at which we can do that math is what we call frames per second (FPS).

FG doesn't do any of this; it just looks at the previous results and guesses what the next result might look like. And because it pushes that image into the frame buffer, it increments the FPS counter by one even though it never actually rasterized anything.

This is why it makes the FPS count go up but also introduces terrible input lag and janky artifacts. Nvidia is quite literally trying to display something that hasn't happened yet.
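To make the "lots of math per frame" point concrete, here's a tiny Python sketch of the two steps described above: projecting a 3D point to a 2D pixel position, and shading it against a single point light. It has nothing to do with any real engine or with DLSS; the numbers and the one-light Lambert shading are just illustrative.

```python
# Minimal illustration of per-pixel rasterization math: perspective projection
# plus simple diffuse (Lambertian) shading. Purely educational, not an engine.
import math

def project(p, fov_deg=90.0, width=1920, height=1080):
    """Perspective-project a camera-space point p = (x, y, z), z > 0, to pixels."""
    f = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    aspect = width / height
    x_ndc = (f / aspect) * p[0] / p[2]
    y_ndc = f * p[1] / p[2]
    return ((x_ndc * 0.5 + 0.5) * width, (1.0 - (y_ndc * 0.5 + 0.5)) * height)

def lambert(normal, to_light, albedo=(0.8, 0.6, 0.4)):
    """Diffuse shading: surface color scaled by the cosine of the angle to the light."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, to_light)))
    return tuple(c * n_dot_l for c in albedo)

print(project((0.5, 0.25, 2.0)))                    # where that point lands on a 1080p screen
print(lambert((0.0, 0.0, -1.0), (0.0, 0.0, -1.0)))  # a surface facing straight at the light
```

A real rasterizer runs this kind of math per vertex and per pixel, millions of times a frame, which is where the FPS cost comes from.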
 
No, they are using AI to do a form of interframe blending. They render two frames, then use an algorithm to generate a frame of what they think might come next while they are rendering the third frame.
IMO, they ought to do it basically like TAA, where you rasterize all the geometry and compute the texture coordinates, then tell the renderer to borrow (or blend with, in the case of TAA) the pixel colors of the same spot on that object from the prior frame (i.e. at the same texture coordinates of the same polygon). Where you have surfaces that weren't previously visible, you can go ahead and run the fragment shaders to render them for real.

That way, you take no latency hit, because you're computing it as an extrapolation instead of interpolation. Of course, it'd probably be slower than the current Frame Generation method, since you still need to rasterize. However, I think primary rasterization probably takes a small minority of the entire frame rendering time.

The main downside of this approach is that it's very intrusive, but then so is TAA and the industry embraced that.
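For anyone who wants to picture that reuse-by-texture-coordinate idea, here's a rough Python sketch. The data structures are invented for illustration; this is not how DLSS or any shipping engine does it. You rasterize the new frame's visibility, look each visible surface point up in a cache from the prior frame, and only run the expensive shading for points that weren't visible before.

```python
# Sketch of "borrow the prior frame's shading wherever the same surface point
# is still visible". Invented data structures, purely illustrative.

def expensive_shade(surface_id, u, v):
    # Stand-in for actually running the fragment shader.
    return (surface_id * 0.1, u, v)

def extrapolated_frame(visible_points, prior_cache):
    """visible_points: (surface_id, u, v) keys from rasterizing the new frame's geometry.
    prior_cache: {(surface_id, u, v): color} saved from the previous real frame."""
    frame, newly_shaded = [], 0
    for key in visible_points:
        color = prior_cache.get(key)
        if color is None:                  # this surface point wasn't visible last frame
            color = expensive_shade(*key)
            newly_shaded += 1
        frame.append(color)
    return frame, newly_shaded

prior = {(1, 0.25, 0.50): (0.10, 0.25, 0.50)}              # one cached surface point
frame, shaded = extrapolated_frame([(1, 0.25, 0.50), (2, 0.00, 0.00)], prior)
print(shaded, "of", len(frame), "pixels actually shaded")  # 1 of 2
```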

This is why it makes the FPS count go up but also introduces terrible input lag and janky artifacts. Nvidia is quite literally trying to display something that hasn't happened yet.
If they're implementing via interpolation, then the artifacts are probably akin to what you'd see on a TV with motion smoothing enabled. Those aren't due to extrapolation error, but rather interpolation errors and errors in local transform estimation.
 
  • Like
Reactions: TJ Hooker
IMO, they ought to do it basically like TAA, where you rasterize all the geometry and compute the texture coordinates, then tell the renderer to borrow (or blend with, in the case of TAA) the pixel colors of the same spot on that object from the prior frame (i.e. at the same texture coordinates of the same polygon). Where you have surfaces that weren't previously visible, you can go ahead and run the fragment shaders to render them for real.

That way, you take no latency hit, because you're computing it as an extrapolation instead of interpolation. Of course, it'd probably be slower than the current Frame Generation method, since you still need to rasterize. However, I think primary rasterization probably takes a small minority of the entire frame rendering time.

The main downside of this approach is that it's very intrusive, but then so is TAA and the industry embraced that.
I asked Nvidia about possible options to make Frame Gen better. In short, doing it in a better way (like not doing Frame Gen on the HUD) would require game devs to do a lot more work on integration, and that wasn't likely to happen. So Frame Gen is as "simple" as possible in order to get more uptake. I'd say the uptake is also pretty good so far.

To the OP of this thread, though, I wouldn't say FG makes latency "terrible" nor does it generally add "janky artifacts." It makes latency worse, however, and occasionally has artifacts that you might notice. Because this is the ideal latency case:

[User Input] -> [Update World] -> [Render Frame] -> [Swap Buffer]

So here, you're showing on the monitor the results of the most recent user input as quickly as possible. Frame Gen does this:

[User Input] -> [Update World] -> [Render Frame n] -> [Swap Buffer to n-1 Frame] -> [Use n-1 and n Frames for FG] -> [FG Frame] -> [Swap Buffer to FG Frame] -> [New User Input] -> [Update World] -> [Render Frame n+1] -> [Swap Buffer to Frame n]

So you're always seeing two additional frames of latency relative to non-FG. At 100 fps (with FG), that's 20ms of latency. At 60 fps, it's 33.3ms of latency, and at 40 fps, it's 50ms of latency. If the baseline is without Reflex, the difference is smaller, but that's just using the Reflex benefit to pay for the added latency.
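For anyone who wants to sanity-check those numbers, here's the arithmetic as a trivial Python snippet. It only restates the figures above: two extra frame times at the FG output frame rate.

```python
# Added latency from two extra frames at a given FG output frame rate.
def added_latency_ms(fg_output_fps, extra_frames=2):
    return extra_frames * 1000.0 / fg_output_fps

for fps in (100, 60, 40):
    print(f"{fps} fps with FG -> ~{added_latency_ms(fps):.1f} ms of added latency")
# 100 fps -> 20.0 ms, 60 fps -> 33.3 ms, 40 fps -> 50.0 ms
```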
 
  • Like
Reactions: bit_user
If they're implementing via interpolation, then the artifacts are probably akin to what you'd see on a TV with motion smoothing enabled. Those aren't due to extrapolation error, but rather interpolation errors and errors in local transform estimation.

Normal interpolation is when you already know the future and just insert a new frame between two existing ones. In FG's case, the future hasn't happened, and they are guessing what it'll look like based on the past.

It's really important to realize that they are attempting to guess what a frame would look like as the next one is being rasterized. At 60 fps you have 16.66 ms between the display of one frame and when the next needs to happen. Nvidia is using algorithms to guess what a frame would look like before that frame is finished and send that guess to the display buffer. Think of it as, instead of rendering at 90 fps, they're rendering 60 fps and just inserting guesses in between to make it look like 90 fps. It's not actually a 50% increase, but you get the idea. The janky artifacts happen when those guesses are wrong; it's like what happens when a processor's branch prediction makes a mistake, except the result is already in the display buffer and likely on its way to the screen.
 
Normal interpolation is when you already know the future and just insert a new frame between two existing ones. In FG's case, the future hasn't happened, and they are guessing what it'll look like based on the past.

It's really important to realize that they are attempting to guess what a frame would look like as the next one is being rasterized. At 60 fps you have 16.66 ms between the display of one frame and when the next needs to happen. Nvidia is using algorithms to guess what a frame would look like before that frame is finished and send that guess to the display buffer. Think of it as, instead of rendering at 90 fps, they're rendering 60 fps and just inserting guesses in between to make it look like 90 fps. It's not actually a 50% increase, but you get the idea. The janky artifacts happen when those guesses are wrong; it's like what happens when a processor's branch prediction makes a mistake, except the result is already in the display buffer and likely on its way to the screen.
No, this is fundamentally wrong.

Frame Generation interpolates, via AI and the OFA, between two already rendered frames. This is why it adds latency.

Frame 1 gets rendered.
Frame 3 gets rendered.
Frame 1 gets shown.
Frame 2 gets interpolated via Frame Generation.
Frame 2 gets shown.
Frame 5 gets rendered.
Frame 3 gets shown.
Frame 4 gets interpolated via Frame Generation.
Frame 4 gets shown.
...

That is what Frame Generation does. It's absolutely "normal" interpolation and doesn't try to guess at what a future frame might look like. It would be more impressive (and more likely to break) if it tried to predict a future frame. I thought (when it was first announced) that it would be doing AI prediction, but that was clarified before the RTX 4090 even hit retail shelves.
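If it helps, here's a little Python toy that just replays that ordering: render the odd frames for real, interpolate the even ones from their two neighbors, and always present one real frame behind the most recently rendered one. It's purely illustrative of the scheduling, not of how the OFA/CNN side works.

```python
# Replays the render/show/interpolate ordering from the list above.
def fg_schedule(real_frames=4):
    events, rendered = [], []
    for n in range(1, 2 * real_frames, 2):          # real frames are 1, 3, 5, ...
        events.append(f"render frame {n}")
        rendered.append(n)
        if len(rendered) >= 2:
            a, b = rendered[-2], rendered[-1]
            events.append(f"show frame {a}")
            events.append(f"interpolate frame {(a + b) // 2} from frames {a} and {b}")
            events.append(f"show frame {(a + b) // 2}")
    return events

for event in fg_schedule():
    print(event)
```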
 
No, this is fundamentally wrong.

Frame Generation interpolates, via AI and the OFA, between two already rendered frames. This is why it adds latency.

Frame 1 gets rendered.
Frame 3 gets rendered.
Frame 1 gets shown.
Frame 2 gets interpolated via Frame Generation.
Frame 2 gets shown.
Frame 5 gets rendered.
Frame 3 gets shown.
Frame 4 gets interpolated via Frame Generation.
Frame 4 gets shown.
...

That is what Frame Generation does. It's absolutely "normal" interpolation and doesn't try to guess at what a future frame might look like. It would be more impressive (and more likely to break) if it tried to predict a future frame. I thought (when it was first announced) that it would be doing AI prediction, but that was clarified before the RTX 4090 even hit retail shelves.
Reading the DLSS-G documentation on GitHub, there's a mention that when CPU-bottlenecked, the reported fps can be 2x higher than what's actually displayed due to timing issues.
 
No, this is fundamentally wrong.

Frame Generation interpolates, via AI and the OFA, between two already rendered frames. This is why it adds latency.

Frame 1 gets rendered.
Frame 3 gets rendered.
Frame 1 gets shown.
Frame 2 gets interpolated via Frame Generation.
Frame 2 gets shown.
Frame 5 gets rendered.
Frame 3 gets shown.
Frame 4 gets interpolated via Frame Generation.
Frame 4 gets shown.
...

That is what Frame Generation does. It's absolutely "normal" interpolation and doesn't try to guess at what a future frame might look like. It would be more impressive (and more likely to break) if it tried to predict a future frame. I thought (when it was first announced) that it would be doing AI prediction, but that was clarified before the RTX 4090 even hit retail shelves.

If all it's doing is just regular interpolation, then it's really only interframe blending with a bunch of added delay. Even worse than I suspected; it's just there to trick FPS counters.
 
If all it's doing is just regular interpolation, then it's really only interframe blending with a bunch of added delay. Even worse than I suspected; it's just there to trick FPS counters.
I use motion smoothing on my TV, and I can assure you it's worth more than tricking some counter!

The interpolation method it's using is a heck of a lot more sophisticated than a simple weighted blending (i.e. cross-fade). Here's how they describe it:

Powered by new fourth-generation Tensor Cores and a new Optical Flow Accelerator on GeForce RTX 40 Series GPUs, DLSS 3 is the latest iteration of the company’s critically acclaimed Deep Learning Super Sampling technology and introduces a new capability called Optical Multi Frame Generation.

Optical Multi Frame Generation generates entirely new frames, rather than just pixels, delivering astounding performance boosts. The new Optical Flow Accelerator incorporated into the NVIDIA Ada Lovelace architecture analyzes two sequential in-game images and calculates motion vector data for objects and elements that appear in the frame, but are not modeled by traditional game engine motion vectors. This dramatically reduces visual anomalies when AI renders elements such as particles, reflections, shadows and lighting.

Pairs of super-resolution frames from the game, along with both engine and optical flow motion vectors, are then fed into a convolutional neural network that analyzes the data and automatically generates an additional frame for each game-rendered frame — a first for real-time game rendering. Combining the DLSS-generated frames with the DLSS super-resolution frames enables DLSS 3 to reconstruct seven-eighths of the displayed pixels with AI, boosting frame rates by up to 4x compared to without DLSS.

Source: https://nvidianews.nvidia.com/news/...red-frame-generation-for-up-to-4x-performance

I also found these slides:

[Slides: how NVIDIA DLSS 3 works; DLSS 3 motion / optical flow accelerator; DLSS 3 without optical flow; DLSS 3 motion / optical flow estimation]


Source: https://www.nvidia.com/en-us/geforce/news/gfecnt/20229/dlss3-ai-powered-neural-graphics-innovations/

I can tell you from some experience that good quality optical flow is neither easy nor cheap, especially with effects like particles and translucency. It's a higher bar than what's needed for video compression, since a codec only cares about minimizing residuals and not about what most accurately models the actual object & camera motion.
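If you want to play with the concept on the CPU, here's a rough Python/OpenCV sketch of "interpolate a middle frame from two frames via optical flow". To be clear, this is a stand-in, not Nvidia's method: DLSS 3 uses the hardware Optical Flow Accelerator plus a CNN and game motion vectors, while this uses Farneback flow, a naive half-way warp, synthetic frames generated in the script, and no occlusion handling at all (which is exactly where it gets hard).

```python
# CPU-side analogue of flow-based frame interpolation, purely illustrative.
import cv2
import numpy as np

def make_frame(offset):
    """Synthetic 'rendered frame': a bright square sliding across the screen."""
    img = np.zeros((240, 320, 3), np.uint8)
    cv2.rectangle(img, (40 + offset, 100), (100 + offset, 160), (200, 180, 90), -1)
    return img

frame1, frame2 = make_frame(0), make_frame(40)
gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

# Flow from frame2 back to frame1: where each frame2 pixel was in frame1.
flow = cv2.calcOpticalFlowFarneback(gray2, gray1, None, 0.5, 3, 21, 3, 5, 1.2, 0)

h, w = gray1.shape
grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
# Sample frame1 half-way along each pixel's motion to fake the in-between frame.
# No occlusion handling, no game motion vectors: exactly the hard parts.
mid = cv2.remap(frame1,
                grid_x + 0.5 * flow[..., 0],
                grid_y + 0.5 * flow[..., 1],
                cv2.INTER_LINEAR)
cv2.imwrite("interpolated_mid_frame.png", mid)
```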
 
  • Like
Reactions: JarredWaltonGPU
No, they are using AI to do a form of interframe blending. They render two frames, then use an algorithm to generate a frame of what they think might come next while they are rendering the third frame. At high frame rates they can do that approximation every other frame. In both situations, that fake frame followed by the real frame is what causes the input lag, along with artifacts if the algorithm guesses wrong.

The usual process of rendering a frame is called rasterization (ray tracing is the other way). It's where you make a 2D (raster) image representation of 3D (vector) space using math. Since we want more than bare polygons, we also use math to twist and rotate textures to determine what color each rasterized pixel should be. Since we want lighting and shadows, we then analyze point light sources and do more math to calculate the shading effect they have on all those nearby pixels. There are other things that happen after this, but it should be obvious that each frame takes a lot of math to process. The speed at which we can do that math is what we call frames per second (FPS).

FG doesn't do any of this; it just looks at the previous results and guesses what the next result might look like. And because it pushes that image into the frame buffer, it increments the FPS counter by one even though it never actually rasterized anything.

This is why it makes the FPS count go up but also introduces terrible input lag and janky artifacts. Nvidia is quite literally trying to display something that hasn't happened yet.
So it is more frames indeed; they just might be slightly off in quality if the algorithm can't make a good approximation. I have no problem with calling it more fps, because that's what it is. But it's not a true generational increase, and selling it as one is just stupid marketing.
 
  • Like
Reactions: bit_user
I use motion smoothing on my TV, and I can assure you it's worth more than tricking some counter!

Watching pre-rendered, non-interactive media is very different from playing an interactive game.

When everything has already been rendered, it's easy to insert pseudo-frames between the already-generated frames for a smoother experience. NTSC media is also 29.97 fps, which leaves plenty of room for interposed frames to reach the 60 fps that most displays can do.

So it is more frames indeed; they just might be slightly off in quality if the algorithm can't make a good approximation. I have no problem with calling it more fps, because that's what it is. But it's not a true generational increase, and selling it as one is just stupid marketing.

No, it's a pseudo-frame, or rather just an educated guess at what it might look like. There is no rasterization, no analysis of vertices, no mapping of textures or shading. It simply looks at two images it's already generated, then guesses what a middle frame might look like and inserts it in between, causing high input latency.

This has been gone over many times. I was giving Nvidia the benefit of the doubt that they were trying some sort of pre-rendering guesswork as a speedup. After reading deeper into the technical details, yeah, it's just faking out the FPS counter and giving them marketing material to justify the malicious segmentation. At least DLSS upscaling has a point: those on lower-end hardware with higher-resolution screens can have a better experience.
 
No, it's a pseudo-frame, or rather just an educated guess at what it might look like. There is no rasterization, no analysis of vertices, no mapping of textures or shading. It simply looks at two images it's already generated, then guesses what a middle frame might look like and inserts it in between, causing high input latency.
To be fair, the AI network generally does a very good job at determining what the middle frame should look like in probably something like 99.9% of cases. Where it fails is with things like camera changes (i.e. switching from camera 1 to camera 2, so that there's no actual in between state that would make sense), or with major shifts in view.

That second case is why FG tends to break down and have issues when you're not running at higher FPS. At, say, 100 fps base, even if you're moving the mouse around pretty quickly, the frame-to-frame changes between fully rendered frames won't be too large. If you're at 20 fps base, a rapid mouse movement could swing the viewport enough that there's very little overlap between the two frames.

Ultimately, in these cases, the result is the same: the "generated" frame can end up with a bunch of artifacts. And if you're running at 100 fps, it barely matters — one single "bad" frame shown for 10ms on your monitor isn't really perceptible. If you have a high speed capture of the feed and step through it frame by frame, you can find the errors, but in normal use it's not really a problem. But if you're getting a Frame Generation result of maybe 30–50 fps, where the errors could be visible for 20–33 ms, you'll start to see the problems more.
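Here's a quick back-of-the-envelope version of that, with an assumed 180 degrees per second camera swing standing in for "rapid mouse movement"; the fps figures are the ones from above.

```python
# How far the view swings between real frames, and how long one generated
# frame stays on screen. The 180 deg/s turn rate is an assumed value.
TURN_RATE_DEG_PER_S = 180.0

for fg_output_fps in (100, 40):
    base_fps = fg_output_fps / 2                    # FG roughly doubles the frame rate
    swing_deg = TURN_RATE_DEG_PER_S / base_fps      # view change between real frames
    visible_ms = 1000.0 / fg_output_fps             # how long a generated frame is shown
    print(f"{fg_output_fps} fps out: real frames {swing_deg:.1f} deg apart, "
          f"each generated frame on screen ~{visible_ms:.0f} ms")
# 100 fps out -> 3.6 deg apart, ~10 ms per generated frame
#  40 fps out -> 9.0 deg apart, ~25 ms per generated frame
```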

Is FG good or bad, then? That's the real question, and I'd put the answer somewhere in between. It's interesting, and can make games look smoother. But it never (in my experience) really makes them feel more responsive.
 
  • Like
Reactions: bit_user