News Jensen says DLSS 4 "predicts the future" to increase framerates without introducing latency

Confirm that Reflex 2 uses in-painting? It absolutely does.
Okay, that's confirmed here:


However, Reflex 2 updates the frame between the time it's rendered and when it's sent to the display, like Ed was saying about ASW. So the object motion it has to account for spans less than one frame interval. In that case, we're very likely talking about just a tiny number of edge pixels it has to make decisions about: far fewer than you'd need to fill in after an entire frame interval has passed.

More details to come soon, but if Nvidia can create a fast algorithm that works there, it could do it for other things as well.
You might still be right about what they're doing (and thank you for that), but I just wanted to point out that there's a meaningful difference in scale, here.

And if you’re running at a low FPS and make a big camera shift, it will break down and look bad I’m sure. But it’s intended for games probably running well over 100 FPS. That’s my take anyway.
I think this is going to be the key thing to understand about DLSS4 - just how low is the minimum fps that works for each kind of game. It might be a pretty drastic fall-off, like it's looking okay until it suddenly doesn't.
 
When did we just give up on, you know... actually rendering the frames that the game generates? Faking resolutions, making up frames that never existed, that doesn't sound like magical new tech. It makes it sound like you can't figure out how to make better hardware so you are trying to hide the fact with trickery.
It's like you're discovering for the first time that it is, and has always been, trickery all the way down. One of our old tricks isn't scaling well anymore and we are using new tricks. But if you think that traditional rasterization isn't itself a pyramid of trickery, you should educate yourself.

This has always been the way of things. Only your ignorance and false expectations are to blame for it being a surprise to you.
 
It's like you're discovering for the first time that it is, and has always been, trickery all the way down. One of our old tricks isn't scaling well anymore and we are using new tricks. But if you think that traditional rasterization isn't itself a pyramid of trickery, you should educate yourself.
Yeah, polygons, textures, lighting models, shadow maps... it's all just tricks and approximations. Path tracing is the least hacky, but gamers generally aren't wild about it, due to the resource requirements. That shows where their priorities lie.

The goal is pretty much always balancing visual fidelity against computational costs. Anything which advances that frontier should be viewed in a positive light, although we'd be wise to wait for thorough reviews and investigations.
 
I feel like hardware is starting to hit the physical limit of what is possible with silicon and transistors. The problem now is more with coding, art, game design and optimisation. The recent obsession with FPS is strange (big number must always be better); most modern games look quite bad even if they do run smoothly because they have too much visual clutter. Colour schemes are muddy and smeared, character design is dull, there are random particles everywhere. There are not enough proper artists or they are too rushed. A 1950s Disney film looks better than a 2025 game. They aim for 'photo realism' but that is not possible without experiencing severely diminishing returns.


I can't tell the difference between 80 and 150 FPS; it looks the same to me. Perhaps there is a very slight difference in FPS games. People look like weird rubber mannequins even with excellent GPUs, and shadows are usually all wrong. I have played a few games that looked good and were obviously made with skilled artists, like Sable or Cryptmaster, but most look almost the same as they did in 2015. We have not yet realised the full potential of the NES or C64; it will be decades before we can even use a GTX 1650 properly.
 
In fact, all frame interpolation (frame "making up") technologies like DLSS are desperate crutches, an attempt to hide an unfortunate fact: modern Nvidia video cards, with a practical monopoly on the 3D accelerator market and no real competitors, have long since fallen into a technological and architectural failure and cannot actually deliver a minimum of 60 fps (without drops) at ultra graphics quality (as drawn and conceived in the game studios), even on mainstream models like the 4070 at a mere 2.5K resolution, never mind the 4K that has long been common in practice.

There are plenty of 4K monitors with refresh rates above 120 Hz, but there are practically no available GPUs capable of delivering 60 fps at ultra quality, without DLSS, in games from 2022-2024. What we see with the 5xxx series is another attempt at cheating through a sharp increase in power consumption, which has already become insane (575 W in the case of the 5090). And even then, the performance increase is insufficient to provide 60 fps in all releases at 4K ultra quality, never mind 120 Hz and above.

Now they have come up with newfangled "AI", which is essentially still a frame interpolator, inventing additional details that the GPU simply does not have time to draw. It is virtual reality built on top of virtual reality, which is a pun in itself.

In fact, GPU makers cannot keep up with the progress in the resolution of modern monitor, laptop, and TV/projector panels, so they come up with artificial crutches like these to hide their technological failure. And with the arrival of 8K monitors, that failure will turn into an EPIC FAIL. That is why they deliberately delayed putting DisplayPort 2.0 (with UHBR20 mode support) into their chips, and therefore into monitors, for five years: they could not even keep up with 4K resolution, and 8K is a disaster for them. Yet 8K gives ideal fonts (as if from a laser printer) and an almost analog, real-world picture on screens up to 36" in 2D, and has long been needed by everyone, including offices: real "retina" quality.

All this is very sad, but the problem is that the growth curve of performance per watt is gradually slowing, having long since moved from exponential growth to an almost flat line over the last decade. So the cheating everywhere takes the form of ever-increasing power draw, which has now been forced down in laptops by official "green" policy, since laptops outsell desktop PCs several times over.

Humanity urgently needs new chip manufacturing technologies that provide a jump of several orders of magnitude in performance per watt, so that we can move to genuinely useful neural networks in everyday life at the local level (what is now passed off as "AI") and so that there is at least some real progress in visualization tools.

Silicon is nearly exhausted, and no new technologies exist today (at industrial scale) that can sustain 35%+ year-on-year growth in performance per watt, as we had in the '80s, '90s, and early 2000s. So the world should expect technological stagnation, and the euphoria over "AI" will soon pass, as soon as people run into the dead end of current neural networks, above all the extreme energy inefficiency of applying them at scale.
 
Ok, let me give you more context:

Maybe I was a bit rude to broadly classify all UE5 devs as lazy.

But it's true that UE5 provides all the tech features needed, and that's why game companies are dropping their own engines.

It is true that you have a lot of stuff to work on and fine-tune. Now imagine doing it in an inferior/less sophisticated game engine but still making the game look breathtaking. Now you understand the context a bit better, I guess?

And have you seen the majority of the latest games? Do you want some examples?

Batman Arkham Shadows
Star Wars Outlaws
CP 2077 when it first launched
Even Elden Ring - good game but not groundbreaking graphics.
Skull & Bones

Except for the polished version of CP 2077, Wukong, and a few others, you really can't say game graphics peaked in 2024. But then again, those require beefy GPUs to run at max settings.

See RD2, AC Black Flag, BF1, COD AF, NFS Rivals and you can see how great the graphics were and, more importantly, how well they ran, on mid-tier GPUs as well. Even Witcher 3 with its next-gen texture pack is gorgeous by today's standards.

Do you still feel gamers shouldn't/can't complain about unoptimised games? I feel that I am justified in voicing this opinion.
Since you're confessing, I'll do the same: I don't play that many games, so I'm not the best to judge.

I bought most of my home-lab hardware for my job as a technical architect and infrastructure designer. It just so happens that a lot of the stuff I got for testing infrastructure that our research scientists would then use to work on, also happens to be useful for gaming. I tried to be clever about that 🙂

And it's mostly my kids who benefit from the leftovers once I've retired them after testing. Those tests included plenty of games, which I still have little time or patience to actually play. It's my kids and their friends who provide me with the most valuable feedback on performance and quality, including artistic or even political angles. That's a good return on what I give them, too.

I'm far more interested in the meta-game or the meta-layers, to see where the technology and the politics of the industry are heading; the games themselves are mere data points.

I've also been in the industry since 1980, and have happily dived into its ancient past back to WWII, so it would be safe to say that I take the long view and see what others regard as paradigm changes simply as bumpier steps in a longer evolution.

As to complaining: I've always felt that while you should listen to complaints, any time spent on formulating your own is better spent on trying to find a cause or even a solution.

That doesn't always work out even for me.

There is a good recent episode of Moore's Law is Dead (Broken Silicon 291) with a game developer commenting on the impact of the various AI improvement technologies on their work, and on the ever-wider spread in the technical capabilities of the gaming hardware they want to support.

It shows that they certainly aren't lazy, but bravely fight battles in which they are a key ingredient with very little power to direct things.

Yet clearly without UE they'd have no product and without the console/PC vendors they wouldn't even have customers, who they clearly need to make happy at scale.

So they'll follow paradigm changes, but mostly because there is no real choice. Doing your own engine ultimately leads to doing your own hardware and restarting with an alternative to the design John von Neumann described in 1945 as theoretically optimal. Tiny steps are the only thing even the biggest giants can take these days. And DLSS will have a lot of offshoots, most of which won't survive.

The end result may be hard to recognize as an evolutionary product, but true revolutions are hard to pay for. So going straight from bumped and shaded triangles to AI paintings isn't very likely to happen. Getting good-enough results from AIs trained on shaded triangles, at a much lower computational cost than rendering those triangles, may be the breakthrough that allows gaming all day on a solar-powered laptop, a pair of augmented glasses, or an augmented set of contacts.

I respect Nvidia for trying; I'll buy the better product, and only as long as I get a benefit, directly or indirectly.
 
I think Blackwell still has the OFA, but it's fixed function just like on Ada. The new DLSS framegen models deliver higher perf and better quality, according to one of the videos/images from Jensen's keynote IIRC. Will it support Ampere and Turing? I would be shocked if they allow that. I think there are performance requirements and some other stuff that makes multi frame generation require Blackwell... but also I'm pretty sure that's again just locking a feature that could work on older architectures to a new architecture.

As for framegen / MFG, I was right in what I initially said. It's still using interpolation. So if anyone wants to argue that it's not... well, for now that's wrong. I definitely think extrapolation or projection or whatever you want to call it is being researched and will happen at some point. Because think about this:

You render frame 1 initially. Now, frame 2 is a special case, maybe you just skip that one frame, but after rendering two frames you now have at least some semblance of a pattern with motion vectors and such. So, take frames 1 and 2, and project where that's going and use AI to create frame 3. If rendered frame 4 continues the trend, all is well and things should look fine, and you haven't added latency.

But what if frame 4 has a major change in camera position or viewport compared to frame 2? Well, the jump from projected frame 3 to frame 4 would be just as big and noticeable as the jump from frame 2 to frame 4 would have been anyway. That resets the pattern, but it shouldn't really look any worse than the current interpolation approach does between two wildly divergent frames.

Basically, project every other frame based on the past trend (faking a trend if necessary) and then use a fast in-painting algorithm to make up the difference. Intel has said it's researching this as well. Like I said, I think this is very much a matter of "when" not "if."
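In rough pseudocode, that project-then-inpaint loop might look something like this (purely illustrative; the helper names and the naive hole-filling are my own stand-ins, not anything Nvidia has detailed):

Code:
import numpy as np

def estimate_motion(prev_frame, curr_frame):
    """Stand-in for motion estimation (optical flow and/or game-provided
    motion vectors). Returns a per-pixel (dy, dx) displacement field."""
    return np.zeros(prev_frame.shape[:2] + (2,), dtype=np.float32)  # placeholder

def project_frame(curr_frame, motion):
    """Push each pixel forward along its motion vector to guess the next
    frame. Pixels nothing lands on are left as NaN 'holes' for in-painting."""
    h, w = curr_frame.shape[:2]
    out = np.full(curr_frame.shape, np.nan, dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]
    ty = np.clip((ys + motion[..., 0]).round().astype(int), 0, h - 1)
    tx = np.clip((xs + motion[..., 1]).round().astype(int), 0, w - 1)
    out[ty, tx] = curr_frame
    return out

def inpaint(frame):
    """Stand-in for a fast in-painting pass that fills disoccluded holes."""
    return np.nan_to_num(frame, nan=0.0)  # trivial placeholder fill

def framegen_loop(rendered_frames):
    """Interleave one projected frame after every rendered frame, once two
    rendered frames exist to establish a motion trend."""
    shown = []
    for i, frame in enumerate(rendered_frames):
        shown.append(frame)
        if i >= 1:
            motion = estimate_motion(rendered_frames[i - 1], frame)
            shown.append(inpaint(project_frame(frame, motion)))
    return shown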

And of course, multi frame projection would be much harder to pull off than a single frame projection. I don't think projecting more than one or maybe two frames is viable. Interpolating three frames, though... that's reasonably easy if the algorithm and hardware are fast enough.
My main takeaway from your description is that the cost of misprediction drops drastically as the frame rate rises.

When you're at 15 FPS computationally and try to push that to 60, false predictions might result in total visual chaos.

When you're at 60 frames computationally and aim for 240 frames visually, false predictions might at worst result in flickers like the ones we got from antialiasing flip-flopping at the edges.

That's mostly because, unless you happen to be sitting in the middle of an explosion, outside action (from the viewer's perspective) is limited by what humans can actually perceive, and thus can be largely predicted accurately (unless it's a vast multi-player title), because it is generated. User actions (from the player or another human) are more likely to result in mispredictions, but those won't be as drastic in the moments when visual accuracy is critical (e.g. aiming instead of fleeing).

So there is a nice correlation between your ability to enjoy the nicest graphics and their ability to produce them, while wrong pixels are less obvious in a panic.

Generally, I'd say we'll see distinct branches of evolution in how certain types of games and certain visual approaches benefit more or less from frame generation. E.g. flight simulation, even in VR, should be super easy, because everything can be pre-computed and you can only turn your head so fast without losing the ability to focus your eyes (makes me wonder why it's still so bad).

I wonder if, for additional quality, game engines should start to actually communicate predictions via gaming APIs, so that GPUs can be told the current and the expected next states of the world, e.g. via motion vectors with a certainty factor for the elements they are supposed to render...

That would become really messy so perhaps not...

Large scale intergalactic battles already attempt the near impossible of duplicating vast amounts of space in real-time without quantum communications, so there is no easy solution there until they close that little gap.
 
I feel like hardware is starting to hit the physical limit of what is possible with silicon and transistors. The problem now is more with coding, art, game design and optimisation. The recent obsession with FPS is strange (big number must always be better); most modern games look quite bad even if they do run smoothly because they have too much visual clutter. Colour schemes are muddy and smeared, character design is dull, there are random particles everywhere. There are not enough proper artists or they are too rushed. A 1950s Disney film looks better than a 2025 game. They aim for 'photo realism' but that is not possible without experiencing severely diminishing returns.


I can't tell the difference between 80 and 150 FPS; it looks the same to me. Perhaps there is a very slight difference in FPS games. People look like weird rubber mannequins even with excellent GPUs, and shadows are usually all wrong. I have played a few games that looked good and were obviously made with skilled artists, like Sable or Cryptmaster, but most look almost the same as they did in 2015. We have not yet realised the full potential of the NES or C64; it will be decades before we can even use a GTX 1650 properly.
The computer industry has always skirted what is physically possible, but thanks to evolving technology that hasn't been a static target, even if it didn't progress in a straight line.

How higher frame rates can actually result in reduced "realness" was a nasty surprise for film producers when they tried 60 Hz instead of 24 Hz recording equipment, as I seem to remember.

Likewise, quite a few games that looked very fluid and believable on fixed-refresh CRTs (perhaps even interlaced) look quite wooden and disappointing on high-refresh LCDs. The uncanny valley never seems to be bridged; it just keeps shifting.

And once you're really immersed in a game, realism is mostly in your head.
 
Humanity urgently needs new chip manufacturing technologies that provide a jump of several orders of magnitude in performance per watt, so that we can move to genuinely useful neural networks in everyday life at the local level (what is now passed off as "AI") and so that there is at least some real progress in visualization tools.
There are perhaps a few things humanity needs more urgently than that. I doubt the planet's salvation can be achieved by better compute.
 
And I have no doubt, because ultimately we are all perfect biological machines, but very, very slow ones once the tasks become complex enough, and the advantage of 100 heads over one becomes insignificant, per the same Amdahl's law limitation...

Humanity will most likely never reach the stars (if we mean resettling the excess population to other star systems) without real AI. But real AI will be the end of humanity. That much is certain.

In any case, we are still centuries of effort away, at best, from all that and from a real "AI" orders of magnitude superior to the human brain. For now, the point is that silicon technology has naturally (as predicted) reached a dead end, both physical and economic. And there is still no alternative...
 
Optical flow is used to ensure scene objects shift correctly based on their depth and prior motion, and this works identically whether that motion is from head motion (camera shift) or from object motion (scene shift).
It doesn't work identically, because head motion can be measured, which is a superior solution to the extrapolation you'd do based on optical flow.

It's been done in the HMD since 2019 (release of the Quest 1).
That's not simply an HMD. That's an entire, self-contained system.

Nobody has done so for the last half a decade, at least not without being laughed out of the room.
But it clearly shows how little importance the positional data has. What's going to make you barf isn't lagging parallax.

I really do recommend actually following the earlier link, it explains the functioning of ASW and why it is implemented that way. This is well-known old tech at this point, so it is good to see it implemented outside of VR (with or without NN assistance with inpainting).
Send me a link that works and I'll look at it.

This description of ASW includes the artifacts I mentioned above:

  1. Rapid brightness changes. Lightning, swinging lights, fades, strobe lights, and other rapid brightness changes are hard to track for ASW. These portions of the scene may waver as ASW attempts to find recognizable blocks. Some kinds of animating translucency can cause a similar effect.
  2. Object disocclusion trails. Disocclusion is a graphics term for an object moving out of the way of another. As an object moves, ASW needs something to fill the space the object leaves behind. ASW doesn't know what's going to be there, so the world behind will be stretched to fill the void. Since things don't typically move very far on the display between 45 fps frames, these trails are generally minimal and hard to spot. As an example, if you look closely at the extrapolated image from the screenshots here you'll see a tiny bit of warping to the right of the revolver.
  3. Repeated patterns with rapid movement of them. An example might be running alongside an iron gate and looking at it. Since parts of the object look similar to others, it may be hard to tell which one moved where. With ASW, these mispredictions should be minimal but occasionally noticeable.

Source: https://developers.meta.com/horizon/blog/asynchronous-spacewarp/

Now, the way DLSS3 gets around # 3 is that it has analytical motion vectors. So, it doesn't get tripped up by things that confuse optical flow. The neural network does need to be well-trained to know when to use each one.

@jarredwalton, #2 is what I was trying to describe. If Nvidia does predictive frame gen, they must use some kind of technique to combat it, or else I'd claim you'd see it.
 
That is why they deliberately delayed introducing DisplayPort 2.0 (with UHBR20 mode support) into their chips for 5 years
How do you know that? Maybe it had to do with licensing costs, lack of demand, etc.

cheating is happening everywhere with increasing power, which has now been forced to be reduced in laptops due to the official "green policy", since laptops are sold several times more than desktop PCs.
Huh? Last I checked, there's no shortage of fire-breathing gaming laptops and even performance-tier professional laptops that will burn through their battery in no time, if you put them on the highest power plan.

Humanity urgently needs new chip manufacturing technologies that provide a jump of several orders of magnitude in performance per watt
Just because we got addicted to the pace of Moore's Law doesn't mean it has to continue. I think every last one of us wishes it would. That said, I wouldn't underestimate the industry's capacity to forge ahead. The pace of improvement has slowed, and costs & complexity are rising, but we're not at the end of the road yet.
 
When you're at 15 FPS computationally and try to push that to 60, false predictions might result in total visual chaos.

When you're at 60 frames computationally and aim for 240 frames visually, false predictions might at worst result in flickers like the ones we got from antialiasing flip-flopping at the edges.
Yes. The differences between frames are roughly inversely proportional to the frame rate. For instance, in video compression, you can usually achieve comparable quality at twice the framerate without doubling the bitrate.

I wonder if, for additional quality, game engines should start to actually communicate predictions via gaming APIs, so that GPUs can be told the current and the expected next states of the world, e.g. via motion vectors with a certainty factor for the elements they are supposed to render...

That would become really messy so perhaps not...
Well, I mentioned above that we could have a fusion of prediction and native computation, where framegen decides which pixels it's having trouble predicting and tells the game engine to regenerate them. At worst, this might require re-rasterization, but you might just use the predicted depth, normal, and texture coordinates to re-shade them. Not sure if the game engine could reuse prior shadow maps or what to do about some of the lighting effects...

As for the engine providing better predictive hinting, it could provide per-pixel acceleration vectors, not just velocity. Changes in viewer pose could be explicitly sent for each predicted frame, as you'd do for VR.
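As a toy example of what a per-pixel acceleration hint could buy the predictor (entirely hypothetical data layout, just to show the math):

Code:
import numpy as np

def extrapolate_motion(velocity, acceleration, dt=1.0):
    """Predict per-pixel displacement for the next (generated) frame.

    velocity:     (H, W, 2) in pixels/frame, as engines already emit for TAA
    acceleration: (H, W, 2) in pixels/frame^2, the extra hint suggested above
    Constant-acceleration step: d = v*dt + 0.5*a*dt^2
    """
    return velocity * dt + 0.5 * acceleration * dt * dt

# With velocity alone the predictor can only assume straight-line motion;
# an acceleration term lets it bend the trajectory (e.g. a projectile
# arcing under gravity) instead of overshooting.
v = np.zeros((4, 4, 2), dtype=np.float32); v[..., 1] = 2.0   # 2 px/frame right
a = np.zeros((4, 4, 2), dtype=np.float32); a[..., 1] = -0.5  # decelerating
print(extrapolate_motion(v, a)[0, 0])  # [0.   1.75] instead of a naive [0. 2.]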
 
It doesn't work identically, because head motion can be measured, which is a superior solution to the extrapolation you'd do based on optical flow.
As I already explained, the technique works with zero head motion. Stick the HMD on a pole and set objects within the scene moving and disocclusion inpainting will occur exactly the same as with the HMD in motion.
HMD motion is ADDED to the motion vector field created via optical flow (global value applied, modulated by the depth buffer). If there is no motion, nothing is added and the MVec field is used as-is.
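Something like this, in very rough form (illustrative only; real ASW runs on the GPU and handles rotation as well, but it shows the "global value modulated by depth" idea):

Code:
import numpy as np

def combined_mvec(flow_mvec, head_delta_px, depth):
    """Take the per-pixel motion-vector field from optical flow and ADD a
    global head-motion term, modulated per pixel by the depth buffer so
    that near objects shift more than far ones (all units in pixels)."""
    parallax = np.asarray(head_delta_px, dtype=np.float32) / depth[..., None]
    return flow_mvec + parallax

# If the head hasn't moved, the added term is zero and the optical-flow
# field passes through untouched -- the "used as-is" case described above.
flow = np.random.rand(4, 4, 2).astype(np.float32)
depth = np.full((4, 4), 5.0, dtype=np.float32)
assert np.allclose(combined_mvec(flow, (0.0, 0.0), depth), flow)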
But it clearly shows how little importance the positional data has. What's going to make you barf isn't lagging parallax.
Nobody makes orientation-only HMDs BECAUSE head motion is mandatory to maintain orthostereo and not have a barf-inducing experience. It's not an optional extra or a nice bonus, it's a baseline requirement.

Claiming position tracking is of 'little importance' would be like claiming a mouse is of 'little importance' to a gaming machine because the keyboard exists.
Send me a link that works and I'll look at it.
It remains https://developers.meta.com/horizon/blog/asynchronous-spacewarp/ . If you can't access it, the problem is on your end.
Now, the way DLSS3 gets around # 3 is that it has analytical motion vectors. So, it doesn't get tripped up by things that confuse optical flow. The neural network does need to be well-trained to know when to use each one.
The analysis performed to create those 'analytical motion vectors' is... Optical Flow. That's literally how you take a series of images with motion in them and generate a motion vector field.
 
As I already explained, the technique works with zero head motion. Stick the HMD on a pole and set objects within the scene moving and disocclusion inpainting will occur exactly the same as with the HMD in motion.
If you do in-painting, that part is the same.

HMD motion is ADDED to the motion vector field created via optical flow (global value applied, modulated by the depth buffer). If there is no motion, nothing is added and the MVec field is used as-is.
So you agree that head motion is not derived from optical flow. There's a step you left out, which is that you also need to subtract prior head motion from optical flow, so that you're just seeing the motion of the objects, themselves.

It remains https://developers.meta.com/horizon/blog/asynchronous-spacewarp/ . If you can't access it, the problem is on your end.
Now, I think you're confused. That's the link I posted.

Anyway, I think the website is mucking with the links, because clicking it takes me here: https://www.meta.com/quest/?cjevent...&utm_campaign=creatoraffiliate&utm_parent=frl

The analysis performed to create those 'analytical motion vectors' is... Optical Flow.
No, it's not. I already explained this part. Optical flow is coming along after the fact. Motion vectors produced for TAA are generated during the rendering process and based on actual texture coordinates - not just trying to visually correlate parts of consecutive frames, which is how optical flow works and why it has such problems with stuff like repeating patterns.

Nvidia already explained this, if you'd bother to read how DLSS3 works. They said they had to combine the analytical motion vectors with the optical flow vectors, because each is better in different circumstances.
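For anyone following along, the contrast is easy to see in toy form: optical flow has to guess motion by matching image patches between frames, which is why repeating patterns can fool it, whereas the TAA-style vectors quoted further down fall straight out of the vertex transforms. A crude block-matching flow estimator (illustrative only, nothing like the hardware implementation) looks like this:

Code:
import numpy as np

def block_match_flow(prev, curr, block=8, search=4):
    """Toy optical flow via exhaustive block matching: for each block of the
    current frame, find the best-matching block in the previous frame and
    report the offset. It is a purely image-based guess, which is exactly
    why several offsets can look equally good on a repeating pattern."""
    h, w = curr.shape
    flow = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            target = curr[y0:y0 + block, x0:x0 + block].astype(np.float32)
            best_err, best_off = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y1, x1 = y0 + dy, x0 + dx
                    if 0 <= y1 <= h - block and 0 <= x1 <= w - block:
                        cand = prev[y1:y1 + block, x1:x1 + block].astype(np.float32)
                        err = float(np.sum((target - cand) ** 2))
                        if err < best_err:
                            best_err, best_off = err, (dy, dx)
            flow[by, bx] = best_off
    return flow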
 
we'd all rather have the VRAM we're paying for instead of supporting an irresponsible, uncaring, and self-serving upper class
 
I feel the immediate future of PC gaming graphical advancement is going to shift mostly to software-based improvements over hardware, as the physical and power limits of existing hardware are being reached, and because, for purposes such as gaming, cost does become a factor for the majority of gamers.

I feel fortunate that, at the stage of life I'm at, I prefer single-player RPG or RTS types of games where the pace is much slower, so the newer lag-inducing technology, if needed, would not affect my gaming to the degree it would someone playing online competitive FPS titles.

But I will add that, even though there is a small percentage of gamers who are genuinely bothered by things like DLSS and frame generation regardless of how the software advances, in my opinion the majority of gamers, as long as the image on their screen looks really good and plays at a buttery-smooth FPS, couldn't care less whether that image was created by hardware alone, by a combination of hardware and software solutions, or by software solutions alone.

What will make them care even less is if they can buy a $400 graphics card that meets their expectations instead of needing a $2,000 graphics card, leaving more money in their pockets.

What threatens PC gaming more than anything is a gamer needing to spend $3,000+ just to begin building an upper-tier gaming PC.
 
So you agree that head motion is not derived from optical flow.
... Why on earth would head motion be derived from optical flow? I don't know what the heck you're doing, but it sure isn't anything anyone's actually implemented.
There's a step you left out, which is that you also need to subtract prior head motion from optical flow, so that you're just seeing the motion of the objects, themselves.
Absolutely not. Trying to do so will just make a mess of the whole thing, and there is no point in doing so in the first place. The beauty of the optical-flow-based technique is that it is entirely image-based: all motion in the prior frame is... in the prior frame. There is no need to 'remove' any motion, regardless of whether the head is moving or not. There is absolutely no reason to make your work harder for yourself by wasting time removing motion you'd need to add back in again to perform the correct image shifting.

The technique requires taking into account motion from the camera and motion from objects in the scene. You can't pick and choose one or the other, you need both to perform the correct shifting.
Anyway, I think the website is mucking with the links, because clicking it takes me here: https://www.meta.com/quest/?cjevent...&utm_campaign=creatoraffiliate&utm_parent=frl
Then the problem is on your end.
No, it's not. I already explained this part. Optical flow is coming along after the fact.
It's not after the fact, Optical Flow is the fact. It's literally how the MVec field is generated for ASW.
Nvidia already explained this, if you'd bother to read how DLSS3 works. They said they had to combine the analytical motion vectors with the optical flow vectors, because each is better in different circumstances.
We're discussing DLSS4's frame generation, not DLSS 3's frame interpolation. Since this is the comment section on an article on DLSS 4 frame generation.
 
The technique requires taking into account motion from the camera and motion from objects in the scene. You can't pick and choose one or the other, you need both to perform the correct shifting.
You need to know how much of the motion from frame (t-2) -> (t-1) was due to head motion vs. object motion in the world, in order to adjust the object motion vectors based on the change in head position from (t-1) to (t). However you want to formulate it, that's the essence of what needs to be done.
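In other words, something like this bit of bookkeeping (a sketch of what I mean, with made-up helper names and camera motion reduced to a simple screen-space translation):

Code:
import numpy as np

def camera_flow(cam_delta_px, depth):
    """Screen-space shift caused purely by camera/head translation, scaled
    by inverse depth (near objects sweep across the screen faster)."""
    return np.asarray(cam_delta_px, dtype=np.float32) / depth[..., None]

def predicted_mvec(observed_flow, depth, cam_delta_prev, cam_delta_next):
    """observed_flow: motion measured between frames (t-2) and (t-1)
    cam_delta_prev:  camera motion that happened over that same interval
    cam_delta_next:  camera motion measured since frame (t-1)

    Split the observed motion into the part the camera caused and the part
    the objects caused, keep extrapolating the object part, and re-add the
    camera motion that actually applies to the frame being generated."""
    object_flow = observed_flow - camera_flow(cam_delta_prev, depth)
    return object_flow + camera_flow(cam_delta_next, depth)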

It's not after the fact, Optical Flow is the fact. It's literally how the MVec field is generated for ASW.
It's one way to do it. If all you have is a fully-rendered image, then it's the best you can do. If you can be more intrusive in the rendering pipeline, then you can generate analytical vectors the same way it's traditionally done for TAA.

Since you've been stubbornly refusing to look at how TAA is actually implemented, here's a nice description:


There's no optical flow, anywhere! They explain how the motion vectors are computed:

Calculating velocity – vertex shader

During the geometry pass get the difference in position from the current frame and the previous frame. First create 2 extra vertex shader outputs and render the current and previous scene to these outputs by using the view and translation matrices of the current and previous scene.

Code:
//these 2 should be in screen space
outBlock.newPos = projection * view * translation * position;
outBlock.prePos = projection * previousView * previousTranslation * position;

Be sure to save the view and translation matrices from the previous frame to use here. Also note that when calculating velocity, you do not want to use the jittered positions which can cause the velocity to be inaccurate.

Calculating velocity – fragment shader

First move the outputs from the vertex shader from screen space (-1 to 1) to UV space (0 to 1).

Code:
vec2 newPos = ((inBlock.newPos.xy / inBlock.newPos.w) * 0.5 + 0.5);
vec2 prePos = ((inBlock.prePos.xy / inBlock.prePos.w) * 0.5 + 0.5);

Then calculate velocity as (newPos – prePos) and output that to the velocity render texture

NOTE: velocity values can be incredibly small which would require the use of a high precision texture (which can take a lot of memory). A way to get around that is to first multiply the velocity by a large number and in the next pass when reading from the velocity texture, divide the value by that number to restore the original values. This is done for those values to be saved to a lower precision texture (0.01 -> 1(lower precision))

We're discussing DLSS4's frame generation, not DLSS 3's frame interpolation. Since this is the comment section on an article on DLSS 4 frame generation.
Which is obviously going to be based on their work in DLSS 3. Until we have a more detailed explanation, we can't say for sure how it will work, but it's a fair assumption that it'll have a lot in common with DLSS 3.
 
You need to know how much of the motion from frame (t-2) -> (t-1) was due to head motion vs. object motion in the world, in order to adjust the object motion vectors based on the change in head position from (t-1) to (t). However you want to formulate it, that's the essence of what needs to be done.
Completely unnecessary. The past inter-frame motion is the inter-frame motion, the origin of that motion is irrelevant. Whether an area of the image shifts due to camera motion or in-scene motion makes no difference: the image needs to shift either way, and disoccluded areas require inpainting either way.
You may be confused because the process is not interpolative. There is no 'existing head motion' for the generated frame to 'remove', because any head motion you want to add to the frame had not yet occurred when that frame was rendered. The entire goal is to forward-project all motion from the previous frame (regardless of origin) and then add on top of that any additional motion measured from the IMU since then (and that could be 'none at all').
TAA stuff
DLSS 4 (and ASW) are not using TAA. The technique you described requires you to have rendered the subsequent frame in order to do any work with it, i.e. it is an interpolative scheme. ASW and DLSS4 are not interpolative, so that technique is fundamentally incompatible.
Which is obviously going to be based on their work in DLSS 3. Until we have a more detailed explanation, we can't say for sure how it will work, but it's a fair assumption that it'll have a lot common with DLSS 3.
DLSS3 is an interpolative scheme. It is fundamentally different in operation than a forward frame synthesis technique.


Remember, the goal for DLSS4 (like ASW) is to generate frame N using frame N-1 and frame N-2. That's a forward frame synthesis scheme, you do not have to render frame N+1 before you can generate frame N.
DLSS3 on the other hand is an interpolative scheme, generating frame N from N-2 and N+1. You need to have rendered frame N+1 before you can generate frame N.
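The dependency difference is easy to lay out explicitly (a toy illustration of the numbering in this post, not either scheme's actual scheduler):

Code:
def rendered_inputs(n, scheme):
    """Which rendered frames generated frame N depends on, following the
    numbering above. The practical upshot: interpolation has to wait for a
    frame that comes after the one being generated (added latency), while
    forward synthesis only ever reads frames that already exist."""
    if scheme == "interpolation":        # DLSS 3-style
        return [n - 2, n + 1]
    if scheme == "forward":              # ASW-style forward synthesis
        return [n - 2, n - 1]
    raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("interpolation", "forward"):
    print(scheme, "-> generated frame 10 needs rendered frames", rendered_inputs(10, scheme))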
 
Completely unnecessary. The past inter-frame motion is the inter-frame motion, the origin of that motion is irrelevant.
Agree to disagree.

You may be confused because the process is not interpolative. There is no 'existing head motion' for the generated frame to 'remove', because any head motion you want to add to the frame had not yet occurred when that frame was rendered.
Not confused. I get the difference between interpolation and extrapolation.

The entire goal is to forward-project all motion from the previous frame (regardless of origin) and then add on top of that any additional motion measured from the IMU since then (and that could be 'none at all').
So, you're going to extrapolate motion due to prior changes in viewer pose, regardless of whether it matches the viewer's current pose? That sounds wrong.

DLSS 4 (and ASW) are not using TAA.
How are you so sure about DLSS 4? Did you find a detailed description of it?

The technique you described requires you to have rendered the subsequent frame in order to do any work with it,
At minimum, TAA requires a pair of frames that are consecutive in time. Same as optical flow. Motion vectors from prior frames can be extrapolated to predict future motion. They can also be used for interpolation, of course.
 
Remember, the goal for DLSS4 (like ASW) is to generate frame N using frame N-1 and frame N-2. That's a forward frame synthesis scheme, you do not have to render frame N+1 before you can generate frame N.
DLSS3 on the other hand is an interpolative scheme, generating frame N from N-2 and N+1. You need to have rendered frame N+1 before you can generate frame N.
Just to clarify, again (as I'm partly the source of the "misinformation"), DLSS 4 Multi Frame Generation is using interpolation between two rendered frames, just like DLSS 3 Frame Generation. A future algorithm that tries to project and predict things is almost certainly being researched, but that algorithm is not DLSS 4.

Reflex 2, on the other hand, uses something similar to ASW plus in-painting. But in that case, it's sampling camera position updates right before final rendering, warping based on that, and then in-painting. So it's not projection per se. (It also seems to be predicting the camera position before rendering.)
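A bare-bones sketch of that kind of late camera warp (my own simplification: pure screen-space translation, no rotation or depth, and only a trivial hole mask where a real implementation would in-paint):

Code:
import numpy as np

def late_warp(frame, render_cam_px, latest_cam_px):
    """Shift an already-rendered frame to account for camera input that
    arrived after rendering started, then flag the newly exposed edge
    pixels so an in-painting pass can fill them."""
    delta = np.asarray(latest_cam_px) - np.asarray(render_cam_px)
    dy, dx = int(delta[0]), int(delta[1])
    warped = np.roll(frame, shift=(dy, dx), axis=(0, 1)).astype(np.float32)
    hole = np.zeros(frame.shape[:2], dtype=bool)
    if dy > 0: hole[:dy, :] = True
    elif dy < 0: hole[dy:, :] = True
    if dx > 0: hole[:, :dx] = True
    elif dx < 0: hole[:, dx:] = True
    warped[hole] = np.nan  # "holes" left for the in-painting step to fill
    return warped, hole

# e.g. the mouse moved the camera by 3 px between render start and scan-out
frame = np.random.rand(8, 8).astype(np.float32)
warped, hole = late_warp(frame, render_cam_px=(0, 0), latest_cam_px=(0, 3))
print(hole.sum(), "edge pixels would need in-painting")  # 24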
 
Just to clarify, again (as I'm partly the source of the "misinformation"), DLSS 4 Multi Frame Generation is using interpolation between two rendered frames, just like DLSS 3 Frame Generation. A future algorithm that tries to project and predict things is almost certainly being researched, but that algorithm is not DLSS 4.

Reflex 2, on the other hand, uses something similar to ASW plus in-painting. But in that case, it's sampling camera position updates right before final rendering, warping based on that, and then in-painting. So it's not projection per se. (It also seems to be predicting the camera position before rendering.)
Well damn, that makes DLSS4 a lot more boring.
 
The computer industry has always skirted what is physically possible, but thanks to evolving technology that hasn't been a static target, even if it didn't progress in a straight line.

How higher frame rates can actually result in reduced "realness" was a nasty surprise for film producers when they tried 60 Hz instead of 24 Hz recording equipment, as I seem to remember.

Likewise, quite a few games that looked very fluid and believable on fixed-refresh CRTs (perhaps even interlaced) look quite wooden and disappointing on high-refresh LCDs. The uncanny valley never seems to be bridged; it just keeps shifting.

And once you're really immersed in a game, realism is mostly in your head.
In fairness, that's just because your eye recognizes 24 fps and flags media at other frame rates as feeling "wrong"/"off". Younger viewers don't have that same reaction; 60 fps doesn't inherently "feel" less real, the GG and the Boomers just weren't used to it and drew the wrong conclusions.