Without getting into any specifics (yet), let me just say... that this whole article is wrong. Basically, I got the wrong answer and we wrote it up. Whether or not I can update/fix/correct it in the near term is unknown. Stay tuned.... <sigh>
Except for the polished version of CP 2077, Wukong, and a few others, you really can't say game graphics peaked in 2024. But then again, those require beefy GPUs to run at max settings.
> Without getting into any specifics (yet), let me just say... that this whole article is wrong. Basically, I got the wrong answer and we wrote it up. Whether or not I can update/fix/correct it in the near term is unknown. Stay tuned.... <sigh>

Thanks for letting us know. I'm eager to see your update!
> I think it's purely image-based. So, it's only going to extrapolate based on what was visible in the previous frame.

No. DLSS, much like TAA, uses motion vectors, which you provide to the algorithm. That is how they can do the extrapolation of the frame, using a bit of motion data for the generated new frame. If they're now doing 2 or 3 deep extrapolation*, it means they're pretty darn confident in their algorithm.
> No. DLSS, much like TAA, uses motion vectors, which you provide to the algorithm. That is how they can do the extrapolation of the frame, using a bit of motion data for the generated new frame.

The motion vectors are intra-frame. They're 2D. They can only refer back to what was previously visible.
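To make the motion-vector idea concrete, here's a minimal sketch of how per-pixel 2D motion vectors could be used to extrapolate a frame. This is my own illustration under a constant-velocity assumption, not Nvidia's actual implementation, and the function and variable names are made up for the example:

```python
# Minimal sketch: extrapolate a new frame by pushing each pixel of the last rendered
# frame along its 2D motion vector. Pixels that nothing lands on are left as holes --
# the "newly revealed" regions discussed later in this thread.
import numpy as np

def extrapolate_frame(frame, motion_vectors):
    """frame: (H, W, 3) colors; motion_vectors: (H, W, 2) per-pixel (dx, dy) in pixels,
    describing how each pixel moved between the last two rendered frames."""
    h, w, _ = frame.shape
    out = np.zeros_like(frame)
    covered = np.zeros((h, w), dtype=bool)

    ys, xs = np.mgrid[0:h, 0:w]
    # Constant-velocity assumption: push each pixel one more step along its vector.
    new_x = np.clip(np.rint(xs + motion_vectors[..., 0]).astype(int), 0, w - 1)
    new_y = np.clip(np.rint(ys + motion_vectors[..., 1]).astype(int), 0, h - 1)

    out[new_y, new_x] = frame[ys, xs]  # forward splat; last write wins on collisions
    covered[new_y, new_x] = True
    return out, ~covered               # holes = disoccluded pixels with no data
```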
> We have also sped up the generation of the optical flow field by replacing hardware optical flow with a very efficient AI model. Together, the AI models significantly reduce the computational cost of generating additional frames.

That would indicate to me that they've shifted to doing optical flow in the same fashion as Intel. That should also mean they could use this to rework regular frame generation and have it work on anything with tensor cores. I get that they won't because they don't need to, but to me that's just another example of how Nvidia is capitalizing on market and mind share.
> That would indicate to me that they've shifted to doing optical flow in the same fashion as Intel. That should also mean they could use this to rework regular frame generation and have it work on anything with tensor cores. I get that they won't because they don't need to, but to me that's just another example of how Nvidia is capitalizing on market and mind share.

Classical optical flow algorithms tend to be rather expensive and have weaknesses where they get the wrong answer. By using a neural optical flow implementation, not only can Nvidia potentially achieve better accuracy, but they can also tune it to pick up on precisely the details (e.g. lighting effects) they want and have it disregard others. I'd guess that's what really motivated the change, but perhaps they also wanted to reclaim some die area previously used for the hardware optical flow engine.
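For anyone curious what a classical (non-neural) optical flow estimate looks like in code, OpenCV's Farneback method is one well-known example. This is only an illustration of the class of algorithm being discussed here, not what Nvidia's hardware OFA or its new AI model does, and the file names are hypothetical:

```python
# Purely illustrative: a classical dense optical flow estimate between two frames,
# using OpenCV's Farneback method -- the "guess motion from pixel similarity" style
# of algorithm, which can get confused exactly as described above.
import cv2

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file names
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# flow[y, x] = (dx, dy): estimated motion of each pixel from prev to curr.
flow = cv2.calcOpticalFlowFarneback(
    prev, curr, None,
    0.5,     # pyramid scale
    3,       # pyramid levels
    15,      # averaging window size
    3,       # iterations per pyramid level
    5, 1.2,  # polynomial expansion neighborhood / sigma
    0)       # flags
```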
> Perhaps they also wanted to reclaim some die area previously used for the hardware optical flow engine.

DLSS 3.0 (FG) runs on the 50 series, so the hardware is either still there or Nvidia is an even worse company than I imagine.
> Being able to port it to other hardware is an interesting side-benefit, but I doubt it was the main reason.

Oh, I'm certain it never entered into the equation at all.
> DLSS 3.0 (FG) runs on the 50 series, so the hardware is either still there or Nvidia is an even worse company than I imagine.

Oh, that's a good point. They have a whole Optical Flow SDK, which they must support. So, either you're right that Blackwell still contains the hardware, or else they are now faithfully emulating the older functionality on their CUDA/Tensor cores when such APIs are in use.
> Oh, I'm certain it never entered into the equation at all.

Eh, maybe for things like the Nintendo Switch 2 or the rumored desktop APU they're likely working on with MediaTek. They might prefer not to burn die space on hardware optical flow engines in those chips.
> Oh, that's a good point. They have a whole Optical Flow SDK, which they must support. So, either you're right that Blackwell still contains the hardware, or else they are now faithfully emulating the older functionality on their CUDA/Tensor cores when such APIs are in use.

I think Blackwell still has the OFA, but it's fixed-function just like on Ada. The new DLSS framegen models deliver higher perf and better quality, according to one of the videos/images from Jensen's keynote, IIRC. Will it support Ampere and Turing? I would be shocked if they allow that. I think there are performance requirements and some other stuff that makes multi-frame generation require Blackwell... but also, I'm pretty sure that's again just locking a feature that could work on older architectures to a new architecture.
> TAA doesn't use optical flow, and neither did DLSS until Ampere GPUs added a hardware optical flow engine. The motion vectors used by TAA and DLSS 2 were analytical. What makes it possible is that you know the screen-space texture coordinates of each object, so you can compute the correct motion vector, whereas optical flow is merely a guess that's based on visual similarity and can easily get confused.

Optical flow has been present in GPUs since they started adding video encoder FFBs, as producing an optical flow field is a mandatory part of all modern video CODECs (since at least MPEG-2). It's exactly this FFB that is used to generate the motion vector field for ASW in VR applications.
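To illustrate the "analytical" motion vectors described in the quoted comment, here's a rough sketch: project the same world-space point with last frame's and this frame's camera matrices and take the screen-space difference. The matrix and function names are illustrative, not any engine's API:

```python
# Rough sketch of an "analytical" motion vector: no guessing from pixel similarity,
# just re-projecting a known world-space point with the previous and current cameras.
import numpy as np

def project(point_world, view_proj, width, height):
    p = view_proj @ np.append(point_world, 1.0)         # to clip space
    ndc = p[:2] / p[3]                                   # perspective divide
    return np.array([(ndc[0] * 0.5 + 0.5) * width,      # to pixel coordinates
                     (1.0 - (ndc[1] * 0.5 + 0.5)) * height])

def motion_vector(point_world, prev_view_proj, curr_view_proj, width, height):
    prev_px = project(point_world, prev_view_proj, width, height)
    curr_px = project(point_world, curr_view_proj, width, height)
    return curr_px - prev_px                             # 2D screen-space motion
```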
> The reason they added optical flow to the mix was to deal with hard lighting boundaries, which usually don't follow what object surface textures are doing. The combination of both techniques gives you the best of both worlds.

Using the MVec field for ASW has nothing to do with 'lighting boundaries'. It is to allow for reprojection of all moving elements of the scene, regardless of whether that motion is due to head motion or in-scene motion. In other words: if you rigid-mount an HMD and run an application using ASW, moving objects within a scene will still have smooth motion extrapolation.
> Some of these VR tricks don't attempt to estimate the world state at a new time point, but merely compensate for head movement.

False. ASW accounts for in-scene object motion as a fundamental requirement. Image synthesis (to account for disocclusion from object parallax) is a fundamental requirement. The technique would not work without doing both. I linked the page explaining how it works already, but I'll do it again too.
> To this day I find it surreal how rarely journalists mention how bad DLSS looks visually. This isn't tech anyone who actually plays games uses. (And just because the AMD version looks worse, that doesn't excuse NVIDIA for pushing this shovelware at us.)

I am for sure a gamer. I spend a lot of hours playing games. And I don't find DLSS to be visually degrading at all. I have a 4090 and don't really need to use it, but I do anyway, because it makes the games look better when I turn it on.
> Jensen used a lot of words to attempt to justify not using at least 16GB of VRAM on everything above the 5060, and why it's not a bad thing that there is very little on-paper (and perhaps in-practice) difference, outside of the Titan-class 5090, between the 5000 series and the 4000 series. Perhaps their 5% stock drop today is a result of that as well.

I don't think that's why the stock dropped. Most of what the 5000 series is good for is good for servers and creators. The money makers. They just happen to be good for gaming too.
> Optical flow has been present in GPUs since they started adding video encoder FFBs, as producing an optical flow field is a mandatory part of all modern video CODECs (since at least MPEG-2).

No, video codecs don't use optical flow. They employ motion vectors, but those vectors are optimized to minimize the residuals from macroblock motion compensation, which is a different problem than the one optical flow is meant to solve.
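A toy version of what a codec's motion search actually optimizes might look like the following: a brute-force block match that minimizes the SAD residual. Purely illustrative, and deliberately simplistic compared to a real encoder:

```python
# Toy codec-style motion estimation: for one macroblock, search a window in the
# reference frame for the offset that minimizes the residual (sum of absolute
# differences). The "best" vector is whatever compresses well -- it does not have
# to match the true motion of the scene, which is the distinction made above.
import numpy as np

def best_motion_vector(ref, cur, bx, by, block=16, search=8):
    """ref, cur: (H, W) luma planes; (bx, by): top-left corner of the block in cur."""
    target = cur[by:by + block, bx:bx + block].astype(np.int32)
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            candidate = ref[y:y + block, x:x + block].astype(np.int32)
            sad = np.abs(candidate - target).sum()       # residual energy
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best                                          # motion vector for this block
```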
> It is to allow for reprojection of all moving elements of the scene, regardless of whether that motion is due to head motion or in-scene motion.

That's a plus, but it's not necessary simply to avoid motion sickness. Optical flow is irrelevant for avoiding motion sickness, as the only thing which tells you how much the wearer's head has moved since the frame started being rendered is the HMD's tracking system, which is focused on the wearer's pose within the real environment and not at all on what's happening in the virtual environment.
> ASW accounts for in-scene object motion as a fundamental requirement. Image synthesis (to account for disocclusion from object parallax) is a fundamental requirement.

Then you can't do that in the HMD (due to such compensation requiring depth information, which is absent from the video signal), which puts it at a disadvantage vs. other techniques. VR users tend to be relatively stationary, so changes in orientation will be much more important than changes in position. Simple VR implementations only track orientation, not position, showing the relative lack of importance of matching position changes vs. orientation changes for wearer comfort.
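As a sketch of why orientation-only reprojection is the simple case here: pure rotation moves every pixel the same way regardless of its distance, so no depth is needed, whereas compensating for positional (translational) movement is what would require per-pixel depth. This is my own illustration, with made-up names:

```python
# Rotation-only reprojection of the last rendered frame for a new head orientation.
# K * delta_R^T * K^-1 is a pure homography, so depth never enters the picture.
import numpy as np

def reproject_rotation_only(frame, K, delta_R):
    """frame: (H, W, 3) image; K: 3x3 camera intrinsics;
    delta_R: 3x3 rotation mapping old-camera directions to new-camera directions."""
    h, w, _ = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pixels = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)  # 3 x N
    rays = np.linalg.inv(K) @ pixels            # rays through each *output* pixel
    src = K @ (delta_R.T @ rays)                # where those rays were in the old frame
    src = (src[:2] / src[2]).reshape(2, h, w)
    sx = np.clip(np.rint(src[0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(src[1]).astype(int), 0, h - 1)
    return frame[sy, sx]
```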
> I linked the page explaining how it works already, but I'll do it again too.

I already tried this link before, but it simply took me to a product listing for Meta Quest. Even if they implemented it like you're saying, that doesn't make object motion compensation a fundamental requirement - it just shows they went further than others.
> Would it not be possible to just... go bigger? Physically, I mean. If it's not that feasible to go smaller anymore. Personally, I'm not that much opposed to owning a bigger computer.

That's exactly what they did. Instead of manufacturing it on a smaller process node, they kept using virtually the same node as the RTX 4000 series but just made the dies bigger.
> So if this new compression technique doesn't require more VRAM, why does the 5090 have 8GB more than the 4090?

The extra memory bandwidth and capacity from going to 512-bit is almost certainly for the AI bros.
> So if this new compression technique doesn't require more VRAM, why does the 5090 have 8GB more than the 4090? People are going to buy it anyway, given all the other improvements. They don't need to compel the elitists to upgrade from last gen to this gen. It just seems like more marketing speak.

The 5090, like bit_user pointed out, is not just for gaming. As a matter of fact, more AI and content creators are using them than gamers are.
> Here's a thought experiment showing where frame extrapolation breaks down. Imagine you're playing a first-person game of some sort. You're standing near a corner or some kind of large obstacle that someone could hide behind. If they step out from behind it, then the algorithm isn't going to know what to do in those trailing-edge pixels that are newly revealed in each successive, extrapolated frame.
> You might be right that they try to do some sort of AI in-painting, but models like what Adobe uses for that are probably huge and complex, nowhere near realtime. More likely, they just smear the object or reuse the previous frame's pixels at that location. Basically, you'd see a sort of ghosting effect along that trailing edge. Worse yet, it'd probably flicker as each real frame corrects it, drawing even more attention to the artifact. At high native frame rates, the effect might be subtle enough that you wouldn't really notice, but when the native framerate is low, it'd be very pronounced.
> Similar to what I said before, I think the proper solution to this problem is just to natively rasterize and shade these areas. Assuming Nvidia still uses tile-based rasterization, they could actually do this without a ton of overhead, though you would need to reprocess the geometry for that frame. With ray tracing, it's even easier to just shoot some assorted rays where & when you need them, although there's again the problem of not only needing to do the geometry transforms, but also building/updating the BVH structure.

This is a contrived example that doesn't really happen in practical use. Think about this: how many frames are required for a person to step out from behind an obstacle? Particularly if a game is rendering at 30+ FPS, it's not like they pop out for a single frame and then disappear one frame later. There will be dozens of frames where the other person is coming out and then returning.
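As a crude illustration of the two cheap fallbacks mentioned in the quoted comment (reusing the previous frame's pixels vs. smearing the nearest covered pixel across the hole), here's a sketch. It's only meant to show where the ghosting/smearing artifact comes from, not any vendor's algorithm, and the names are made up:

```python
# Illustrative hole-filling for disoccluded pixels in an extrapolated frame:
# either reuse the stale pixel from the previous frame (ghosting), or smear the
# nearest covered pixel across the hole. Neither is a real in-painting model.
import numpy as np

def fill_holes(extrapolated, holes, prev_frame, smear=True):
    """extrapolated, prev_frame: (H, W, 3); holes: (H, W) bool mask of missing pixels."""
    out = extrapolated.copy()
    if not smear:
        out[holes] = prev_frame[holes]          # reuse stale pixels -> ghosting
        return out
    h, w, _ = out.shape
    for y in range(h):                          # smear left-to-right along each row
        last_valid = None
        for x in range(w):
            if holes[y, x]:
                out[y, x] = last_valid if last_valid is not None else prev_frame[y, x]
            else:
                last_valid = out[y, x]
    return out
```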
> What will really happen is that things shift and, in most situations, you'll have edges of maybe a few pixels where the correct data is missing. Those would get filled in by a fast in-painting algorithm, and they'll be visible for maybe 10–20 ms at most. Then a fully rendered frame would come along and you get the correct pixels everywhere.

Can you please try to confirm that with Nvidia? As I said, in-painting is a lot harder than run-of-the-mill DLSS and requires a much bigger model (think Stable Diffusion) that I doubt they could inference in the sort of time budgets we're talking about.
> No, video codecs don't use optical flow. They employ motion vectors, but those vectors are optimized to minimize the residuals from macroblock motion compensation, which is a different problem than the one optical flow is meant to solve.

Yet it has been successfully implemented in practice for 7 years.
> That's a plus, but it's not necessary simply to avoid motion sickness. Optical flow is irrelevant for avoiding motion sickness, as the only thing which tells you how much the wearer's head has moved since the frame started being rendered is the HMD's tracking system, which is focused on the wearer's pose within the real environment and not at all on what's happening in the virtual environment.

It is necessary, because both scene and head motion cause disocclusion, which requires 'inpainting' to fill in the disoccluded areas. But you can't identify the disoccluded areas from IMU data alone, because that would mean repositioning the camera and rerendering the scene for the new viewpoint, which is the entire exercise you are trying to avoid in the first place. Optical flow is used to ensure scene objects shift correctly based on their depth and prior motion, and this works identically whether that motion is from head motion (camera shift) or from object motion (scene shift). The technique is motion-origin agnostic by default: it would literally be more difficult (both conceptually and computationally) to try and compensate for head motion alone and not scene motion.
> Then you can't do that in the HMD.

It's been done in the HMD since 2019 (release of the Quest 1).
> Simple VR implementations only track orientation, not position.

Nobody has done so for the last half a decade, at least not without being laughed out of the room. Releasing an HMD today with orientation-only tracking would be seen about as favourably as releasing a monitor with only green subpixels.
> Can you please try to confirm that with Nvidia? As I said, in-painting is a lot harder than run-of-the-mill DLSS and requires a much bigger model (think Stable Diffusion) that I doubt they could inference in the sort of time budgets we're talking about.

Confirm that Reflex 2 uses in-painting? It absolutely does. More details to come soon, but if Nvidia can create a fast algorithm that works there, it could do it for other things as well.