News Benchmarking Blackwell and RTX 50-series GPUs with Multi Frame Generation will require some changes, according to Nvidia

Admin · Jan 15, 2025

We do a lot of benchmarking, and Nvidia is bringing some new features to its RTX 50-series GPUs that require changes in how to collect performance data. Here's what you need to know about Nvidia's "best practices" as well as our plans for future testing of GPUs.

Benchmarking Blackwell and RTX 50-series GPUs with Multi Frame Generation will require some changes, according to Nvidia : Read more

evdjj3j · Jan 15, 2025

You don't.

Gururu · Jan 15, 2025

Thanks for the overview on your forward approach. Do you think we will eventually offload frame gen to NPUs integrated on the CPU (APU) so that we can return to pure GPU rasterization? I feel like nVidia is all over the place, even now pushing their own ARM CPUs. There cannot be a world where NPUs exist on both CPU and GPU with AI functions working on both simultaneously. Seems crazy to me.

LolaGT · Jan 15, 2025

Leave out the smoke and mirrors garbage entirely and test what matters, native resolution horsepower.
Testing faked frames will give faked results, which is exactly what nvidia wants you to do, sell the smoke and mirrors to the masses.

CelicaGT · Jan 15, 2025

I will view any performance metrics provided by OEM (NVIDIA) software as suspect until proven otherwise by third party, open source alternatives, as always.

Additionally, and not necessarily relevant to the main topic, but FSR4 sounds like it is leaps and bounds better than prior implementations. This may make an interesting comparison, once everything from everyone is released and in play.

JarredWaltonGPU · Jan 15, 2025

Gururu said:
Thanks for the overview on your forward approach. Do you think we will eventually offload frame gen to NPUs integrated on the CPU (APU) so that we can return to pure GPU rasterization? I feel like nVidia is all over the place, even now pushing their own ARM CPUs. There cannot be a world where NPUs exist on both CPU and GPU with AI functions working on both simultaneously. Seems crazy to me.

I can't see NPUs doing framegen as a viable approach for a variety of reasons. Basically, you have the frames in the GPU memory, and transferring those over to system RAM for the NPU to work on would be a bottleneck. Plus, NPUs are so much slower than GPUs right now on AI stuff. Like, 50 TOPS for INT8 compared to potentially thousands of TOPS. NPUs would need to be 10X~50X faster, and of course Nvidia in particular has no desire to have things run on anything other than GPU.

I'm not saying it could never happen, but short-term, it's a long way off AFAICT. Framegen (frame smoothing) running on a display would be more viable. Which has been done already, just not called framegen.

CelicaGT said:
I will view any performance metrics provided by OEM (NVIDIA) software as suspect until proven otherwise by third party, open source alternatives, as always.

Additionally, and not necessarily relevant to the main topic, but FSR4 sounds like it is leaps and bounds better than prior implementations. This may make an interesting comparison, once everything from everyone is released and in play.

FrameView is fine, AFAICT. I've tried PresentMon and OCAT in the past, and FrameView basically gave the same results. It's critical for me because I want to get the power data (and temp and clocks as well), which OCAT and PresentMon don't provide. But FrameView uses the same core code, just with some additional data collection going on.

However... this new "MsBetweenDisplayChange" business isn't working for me. I just looked at recent benchmarks, and the RTX 4060 has a bunch of 0 followed by double the time results on some games. Could be I'd need to retest with the latest FrameView revision, which just came out (preview anyway for Blackwell).

I looked at some results for the RX 6600 and everything was basically unchanged or better ("faster") with the new metric, but only on 1% lows. So I figured it would be fine and started reparsing my other recent tests... and started getting divide by zero errors. So I looked at the files and discovered the MsBetweenDisplayChange column is sometimes borked for an unknown (by me) reason.

tamalero · Jan 15, 2025

LolaGT said:
Leave out the smoke and mirrors garbage entirely and test what matters, native resolution horsepower.
Testing faked frames will give faked results, which is exactly what nvidia wants you to do, sell the smoke and mirrors to the masses.

This!

Its sad that were approaching the POST TRUTH world, where everything is made up to fool people.
News, Video, Audio, Pictures, Ar, Capabilities, Power..etc..

CelicaGT · Jan 15, 2025

JarredWaltonGPU said:
I can't see NPUs doing framegen as a viable approach for a variety of reasons. Basically, you have the frames in the GPU memory, and transferring those over to system RAM for the NPU to work on would be a bottleneck. Plus, NPUs are so much slower than GPUs right now on AI stuff. Like, 50 TOPS for INT8 compared to potentially thousands of TOPS. NPUs would need to be 10X~50X faster, and of course Nvidia in particular has no desire to have things run on anything other than GPU.

I'm not saying it could never happen, but short-term, it's a long way off AFAICT. Framegen (frame smoothing) running on a display would be more viable. Which has been done already, just not called framegen.

FrameView is fine, AFAICT. I've tried PresentMon and OCAT in the past, and FrameView basically gave the same results. It's critical for me because I want to get the power data (and temp and clocks as well), which OCAT and PresentMon don't provide. But FrameView uses the same core code, just with some additional data collection going on.

However... this new "MsBetweenDisplayChange" business isn't working for me. I just looked at recent benchmarks, and the RTX 4060 has a bunch of 0 followed by double the time results on some games. Could be I'd need to retest with the latest FrameView revision, which just came out (preview anyway for Blackwell).

I looked at some results for the RX 6600 and everything was basically unchanged or better ("faster") with the new metric, but only on 1% lows. So I figured it would be fine and started reparsing my other recent tests... and started getting divide by zero errors. So I looked at the files and discovered the MsBetweenDisplayChange column is sometimes borked for an unknown (by me) reason.

Good information, Thanks Jarred. Also thanks for the prompt reply.

Giroro · Jan 15, 2025

Personally, I don't even look at the frame gen or DLSS results, because it isn't a benchmark of hardware performance - but Nvidia wants to trick the masses into believing that it is.
I'll go one further and say coverage of DLSS and the related gimmicks are the primary reason I haven't watched a Digital Foundry video in about 6 years (and longer for LTT).
The way the money works for "tech reviewers" (and influencers in general) is super shady, scummy - and the way the big players go out of their way to try and mask infomercials as "reviews" and launder their marketing dollars would probably illegal in the US/EU if enforcement existed. Nvidia is definitely a major player at the heart of that whole ecosystem, at least in the PC space. It's annoying.

Is tech like DLSS or frame gen important or at least interesting enough to customers that it should be featured? Sure, absolutely.
But until there is a universal benchmark that runs equally between GPU architectures and accurately represents which one performs better, then I think those features should be discussed separately from benchmarks of GPU performance.
Maybe somewhere out there there's some synthetic benchmark that can objectively measure AI upscaling or frame gen quality/performance like the tests used for video encoding speed/quality, but I'm not aware of it.

Giroro · Jan 15, 2025

tamalero said:
This!

Its sad that were approaching the POST TRUTH world, where everything is made up to fool people.
News, Video, Audio, Pictures, Ar, Capabilities, Power..etc..

I appreciate that humor that "post resolution world" was one of the taglines used in the marketing copy Nvidia originally gave to influencers to use to sell DLSS. I see what you did there.

rluker5 · Jan 15, 2025

I remember reading an article a while back when the author used some program to analyze the quality of video encoding vs what was perfect. It was used to compare AMD, Nvidia and Intel's encoders.

Could something like this be used to measure how accurate these fake frames are vs native, no upscaling on canned benchmarks where it would be run vsync for the accuracy test, then unlimited frame rate for the frame rate benefit?

evdjj3j · Jan 15, 2025

rluker5 said:
I remember reading an article a while back when the author used some program to analyze the quality of video encoding vs what was perfect. It was used to compare AMD, Nvidia and Intel's encoders.

Could something like this be used to measure how accurate these fake frames are vs native, no upscaling on canned benchmarks where it would be run vsync for the accuracy test, then unlimited frame rate for the frame rate benefit?

https://www.tomshardware.com/news/amd-intel-nvidia-video-encoding-performance-quality-tested

Jabberwocky79 · Jan 15, 2025

I recognize that upscaling and AI features have their place, but I personally think that benchmarking a GPU should be relegated only to raw native resolution FPS. And I don't understand all of the technical details, but IMO, let's just keep benchmarking as simple as possible. The consumer can compare apples to apples and make a decision based on price, desired resolution, and minimum FPS that they consider acceptable.

The upscaling software should be separate and entirely subjective because it directly affects image quality. If a game requires software solutions in order to reach a person's minimum viable FPS, then they should shop with that in mind. Some people might prefer the look of one solution over another, so that would, hypothetically, be the wild card that determines which brand they want to buy.

OllyR · Jan 15, 2025

@JarredWaltonGPU

this gpu round do the RIGHT thing for all the consumers. Leave out all the fakes garbages and marketing tricks and test what it matters for us, the native resolution performances and image quality, plain and simple. RT is now after some gpu generations a diffuse technology in many games and so it's fine to consider also this, but please leave out everything that it's not a "standard" RT. I'm sure that soon after you listened what NVIDIA told the press on the 8 of Jan and what's happened after this you know that something isn't right with their perfomances and they know that they have to found some alternative ways to help sell these cards.

You have a great responsability this turn and I'm sure you know, not only for now but for this entire videocards industry also in the next years. Things are becoming more and more shady. And also these upscaler, frame generators and other similar fakes, change continuosly so we can't form an opinion or say a true reliable benchmark only based on them or they various revisions. Lets assume that either NVIDIA,AMD,INTEL all will try to improve these revision after revision so give them a separate space of consideration for a fair comparison.

And please double verify whatever proprietary tool a specific GPU maker says you have to use for benchmark their cards...

Trake_17 · Jan 15, 2025

You subdivide categories and let people make of it what they will for themselves. It's hardly different than benchmarking at a different resolution. You turn it on for one set of tests, you turn it off for another and let people know so that can tell what is apples to apples. You develop a method to compare the interpolated images between platforms to let people gauge their quality, the same as people do when comparing iPhone 16 to Pixel 9. Seems pretty doable

JarredWaltonGPU · Jan 15, 2025

rluker5 said:
I remember reading an article a while back when the author used some program to analyze the quality of video encoding vs what was perfect. It was used to compare AMD, Nvidia and Intel's encoders.

Could something like this be used to measure how accurate these fake frames are vs native, no upscaling on canned benchmarks where it would be run vsync for the accuracy test, then unlimited frame rate for the frame rate benefit?

Probably not for launch, due to time constraints... and I need to revisit as I think I screwed up one or two settings. LOL. But I did this a while back:
https://www.tomshardware.com/news/amd-intel-nvidia-video-encoding-performance-quality-tested

OllyR said:
this gpu round do the RIGHT thing for all the consumers. Leave out all the fakes garbages and marketing tricks and test what it matters for us, the native resolution performances and image quality, plain and simple. RT is now after some gpu generations a diffuse technology in many games and so it's fine to consider also this, but please leave out everything that it's not a "standard" RT. I'm sure that soon after you listened what NVIDIA told the press on the 8 of Jan and what's happened after this you know that something isn't right with their perfomances and they know that they have to found some alternative ways to help sell these cards.

You have a great responsability this turn and I'm sure you know, not only for now but for this entire videocards industry also in the next years. Things are becoming more and more shady. And also these upscaler, frame generators and other similar fakes, change continuosly so we can't form an opinion or say a true reliable benchmark only based on them or they various revisions. Lets assume that either NVIDIA,AMD,INTEL all will try to improve these revision after revision so give them a separate space of consideration for a fair comparison.

And please double verify whatever proprietary tool a specific GPU maker says you have to use for benchmark their cards...

I always do my best to test things the "right" way. I've done individual gaming tests with and without upscaling and framegen, I've done all the reviews primarily at native resolution. You might want to read the other articles about this Nvidia Editors' Day as well, where I've clearly laid out potential problems.

People will use framegen and MFG and upscaling, though. You can't just wholly discount it, even if it doesn't totally match what marketing might suggest. That's what I've been saying since the 40-series first appeared, nothing has changed there. I don't get the first line where you basically imply that I've been doing the WRONG thing since the 40-series launched.

King_V · Jan 15, 2025

So, all you need to do is capture videos of every benchmark and then dissect them. Easy, right? [cough]

I suddenly pictured you saying Rodney Dangerfield's bit here: (at 2:00, if it doesn't automatically start there)

View: https://www.youtube.com/watch?v=2Yec9xYJxpE&t=120s
(can't embed, so have to follow link)

OllyR · Jan 15, 2025

JarredWaltonGPU said:
Probably not for launch, due to time constraints... and I need to revisit as I think I screwed up one or two settings. LOL. But I did this a while back:
https://www.tomshardware.com/news/amd-intel-nvidia-video-encoding-performance-quality-tested

I always do my best to test things the "right" way. I've done individual gaming tests with and without upscaling and framegen, I've done all the reviews primarily at native resolution. You might want to read the other articles about this Nvidia Editors' Day as well, where I've clearly laid out potential problems.

People will use framegen and MFG and upscaling, though. You can't just wholly discount it, even if it doesn't totally match what marketing might suggest. That's what I've been saying since the 40-series first appeared, nothing has changed there. I don't get the first line where you basically imply that I've been doing the WRONG thing since the 40-series launched.

No, I absolutely don't want to say you have done something wrong (sorry for my english...)

I only would like to tell you to continue, as always, to give us consumers a right good way to compare these cards, because you know and in fact you remember also this in the article that with these upscalers and frame gens,etc there is more the necessity to change something to better evalutate and compare.

Maybe in the past generation there was a little bit less the necessity of this because the uplift in performance was clearly evident with or without these techs but this round it's way more important as all these gen cards (except BM) try more and more to rely on these for claiming perfomances.

Don't get me wrong whatever can give us better games performances and a better image quality is good and to consider but I think you understand what I would like to say. That shady lie from NVIDIA keynote of a 5070 similar as a 4090, is not a new marketing trick (they have done the same before) but it's from something like this that we always need your "help" because in that case it was evident that it was a "lie" but maybe next time it will be more complicated to verify something . As you said probably we start to need video of benchmarks (and even with this there is also the problem of the tool used for encode it for distribution...). Every year it will be more difficult to compare if software (or AI) prevail vs hardware for give more performance and quality or resolution. You have to do a lot more of work for compare, thanks again for this

jkhoward · Jan 15, 2025

I agree with the posters above, I don’t bother with DLSS FSR XESS benchmarks, I don’t care for “added frames”, 9 out of 10 times the game feels worse. If doing those benches gets you more hours on the clock, so be it. Otherwise, I wouldn’t waste your time. If anything, just make a DLSS FSR XESS review between 3 similar cards and leave it out of the card review.

JarredWaltonGPU · Jan 15, 2025

jkhoward said:
I agree with the posters above, I don’t bother with DLSS FSR XESS benchmarks, I don’t care for “added frames”, 9 out of 10 times the game feels worse. If doing those benches gets you more hours on the clock, so be it. Otherwise, I wouldn’t waste your time. If anything, just make a DLSS FSR XESS review between 3 similar cards and leave it out of the card review.

This is basically my intent. Depending on time, I may or may not be able to hit anything with DLSS 2/3/4 on the 5090 review initially, but I do want to look at it. I think it's also very much worth discussing the experience of DLSS 4 and MFG. I tried it at the Nvidia event in the game where it was being showcased, and it ran "great" — but it was a comparison between 120 FPS on a 4090 using framegen and 240 FPS on a 5090 or something similar. Which, realistically, you can't really tell the difference.

How that will extend down to a card like the 5070, or the eventual 5060-class GPUs, is another matter entirely. I suspect you'll want a base framerate of around 50~60 FPS without any framegen to get a good result. Because turning on framegen will probably end up at around 80~90 FPS, which means the rendered FPS now sits at 40~45 FPS. MFG will probably end up at around ~140 FPS, which means a rendered framerate of only 35 FPS.

Someone watching you play will say the 140 FPS mode looks smoother. But when you're in control, the responsiveness and latency at a native 50~60 FPS will likely end up feeling better than an MFG 140 FPS.

It's going to vary by game as well. I've noticed this with DLSS 3 framegen, where one game feels pretty good at a generated 70 FPS while another feels laggy at a generated 100 FPS. And there are games where even without framegen, they can feel laggy at 50 FPS (A Quiet Place: The Road Ahead comes to mind...)

-Fran- · Jan 15, 2025

"So, all you need to do is capture videos of every benchmark and then dissect them. Easy, right? *cough*"

Haha. I've been proposing everywhere the use of VMF to compare raw game footage as the* only viable way to get a consitent metric of "image quality" which is impartial (or I'd imagine it is) when analyzing differences.

Out of a full game section, you'd now need to find the one which presents the most graphical challenged to upscalers and FFrames. A couple seconds of something with a moving object that aligns perfectly from footage to footage should suffice, I'd say.

Still, it would definitely take a lot of extra time, but it would be a very nice follow up where you can also detail the rabbit hole journey. A full piece on its own.

Also, thanks for being forthcoming with this information Jarred, as I find it really valuable to put into perspective your analysis in you future reviews. If you don't mind me, suggesting it, in case you haven't thought of it before (low chance at that), don't forget to always link this to your reviews going forward 😀

Regards.

thestryker · Jan 15, 2025

I think reviewing scaling and frame generation software is imperative to the graphics environment, but should be completely divorced from video card reviews.

Since upscaling technologies look the same on every card they can run on this helps from a logistics standpoint. There's two main things I think are important with upscaling: overall image quality and performance at a normalized image quality. I've also yet to see anyone directly compare image quality of the two XeSS modes to one another. I imagine the LNL handhelds will have an image quality or performance advantage over AMD in anything with XeSS support.

As for frame generation I think it's important to test from a quality and viability standpoint, but without a precise way to measure input latency performance numbers don't mean anything. Even then the experience can vary and is subjective which just makes it so hard to quantify.

JarredWaltonGPU said:
It's going to vary by game as well. I've noticed this with DLSS 3 framegen, where one game feels pretty good at a generated 70 FPS while another feels laggy at a generated 100 FPS. And there are games where even without framegen, they can feel laggy at 50 FPS (A Quiet Place: The Road Ahead comes to mind...)

This is something Tim from HUB has mentioned consistently regarding frame generation. The most recent one I recall him talking about was Stalker 2 where the input was somewhat slow natively so using frame generation didn't feel any different than native. This has got to add a layer of nightmare for any reviewer because all you can really do is convey your experience with the technology in a single title as opposed to any sort of general recommendation.

palladin9479 · Jan 15, 2025

And thus the other shoe just dropped, nVidia is providing "guidance" for it's affiliated partners, which I suspect they are required to follow under NDA or lose future access to products. This is why I'm going to wait until Steve on GN or Jay over at JTC do fully independent non-affiliated, unguided, and no-NDA testing. Steve was talking about how their team is looking to buy or borrow cards for the testing and report back to people about it.

CaptRiker · Jan 15, 2025

screw frame gen... only numbers I want to see between the generations is raw power.. ie turn off frame gen and DLSS.. I want to see RAW fps between the gen's.. I have a 4080.. I want to see raw 5080 and 5090 to compare to my raw 4080 data.

JarredWaltonGPU · Jan 15, 2025

palladin9479 said:
And thus the other shoe just dropped, nVidia is providing "guidance" for it's affiliated partners, which I suspect they are required to follow under NDA or lose future access to products. This is why I'm going to wait until Steve on GN or Jay over at JTC do fully independent non-affiliated, unguided, and no-NDA testing. Steve was talking about how their team is looking to buy or borrow cards for the testing and report back to people about it.

The whole point of me writing about this (I didn't have to) was to disseminate information on the matter. Nvidia has not required anything, but as it rightly notes, you can't measure MFG using any other tools right now. We are dependent on software to get this data, or else you have to use high speed cameras and spend 10X the effort. For little gain, frankly.

This is no different than AMD's AFMF / AFMF2, incidentally, except the only metrics you can get there have to come from the drivers — not even AMD's own PresentMon branch OCAT gives frametimes with AFMF enabled (last time I checked). It's effectively useless to try to provide proper benchmarks of AFMF2 outside of experiential reporting and video capture, which can turn into a massive time sink.

GN and J2C don't have a monopoly on information, and they're only independent insofar as getting paid directly from YouTube and others makes you "independent." I consider myself about as independent as I could be. No one at Future ever tells me how I should do my testing; they know enough to realize I know what I'm doing. I provide equal opportunity for AMD, Nvidia, Intel, Asus, MSI, etc. to send me information; what I do with it is my own decision. And the day I'm told I have to write something positive about a particular company is the day I go looking elsewhere.

There's this mistaken impression among some that "old journalism" is biased and stuff like YouTube isn't, but in my experience a lot of the YouTube stuff ends up being highly opinionated and subjective and chases views more than any traditional coverage. It's like 10X more biased. There's good and bad YouTubers for sure, and good and bad traditional coverage, but the idea that a salaried individual writing for a site like Tom's Hardware is somehow more biased is laughable. I should go full YouTube and get paid 10X as much (if the channel gets big enough...) That's Jarred's 2 Centz, anyway. LOL

thestryker said:
This is something Tim from HUB has mentioned consistently regarding frame generation. The most recent one I recall him talking about was Stalker 2 where the input was somewhat slow natively so using frame generation didn't feel any different than native. This has got to add a layer of nightmare for any reviewer because all you can really do is convey your experience with the technology in a single title as opposed to any sort of general recommendation.

The more I poke at games that use Unreal Engine 5, the more annoyed I get. If you have extreme hardware, it can work well enough, but I look at something like Stalker 2 or MechWarrior 5 Clans and compare that with Indiana Jones and the Great Circle and I'm shocked at how poorly UE5 tends to run. And not just poor performance, but the latency always feels worse in my experience. The games are playable, and humans are able to adapt to the higher latency (console gamers have been doing it for decades), but when you swap between certain engines, the difference in image fidelity, performance, and latency can at times be striking. Or stryking in your case. 😛

News Benchmarking Blackwell and RTX 50-series GPUs with Multi Frame Generation will require some changes, according to Nvidia

Administrator

Distinguished

Commendable

Reputable

Distinguished

Splendid

Distinguished

Distinguished

Splendid

Splendid

Distinguished

Distinguished

Prominent

Distinguished

Splendid

Illustrious

Distinguished

Splendid

Glorious

Judicious

Splendid

Commendable

Splendid

Share this page