Don't waste money on a high-end graphics card right now — RTX 4090 is a terrible deal

That has been my plan all along. I'm even considering going the prebuilt route again, either with Lenovo Legion or HP Omen, but I'll need to weigh the cost of both a DIY and a prebuilt.
 
I think the biggest chunk of performance uplift will come from GDDR7.
Maybe, but I pretty clearly recall analysis of RTX 4070 Ti vs RTX 4070 Ti Super vs RTX 4080 showing that the base RTX 4070 Ti wasn't really bottlenecked by its 192-bit memory interface.

GDDR7 + 512 bit bus width should be impressive.
I remember thinking Radeon VII's performance should be nuts with 1 TB/s of HBM2, yet it ended up not being much faster than Vega 64. So, it can be hard to predict these things, without doing a detailed performance analysis that shows just how much a GPU is being held back by memory bandwidth.
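For context, peak memory bandwidth is just bus width times per-pin data rate. A quick sanity check in Python, using the published specs of the cards mentioned (nothing new here, just the arithmetic):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin data rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8

print(peak_bandwidth_gb_s(2048, 1.89))  # Vega 64, HBM2:     ~484 GB/s
print(peak_bandwidth_gb_s(4096, 2.0))   # Radeon VII, HBM2:  1024 GB/s (the "1 TB/s")
print(peak_bandwidth_gb_s(384, 21.0))   # RTX 4090, GDDR6X:  1008 GB/s
```

Roughly double the bandwidth of Vega 64, yet nowhere near double the performance, which is exactly the point about raw bandwidth not telling you much on its own.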
 
I think the biggest chunk of performance uplift will come from GDDR7.

Agreed. I also expect the biggest uplift to be in the Ray Tracing sector. Which company will be preferred, though? Samsung?

Right now, how many companies are able to manufacture GDDR7 memory at scale?
 
Agreed. I also expect the biggest uplift to be in the Ray Tracing sector. Which company will be preferred, though? Samsung?

Right now, how many companies are able to manufacture GDDR7 memory at scale?
I keep getting more and more disappointed by the lack of revolutionary graphics thanks to RT. Even fully RT (or path traced if you want to pretend that's what the games are doing) titles don't look massively better than rasterization. Like Black Myth Wukong. Does full RT look better than rasterization? Sure. But it also runs like 1/5 as fast or whatever.

For the professional sector, the improvements in RT are definitely important. For games, we still need more stuff that's like Control (where the RT effects were clear and obvious and better) and less like Diablo IV, Avatar, Star Wars Outlaws, etc. where the RT just tanks performance for limited graphical improvements.
 
I keep getting more and more disappointed by the lack of revolutionary graphics thanks to RT. Even fully RT (or path traced if you want to pretend that's what the games are doing) titles don't look massively better than rasterization. Like Black Myth Wukong. Does full RT look better than rasterization? Sure. But it also runs like 1/5 as fast or whatever.

For the professional sector, the improvements in RT are definitely important. For games, we still need more stuff that's like Control (where the RT effects were clear and obvious and better) and less like Diablo IV, Avatar, Star Wars Outlaws, etc. where the RT just tanks performance for limited graphical improvements.

Yeah, I hear ya.

I was hoping that ray-traced graphics would gradually improve, but, even though I hate admitting it, in most cases the end result does not justify the performance hit.

The last time ray tracing really impressed me in gaming was last year, with Hogwarts Legacy and Alan Wake II.

But the latter will not be hitting playable FPS at 4K Ultra RT native for the foreseeable future.

Well, at least not until the 6090 gets released.

P.S. It's been a while since I last played it, but, if memory serves, in addition to being super demanding at Unobtainium settings, Avatar did not even have ray tracing settings, which is quite understandable considering it's an AMD-promoted game.
 
Have you not read that it's made on virtually the same process node as the 4090 and only 22% bigger?

I have not. I literally know nothing about the tech specs... all I know is that the 5000 series is "Coming Soon™."

If I was sitting here with a poorly performing GPU I'd probably have paid more attention... but as I have said in previous posts, the only reason I'd upgrade to a 5090 is to get a decent resale on my 4090 while I still can.


Where did you even get the idea that the 5090 would be anything like such an upgrade?

I didn't... it was an assumption based on the previous gen. If the performance boost is ridiculously low I can see myself waiting on the 6000 series.

P.S. I find your username a little ironic for an Nvidia fan, given that Red was an ATI thing. Prior to that, AMD's colors were black, white, and green.
[Image: AMD Athlon XP logo]

I'm aware... had this processor in my first build in 2001... the 1800+ at 1.53 GHz. I have this exact decal on my current build as a throwback to the old days. There's a guy online that makes retro PC stuff and it's pretty awesome.

As for my username... well... it's a red vs blue thing. "TeamRed2024" is not only in reference to AMD and Intel... but also props to our 45th and 47th President, Mr. Donald J. Trump. Additionally... I have no loyalty. I think AMD makes great processors... but Nvidia is the GPU king.

:cheese: 💯
 
Maybe, but I pretty clearly recall analysis of RTX 4070 Ti vs RTX 4070 Ti Super vs RTX 4080 showing that the base RTX 4070 Ti wasn't really bottlenecked by its 192-bit memory interface.


I remember thinking Radeon VII's performance should be nuts with 1 TB/s of HBM2, yet it ended up not being much faster than Vega 64. So, it can be hard to predict these things, without doing a detailed performance analysis that shows just how much a GPU is being held back by memory bandwidth.
More CUDA cores and faster memory will definitely help, but I agree that one should believe it when one sees it.
 
P.S. It's been a while since I last played it, but, if memory serves, in addition to being super demanding at Unobtainium settings, Avatar did not even have ray tracing settings, which is quite understandable considering it's an AMD-promoted game.
Unless I'm mistaken, Avatar: Frontiers of Pandora uses ray tracing, but it's the usual AMD-promoted use of RT effects, so they offer only very minor overall improvements (and less of a performance hit). Mostly it's just shadows I think.

But the game doesn't even show RT as options to enable/disable! It's a bit whack, I admit. There's a "BVH quality" setting IIRC, which again doesn't really tell you what it does, but BVH (bounding volume hierarchy) is for sure an RT thing. Ultimately, Avatar is the perfect example of a game with RT where the RT effects are basically meaningless AFAICT.
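Since "BVH" gets thrown around without explanation: a bounding volume hierarchy is just a tree of nested bounding boxes, so a ray (or any query) can skip whole chunks of the scene whose box it misses. The toy sketch below uses 1D intervals and point queries purely for brevity; real BVHs are 3D boxes over triangles and live in the driver/hardware, so treat this as illustration only:

```python
import random
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    lo: float                             # bounding interval (1D stand-in for a 3D box)
    hi: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prims: Optional[List[float]] = None   # only leaf nodes hold actual geometry

def build(prims: List[float], leaf_size: int = 4) -> Node:
    """Recursively split the primitives in half until leaves are small."""
    node = Node(lo=min(prims), hi=max(prims))
    if len(prims) <= leaf_size:
        node.prims = prims
        return node
    prims = sorted(prims)
    mid = len(prims) // 2
    node.left, node.right = build(prims[:mid]), build(prims[mid:])
    return node

def query(node: Node, x: float, eps: float = 0.01) -> List[float]:
    """Find primitives near x; whole subtrees whose box misses are skipped."""
    if x < node.lo - eps or x > node.hi + eps:
        return []                         # the pruning that makes ray tracing tractable
    if node.prims is not None:
        return [p for p in node.prims if abs(p - x) <= eps]
    return query(node.left, x, eps) + query(node.right, x, eps)

tree = build([random.random() for _ in range(10_000)])
print(len(query(tree, 0.5)))              # only a tiny fraction of the nodes get visited
```

Presumably the "BVH quality" setting controls how detailed or tightly fitting that tree is, which would affect both the cost and the accuracy of the RT effects, though the game doesn't spell it out.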
 
Unless I'm mistaken, Avatar: Frontiers of Pandora uses ray tracing, but it's the usual AMD-promoted use of RT effects, so they offer only very minor overall improvements (and less of a performance hit). Mostly it's just shadows I think.

Avatar is the perfect example of a game with RT where the RT effects are basically meaningless AFAICT.

I've got Avatar and I agree.

Maybe it's just me but I wanna say that RT and PT are kinda gimmicky IMO. Sometimes it's just hard to see the visual improvements. I am 50 though and my eyes aren't what they used to be... but I've taken various titles and turned RT on and off and sometimes had difficulty seeing any change... and definitely not what I would consider game making/breaking.
 
I have not. I literally know nothing about the tech specs... all I know is that the 5000 series is "Coming Soon™."

If I was sitting here with a poorly performing GPU I'd probably have paid more attention... but as I have said in previous posts, the only reason I'd upgrade to a 5090 is to get a decent resale on my 4090 while I still can.

I didn't... it was an assumption based on the previous gen. If the performance boost is ridiculously low I can see myself waiting on the 6000 series.
Thanks for confirming. Sorry if my post sounded a little harsh, but I was genuinely wondering if there was information out there to the contrary.
 
Thanks for confirming. Sorry if my post sounded a little harsh, but I was genuinely wondering if there was information out there to the contrary.

All good... no offense taken.

I do feel like we are reaching that point in PC hardware where it's just so good that upgrades are spread out longer and longer when compared to say... the hardware of 10-20 years ago and even cellular phones.

Definitely a good thing... but I think as a 4090 owner I might very well be good to stand pat on the 5000 series... barring an astronomical generational upgrade, which doesn't appear likely.
 
I've got Avatar and I agree.

Maybe it's just me but I wanna say that RT and PT are kinda gimmicky IMO. Sometimes it's just hard to see the visual improvements. I am 50 though and my eyes aren't what they used to be... but I've taken various titles and turned RT on and off and sometimes had difficulty seeing any change... and definitely not what I would consider game making/breaking.
I'm there with you on the age, and if you just look at a game (i.e., in a blind taste test sort of way), most people wouldn't know if it used RT / PT or rasterization. If you put them side by side, you can see a few differences. If you do screenshots of select areas, RT and PT can certainly look better. But when you factor in the performance hit, it all becomes very hard to justify.
 
Are you getting paid by the word? I count 1515 of them!
:D

Seriously, I'd need to run this through ChatGPT and have it summarize for me. At 686 words, the article itself is less than half this long!
I wish! But no, I just learned how to touch type from Mrs. Pinkerton of West Muskingum High School in 1980.

And just like Mr. Safford I was already a piano player before that, which helps with dexterity.

Perhaps the loquaciousness comes from being quadrilingual, and that's not counting Latin as a first foreign language.
 
@JarredWaltonGPU I know it may be a bit too early, but, given all the rumours we've heard so far, what would be your personal projection on the performance uplift from 4090 to 5090?
 
@JarredWaltonGPU I know it may be a bit too early, but, given all the rumours we've heard so far, what would be your personal projection on the performance uplift from 4090 to 5090?
So, I believe N4P is supposed to do something like 15~20 percent better than N4 on its own. Meaning, same power, you get 15% more performance and density. This is totally just ballparking things, so I may have some numbers wrong but let's take things in parts.

AD102 is 609mm^2 with 76.3 billion transistors while GB202 is rumored to be 744mm^2. That's a 22% larger die, and if we also assume 15% higher transistor density, that ends up being around 40% more transistors in total.

Power is widely rumored to be 600W for the 5090. I think 4090 / AD102 was still somewhat power constrained, so giving it 33% more power to work with will definitely help. And N4P is more efficient as well, so potentially a 40~50% boost in total performance with higher power use.

That would also dovetail into the memory side of things. Even if 5090 was GDDR6X, going from 384-bit to 512-bit means 33% more bandwidth. Conservatively, I expect at least 28 Gbps GDDR7 (which will be readily available in 32 Gbps form). So, 33% higher clocks and a 33% wider interface combine to yield 78% more total bandwidth.
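Putting that spitballing into numbers (again, the 744mm^2 die, the ~15% density gain, and 28 Gbps GDDR7 on a 512-bit bus are all rumors and assumptions, not confirmed specs):

```python
# Back-of-envelope for rumored GB202 vs. AD102. Every input here is a rumor or assumption.
ad102_area, ad102_transistors_bn = 609, 76.3   # known AD102 figures
gb202_area = 744                               # rumored die size
density_gain = 1.15                            # my ~15% guess; could easily be lower

t_ratio = (gb202_area / ad102_area) * density_gain
print(f"Die area: +{gb202_area / ad102_area - 1:.0%}")                               # ~+22%
print(f"Transistors: ~{ad102_transistors_bn * t_ratio:.0f}B (+{t_ratio - 1:.0%})")   # ~107B, ~+40%

bw_4090 = 384 * 21 / 8    # 1008 GB/s (384-bit GDDR6X at 21 Gbps)
bw_5090 = 512 * 28 / 8    # 1792 GB/s if 512-bit GDDR7 at 28 Gbps pans out
print(f"Bandwidth: {bw_4090:.0f} -> {bw_5090:.0f} GB/s (+{bw_5090 / bw_4090 - 1:.0%})")  # ~+78%
```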

Undoubtedly, the large L2 cache will continue. Will it be improved? Maybe, which could mean even higher effective bandwidth.

Nvidia tends to keep things pretty balanced on the memory bandwidth improvements, so if it actually boosts memory bandwidth by 78% (or more), I suspect it also thinks there are some architectural improvements that make it so the GPU cores need the additional bandwidth.

But there's still the question of AI and RT hardware. Nvidia has been banging that drum since 2018, and while AI has paid off, I'm not convinced RT has. Yet Nvidia keeps putting faster and 'better' RT hardware into every RTX generation. So maybe the RT side of things needs the big boost in bandwidth more than the rasterization needs it? I don't know.

I still expect to see at least a 30~40 percent increase in performance, relative to the 4090, for the right workloads. Meaning, 4K and maxed out settings, possibly with RT, will see sizeable gains. And I think 1080p will be completely CPU limited and 1440p will be largely CPU limited. I also suspect Nvidia will double down on framegen and the OFA will get some needed improvements to make it so that most of the 50-series GPUs will realize an 80~90 percent increase in frames to monitor with framegen relative to non-framegen.

So those are my guesses. We'll see if I'm even remotely correct in maybe ~5 weeks. LOL.
 

Thanks for the detailed reply, Jarred! I really appreciate you taking the time to lay out all that info!
 
I wish! But no, I just learned how to touch type from Mrs. Pinkerton of West Muskingum High School in 1980.

And just like Mr. Safford I was already a piano player before that, which helps with dexterity.

Perhaps the loquaciousness comes from being quadrilingual, and that's not counting Latin as a first foreign language.
You have tons of great knowledge and insight to share, which is why I sometimes wish you'd focus your posts a little more so that more people (myself, at least) would actually read them and gain the benefits of doing so. However, I don't come here to read novelettes.
:)

P.S. I'm both a former piano player and learned to touch type in school (I forget if it was 7th or 8th grade). I remember when one of my high-school friends noticed I was even touch-typing all the shift-symbols, which I had learned from writing code. Sadly, for most of my writing, it's my brain that tends to be the bottleneck.
 
So, I believe N4P is supposed to do something like 15~20 percent better than N4 on its own. Meaning, same power, you get 15% more performance and density. This is totally just ballparking things, so I may have some numbers wrong but let's take things in parts.

AD102 is 609mm^2 with 76.3 billion transistors while GB202 is rumored to be 744mm^2. That's a 22% larger die, and if we also assume 15% higher transistor density, that ends up being around 40% more transistors in total.
Here's what TSMC said about N4P vs. N5:

"N4P will deliver an 11% performance boost over the original N5 technology and a 6% boost over N4. Compared to N5, N4P will also deliver a 22% improvement in power efficiency as well as a 6% improvement in transistor density."

Source: https://pr.tsmc.com/english/news/2874

We know the "4N" node, used by Ada, was already improved over regular N5, though I'm not sure they ever said by how much. To that end, I think the node actually used by Blackwell is "4NP", which presumably has some improvements over baseline N4P. Anyway, I'd ballpark this using their N5 -> N4P numbers, with the caveat that it might actually overestimate the improvements.

So, compounding the 6% density improvement with the 22% area increase would yield a 29.3% increase in the number of transistors, which is basically the 1.3x I mentioned in reply to @TeamRed2024 . Scaling that by anywhere between 6% and 11% will get you between 37% and 44% more theoretical performance, assuming the ratio of hardware blocks stays the same and they don't decide to burn some of the die area on disproportionately beefing up one or two specific areas (as they famously did with Turing, the last time we saw a major die size increase).
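Spelling out that compounding (same caveats as above: 744mm^2 is a rumor, and I'm using TSMC's N5 -> N4P deltas as a stand-in for whatever 4N -> 4NP actually delivers):

```python
# Naive extrapolation from TSMC's published N5 -> N4P numbers (stand-in for 4N -> 4NP).
area_ratio     = 744 / 609    # rumored GB202 die area vs. AD102, ~1.22
density_gain   = 1.06         # TSMC: +6% transistor density
perf_gain_low  = 1.06         # if 4N was already about as good as N4
perf_gain_high = 1.11         # if 4N was closer to plain N5

t_ratio = area_ratio * density_gain   # ~1.29, i.e. roughly 30% more transistors
print(f"Transistors: +{t_ratio - 1:.0%}")
print(f"Theoretical perf: +{t_ratio * perf_gain_low - 1:.0%} to +{t_ratio * perf_gain_high - 1:.0%}")
# -> roughly +37% to +44%, at ~22% higher power, before any architectural changes
```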

N4P is more efficient as well, so potentially a 40~50% boost in total performance with higher power use.
Nope, you're double-counting, now! TSMC's performance and efficiency figures tell you either how much more performance at the same power, or how much less power at the same performance. Both of these figures assume the exact same design is translated over to the new process node, which I think is a reasonably safe bet for Blackwell (assuming we're talking about performance per unit of area).

Undoubtedly, the large L2 cache will continue. Will it be improved? Maybe, which could mean even higher effective bandwidth.
My numbers assume they will scale it in proportion to everything else. If you assume it'll get even bigger, then you need to deduct some from the theoretical compute estimate, to compensate.

Nvidia tends to keep things pretty balanced on the memory bandwidth improvements, so if it actually boosts memory bandwidth by 78% (or more), I suspect it also thinks there are some architectural improvements that make it so the GPU cores need the additional bandwidth.
Don't forget that Nvidia uses these same GPU dies for inferencing, in server-based products. For instance, the AD102 shows up in their L40 accelerator cards. So, it's possible some of the specs might be aimed more at AI use cases than client rendering workloads. I think it'll be very telling to see what they do with memory clocks, particularly if they indeed go with a 512-bit interface.

maybe the RT side of things needs the big boost in bandwidth more than the rasterization needs it? I don't know.
Between the RTX 4070 Ti, the RTX 4080, and the Supers, don't we have enough data to say? I think the answer is probably there and someone just needs to dig it out.

Again, my numbers from naively extrapolating are 37% to 44% faster at native rendering, on GPU-bottlenecked cases. Also, that assumes 22% more power, which is basically how more area translates directly into more throughput. If the rumors about 512-bit GDDR7 are correct, the growth in memory power will be even greater than that number. So, if power doesn't increase by more than 22%, then my numbers are probably going to be an overestimate. They're probably an overestimate anyhow, because I think we rarely see real-world performance improve as much as the specs on paper.
 
P.S. I'm both a former piano player and learned to touch type in school (I forget if it was 7th or 8th grade). I remember when one of my high-school friends noticed I was even touch-typing all the shift-symbols, which I had learned from writing code. Sadly, for most of my writing, it's my brain that tends to be the bottleneck.
My primary school in Berlin experimented with a syllable-based technique to teach reading and writing, instead of spelling things out letter by letter.

It wasn't long-lived, because it evidently caused terrible issues with orthography, and even in my family I was the only one taught that way.

But a positive side effect was that it basically turned me into a speed reader.

Even late in my school career I noticed that when everyone was given a text to read, I was typically finished in about 50% of the time it took the next fastest person, and perhaps 10-20% of what the bulk of people needed. Rather than sit there bored, I'd have re-read it two or three times by then and formed an opinion.

And I didn't particularly try to be fast nor make it a sport.

My typing was greatly aided by my terrible handwriting: I hated it, and my teachers hated it, too.

Speed reading and fast typing came from two traits that could have turned into serious handicaps but instead became significant advantages, mostly thanks to technology and computer keyboards being important for a few decades.

None of that helps with video, where I feel like a luddite in the presence of my kids.
 
Here's what TSMC said about N4P vs. N5:
"N4P will deliver an 11% performance boost over the original N5 technology and a 6% boost over N4. Compared to N5, N4P will also deliver a 22% improvement in power efficiency as well as a 6% improvement in transistor density."​
We know the "4N" node, used by Ada, was already improved over regular N5, though I'm not sure they ever said by how much. To that end, I think the node actually used by Blackwell is "4NP", which presumably has some improvements over baseline N4P. Anyway, I'd ballpark this using their N5 -> N4P numbers, with the caveat that it might actually overestimate the improvements.
Yeah, I said 4N and N4P, which obviously isn't quite right. I'm not sure we have precise statements from TSMC or Nvidia about how much better 4N really is compared to N5 or any other nodes, and the same goes for 4NP. Again, I'm just ballparking and putting out some thoughts here, not trying to be 100% accurate because we absolutely do not know what architectural changes might be happening, and thus could easily be off by 10~20 percent.
Nope, you're double-counting, now! TSMC's performance and efficiency figures tell you either how much more performance at the same power, or how much less power at the same performance. Both of these figures assume the exact same design is translated over to the new process node, which I think is a reasonably safe bet for Blackwell (assuming we're talking about performance per unit of area).
Not really. As you say, it's higher perf at same power, or lower power at same perf. I'm saying we'll get even higher performance while using even more power. If 5090 was 450W, it would get a modest performance bump. Adding transistors increases power, but not linearly, and the voltage-frequency curve also matters.

RTX 4090 has 450W TGP, but it rarely hits that, even in GPU limited situations, and power will scale with voltage limits and other stuff. So, if we have about 30% more transistors, running at up to 33% higher total TGP? It's not going to be just ~30% more performance. It will compound, to some extent. How much remains to be seen.

I think 1.3 * 1.3 is definitely the high water mark, assuming it really has 30% more transistors, so that would be up to ~70% higher performance. But that's in a hypothetical case where power was a major limiting factor in performance on the 4090, which as noted is rare.

Realistically, with architectural updates and everything else, I'm still betting on around 50% more performance than the 4090, in workloads that aren't CPU limited. Again, the massive increase in memory bandwidth with a 512-bit interface just hints that there's more behind the scenes than might be immediately obvious. Nvidia isn't dumb and it wouldn't stick a 512-bit interface with GDDR7 onto GB202 if it wasn't beneficial.

Granted, it does need a wider interface just to increase capacity, at least for the professional/data center markets. And we don't absolutely know it's GDDR7 memory or how fast that memory is clocked. Maybe we'll get something like 24 Gbps GDDR7, which would only be 52% more bandwidth than 21 Gbps GDDR6X on a 384-bit interface.
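Just to put numbers on those scenarios (every one of these memory clocks is hypothetical until Nvidia actually announces something):

```python
# Hypothetical 5090 memory configs vs. the 4090's 384-bit GDDR6X at 21 Gbps (1008 GB/s).
baseline = 384 * 21 / 8
for label, bus_bits, gbps in [
    ("512-bit @ 24 Gbps GDDR7", 512, 24),
    ("512-bit @ 28 Gbps GDDR7", 512, 28),
    ("512-bit @ 32 Gbps GDDR7", 512, 32),
]:
    bw = bus_bits * gbps / 8
    print(f"{label}: {bw:.0f} GB/s ({bw / baseline - 1:+.0%})")
# -> +52%, +78%, and +103% respectively
```

(And on the compute side, that 1.3 * 1.3 "high water mark" works out to about 1.69x, so roughly 70%, as stated.)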

And this is why I was very loose and "spitballing" things above. There's just too much we don't know yet. No one outside of TSMC and Nvidia would know. So we all make tons of guesses and assumptions that will inevitably prove to be off the mark. Gen on gen, though, 30~50 percent seems the safe bet. That's what most Nvidia GPU architectures do... though that's looking at "comparable" parts, and lately Nvidia has done screwy things where you get fewer cores and higher clocks, maybe less memory as well, etc.

I'll say this right now: 8GB should be dead on anything over $300. There's no reason to gimp performance that way. It has come back to haunt the RTX 4060 and 4060 Ti, as well as the RX 7600. Just put that weaksauce to bed and let's see 12GB and more as the standard for add in graphics cards!
 
Again, I'm just ballparking and putting out some thoughts here, not trying to be 100% accurate because we absolutely do not know what architectural changes might be happening, and thus could easily be off by 10~20 percent.
Same. However, when you're talking about something as concrete as density, it's definitely not going to be greater than the 6% improvement they quoted between N5 and N4P. Likewise, the performance deltas they quoted between the two are hard upper limits.

Not really. As you say, it's higher perf at same power, or lower power at same perf. I'm saying we'll get even higher performance while using even more power.
Their statements were assuming you take the exact same design and merely port it from one node to the next. So, I was taking the figures as essentially transistor-normalized estimates (i.e. assuming the same transistor count). If you're running at the iso-power point in the curve, but you also increase the number of transistors by 29%, then you're automatically using 29% more power.

Adding transistors increases power, but not linearly,
If you run them at the same frequency, it basically would. The only way you get sub-linear power scaling is if you're building in some assumptions about lower utilization. However, I'm not really concerned about low-utilization games.

the voltage-frequency curve also matters.
I believe that's baked into TSMC's power/performance numbers. So, if we start by taking their iso-power data point and treating it as a per-transistor figure, then we get to use their V/F assumptions.

if we have about 30% more transistors, running at up to 33% higher total TGP? It's not going to be just ~30% more performance. It will compound, to some extent.
I accounted for that by multiplying that 1.3 figure by either 1.06 or 1.11, which depends on whether 4N was closer to N5 or N4.

I think 1.3 * 1.3 is definitely the high water mark,
It's a fantasy, unless you think they're really going to go nuts with power. These numbers must be based on something, and you can't tell me where you're getting 30% more perf/W. The part I quoted gave us clear guidance on how much more efficient N4P is. Even the low end of the 1.06 to 1.11 range might be an overestimate when using 4N as a baseline.

Nvidia isn't dumb and it wouldn't stick a 512-bit interface with GDDR7 onto GB202 if it wasn't beneficial. Granted, it does need a wider interface just to increase capacity, at least for the professional/data center markets.
I think it's mainly about capacity and maybe some AI use cases. Especially when you consider how the RTX 6000 ADA and L40 are using a lower memory clock, you have to assume the Blackwell equivalents will also be using lower-clocked memory. So, for running AI workloads at a lower memory clock, maybe the wider datapath is justified.

And this is why I was very loose and "spitballing" things above. There's just too much we don't know yet.
The thing I feel most confident about saying is that the perf/$ won't be worse. So, we can look at the launch price rumors to get an idea about the lower end of the performance increase.
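To illustrate, with a purely hypothetical launch price (the $1,999 below is a placeholder, not a leak): if perf/$ merely holds steady, the price ratio sets a floor on the performance uplift.

```python
# Hypothetical illustration: perf/$ no worse than the 4090 puts a floor under the uplift.
msrp_4090 = 1599                 # RTX 4090 launch MSRP
rumored_5090 = 1999              # placeholder; substitute whichever price rumor you believe
print(f"Perf/$ parity implies at least +{rumored_5090 / msrp_4090 - 1:.0%} performance")  # ~+25%
```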

Gen on gen, though, 30~50 percent seems the safe bet.
In my prediction of 37% to 44%, I feel more comfortable with the lower end of that range. As I said, it depends a lot on what they do with power. Will they really push TGP 30% higher? How much more of that power budget is memory going to take? If the memory needs more than a 30% increase, that will leave less for the GPU, itself. That will cut into the range I stated.

That's what most Nvidia GPU architectures do...
Remind me of the RTX 2080 Ti's improvement vs. GTX 1080 Ti, again?
 
A person looking for a 4090 likely wants the absolute best in BOTH of the following: 1) overall performance (in gaming, that includes latency, not just high and smooth framerates) and 2) high-end features beyond rasterization, like ray tracing, upscaling, video encoding, and AI inference performance.

That likely rules out anything Intel or AMD will offer next gen, period. Yes, Intel has shown quite decent AI and RT performance in Arc and will likely continue with Battlemage, but the only SKUs on offer will likely be up to the 70-class at best. Even if Intel does offer a 90-class competitor eventually, it will not likely be able to compete with a 4090. AMD is also unlikely because, as the article explains, it isn't even certain they will top the 7900 XTX with their next GPUs.

It also probably rules out anything Nvidia will offer in its next lineup besides the 5080 and up, outside of potential new features for the RTX 50 series generally. This is an example of a clickbait headline for an otherwise reasonable and thoughtful article. The argument stated here makes perfect sense for precisely everyone EXCEPT the 4090 buyer! It's EXACTLY THOSE PEOPLE WHO WANT A 4090 RIGHT NOW for whom the argument in this article doesn't really hold, because the only GPU that will beat it will likely be the 5090, which will probably cost more and potentially be less available to purchase when it first arrives. Even CES is months away, and we don't know for certain that the 5090 will be available immediately upon announcement.
 
The argument stated here makes perfect sense for precisely everyone EXCEPT the 4090 buyer! It's EXACTLY THOSE PEOPLE WHO WANT A 4090 RIGHT NOW for whom the argument in this article doesn't really hold, because the only GPU that will beat it will likely be the 5090, which will probably cost more and potentially be less available to purchase when it first arrives. Even CES is months away, and we don't know for certain that the 5090 will be available immediately upon announcement.
This is basically my position. There are lots of unknowns about the RTX 5090, but you can be reasonably sure of strong demand at launch and that it will be even more expensive. In terms of perf/$, I think it should be an improvement, but maybe not by a lot.

So, if the RTX 4090 is already at the top of your price range or you want something soon and aren't willing to pay ridiculous scalper premiums, then I think there's a decent argument to be made for the RTX 4090. When a good opportunity to upgrade comes along, I assume they will hold their resale value reasonably well, especially if I'm right about the RTX 5090 not offering much better perf/$.
 
In my prediction of 37% to 44%, I feel more comfortable with the lower end of that range. As I said, it depends a lot on what they do with power. Will they really push TGP 30% higher? How much more of that power budget is memory going to take? If the memory needs more than a 30% increase, that will leave less for the GPU, itself. That will cut into the range I stated.


Remind me of the RTX 2080 Ti's improvement vs. GTX 1080 Ti, again?
It DEPENDS! Lol. I completely hate the way you guys are framing this discussion; it's hardly any better than a YouTube comment section full of AMD fanboys, sorry to say.

When you say "performance" what exactly are we talking about!? RT? AI? Plain old raster? And if we ARE talking plain old raster, are we including 1% lows, frame latency, or power draw? Talking about "performance" as if its all one thing is AMD-biased BS that completely ignores the real reasons why people buy new GPUs, and conveniently bypasses Nvidia's strengths, especially in the RTX generations.

Let's take the 1080 Ti vs. the 2080 Ti, since you brought it up. The 2080 Ti COMPLETELY DESTROYS the 1080 Ti in RT performance, especially path tracing, and EXTRA ESPECIALLY in real-world use cases because the 1080 Ti cannot use DLSS! In a game with heavy RT features, a 1080 Ti would be basically UNUSABLE, while a 2080 Ti would still provide a decent 60 fps at 1440p to this very day. So right there the 2080 Ti is providing a MASSIVE GENERATIONAL LEAP over the 1080 Ti, regardless of what the silly TechPowerUp general benchmarks would say.

Everything I just wrote in the paragraph above would apply even more so to AI, where a 1080 Ti would be almost worthless. People don't generally buy a new GPU every generation, comparing the minute performance uplift from two years ago! Most people buy new GPUs more than one generation apart; they expect performance increases, but MORE IMPORTANT are the transformational FEATURE IMPROVEMENTS from gen to gen, like frame generation and DLSS in general! The way techtubers and tech experts on social media talk about GPUs is significantly disconnected from the real reasons why people buy GPUs in the real world!