News Spitballing Nvidia's GB202 GPU die manufacturing costs — die could cost as little as $290 to make

Remember, that's TSMC's cost per wafer. It doesn't include dicing the wafer into dies, testing, binning, packaging, testing again, etc., or the rest of the card (more components, more packaging, more testing, HSF, etc.).
It's sort of like estimating a car's manufacturing cost from the engine block casting cost - there's a positive correlation, but not much more.
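
For anyone wondering how a headline figure like $290 gets spitballed in the first place, here's a minimal sketch of the usual wafer-to-die arithmetic: wafer price divided by good dies per wafer, with a simple yield model. The wafer price and defect density below are illustrative placeholders, not NVIDIA's or TSMC's actual numbers, and real costing also credits partially defective dies that get sold as cut-down SKUs.

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Standard approximation for whole dies on a round wafer,
    with a correction term for partial dies lost at the edge."""
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius ** 2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies expected to have zero killer defects (simple Poisson model)."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

# Hypothetical inputs -- NOT actual TSMC pricing or defect data.
wafer_cost_usd = 16_000   # placeholder price for one 300 mm 4NP-class wafer
die_area_mm2   = 750      # roughly GB202-sized die
defect_density = 0.0005   # placeholder defects per mm^2

candidates = dies_per_wafer(300, die_area_mm2)
good_dies = candidates * poisson_yield(die_area_mm2, defect_density)
print(f"{candidates} candidate dies, ~{good_dies:.0f} fully good")
print(f"~${wafer_cost_usd / good_dies:.0f} per fully good die")
```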
 
Remember, that's TSMC's cost per wafer. It doesn't include dicing the wafer into dies, testing, binning, packaging, testing again, etc., or the rest of the card (more components, more packaging, more testing, HSF, etc.).
It's sort of like estimating a car's manufacturing cost from the engine block casting cost - there's a positive correlation, but not much more.
Yeah, probably should have mentioned all the additional cost points much sooner in the article, then gone in-depth on the die cost calculations later.

That said, those big dies probably aren't quite as expensive as some of us were imagining. I had a number more like $400-$450 in my head. It really is hard to say, since NVIDIA's contract pricing on 4NP and most nodes is unknown (NDA/proprietary contract protections).
 
so that's $290 plus roughly $100-$200 million in research and development. 2 grand seems like a steal. *sarcasm*
All that R&D, plus they threw everything at it including the kitchen sink to barely get ~30% generational gains, means the margins are even higher than usual. They already recouped the R&D cost a thousand times over!
How do you guarantee the public pays for such an atypical margin ask? You create FUD around scarcity and throttle all stock to maximize profit per piece of silicon sold. Nvidia tactics 101!
 
The thing is, you can calculate the difference in cost at TSMC between the 4090 and 5090. Just put both ICs into the costing calculator and subtract the smaller from the larger. Then add a markup for the extra power delivery, the extra 8 GB of GDDR7, and the profit margins for Nvidia (61%, 2021), AIBs (12%), and retailers (5%) to get the resulting price increase over the 4090. The cost of dicing and wiring up the chip should be included in that.

I did the same calculation with the 9070 XT and figured it should cost about $100 more (at retail) than a 7800 XT, if yields are 90% for both dies.
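
Here's a minimal sketch of the margin-stacking part of that calculation, assuming each party's percentage is a gross margin on its selling price (so each step divides by 1 minus the margin). The $80 incremental cost is a made-up placeholder; the margin percentages are just the ones quoted above, not verified figures.

```python
def retail_price_delta(cost_delta: float, margins: list[float]) -> float:
    """Propagate an incremental manufacturing cost through a chain of gross margins.

    Each party prices so its margin is a fraction of its selling price,
    i.e. selling_price = cost / (1 - margin).
    """
    price = cost_delta
    for margin in margins:
        price /= (1 - margin)
    return price

# Placeholder: extra silicon + 8 GB more GDDR7 + beefier power delivery (illustrative only).
incremental_cost = 80.0

# Margins quoted in the post: Nvidia 61%, AIB 12%, retailer 5%.
chain = [0.61, 0.12, 0.05]

print(f"Retail price increase: ~${retail_price_delta(incremental_cost, chain):.0f}")
```

Whether those percentages are treated as gross margin on price or as markup on cost changes the result quite a bit, which is one reason these back-of-envelope estimates vary so much.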
 
Partner card MSRPs tell us everything we need to know about the margins Nvidia is making on these products. The lowest MSRP on any partner card known so far is $2,200 (that may change with the official launch, but with limited availability I assume stores are going to happily take a premium). There's no doubt that the Nvidia FE cooling and board design cost more money than the partner ones, but the FE still carries a $2,000 MSRP.

It also sounds like GDDR7 availability isn't great right now, so AIBs are getting memory bundled with the GPUs from Nvidia.

It's extremely unlikely that any material cost explains the $400 difference between the 4090 and 5090. The board partner MSRP difference is also higher than we've seen in the past. All of this really seems to indicate that Nvidia wants even higher margins than it got with the 40 series. They're able to do this due to their market dominance, of course, but it seems very unhealthy for the industry as a whole.

For as much as people love to complain about the ~8 years of Intel quad cores, Intel never really raised the price point (i7-920/860 ~$284, i7-7700K ~$339). In the graphics space everyone's being told to pay more (and sometimes to get less), which is just never sustainable and something I'm not sure any of the modern companies understand.
 
so that's $290 plus
That's just the wafer cost. You didn't even test, cut, shave, or mount the dies. Not to mention adding GDDR7, VRM, PCB, thermal solution, and other components. Add assembly, testing, and packaging. I'd be surprised if we're not up in the $500 to $600 range.
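
Purely to show the shape of that build-up, here's a rough sketch; every line item below is an illustrative placeholder, not an actual bill-of-materials figure.

```python
# All values are hypothetical placeholders for illustration -- not a real BOM.
bill_of_materials = {
    "GB202 die (wafer share)":       290,
    "test / dice / package the die":  40,
    "32 GB GDDR7":                   100,
    "PCB + VRM + other components":   80,
    "thermal solution":               40,
    "assembly, test, box":            40,
}

total = sum(bill_of_materials.values())
print(f"Illustrative build cost: ${total}")  # lands in the ballpark the post suggests
```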

roughly $100-$200 million in research and development. 2 grand seems like a steal. *sarcasm*
Really? Jensen can't scratch his butt for less than a couple mil. You're off by several billion!

According to this, RTX 4000 cost $5.2 billion to develop. Blackwell (it's not clear whether they're distinguishing between server and client or including both) was rumored to cost $9 billion.

Then there's software. Not just drivers, but all the tools and libraries they provide to game devs. Not to mention development of things like the DLSS models. On top of that, I don't even know how CUDA development gets funded.
 
All that R&D, plus they threw everything at it including the kitchen sink to barely get ~30% generational gains, means the margins are even higher than usual.
The lackluster gains are mostly because it's on the same node as Ada, whose dies were already huge. Unlike in previous cases where they stayed on a node for more than one generation, the large dies they used in RTX 4000 didn't leave them much headroom to make them bigger.

How do you guarantee the public pays for such an atypical margin ask? You create FUD around scarcity and throttle all stock to maximize profit per piece of silicon sold.
Except you forgot that the RTX 4090 launched at $1,600. It had no trouble maintaining that price floor, even at times when cards weren't scarce.

Nvidia doesn't need to play games with perceived scarcity, simply because they have a premium solution that elite gamers and AI bros demand. I'm sure if they could get more units on store shelves, they would.
 
The lackluster gains are mostly because it's on the same node as Ada, whose dies were already huge. Unlike in previous cases where they stayed on a node for more than one generation, the large dies they used in RTX 4000 didn't leave them much headroom to make them bigger.


Except you forgot that the RTX 4090 launched at $1,600. It had no trouble maintaining that price floor, even at times when cards weren't scarce.

Nvidia doesn't need to play games with perceived scarcity, simply because they have a premium solution that elite gamers and AI bros demand. I'm sure if they could get more units on store shelves, they would.
View: https://youtu.be/wMd2WHKnceI?si=C0BwcLwnGSS_6Q6O


Gamers Nexus' take on it. It seems there were as few as 350 5090s for the whole USA, FYI.
 
Then why did they hype it so hard? From the looks of it, tech reviewers got more stock than the entire customer allocation.
They hyped it because it's a launch. When you launch a new GPU, CPU, whatever... that's newsworthy. It causes lots of articles and videos to be produced that will support demand for the product over the rest of its life. If they launched it quietly, then waited to try and hype it after more inventory had built up, they wouldn't have been able to get as much publicity and mindshare for it.

Basically, the only other option they had was simply to delay the launch. I guess they decided it was better to go ahead with it and sustain all the publicity and buzz from the CES announcements.

AMD is doing the opposite, but I'm pretty sure that's not about building up inventory and is just because their drivers (or FSR 4, more specifically) aren't ready and the reviews wouldn't be good if it launched right now.
 
They hyped it because it's a launch. When you launch a new GPU, CPU, whatever... that's newsworthy. It causes lots of articles and videos to be produced that will support demand for the product over the rest of its life. If they launched it quietly, then waited to try and hype it after more inventory had built up, they wouldn't have been able to get as much publicity and mindshare for it.

Basically, the only other option they had was simply to delay the launch. I guess they decided it was better to go ahead with it and sustain all the publicity and buzz from the CES announcements.

AMD is doing the opposite, but I'm pretty sure that's not about building up inventory and is just because their drivers (or FSR 4, more specifically) aren't ready and the reviews wouldn't be good if it launched right now.
Wasn't the point of keeping Blackwell on a similar node to have higher yields, not lower? So not only are we 26 months out from the 4000 series launch, we have less stock. That's an atypically long generational wait, scarcer supply, and lower generational gains, with even higher margins.

AMD is complicit in this shyte show!
 
Wasn't the point of keeping Blackwell on a similar node to have higher yields, not lower?
I've seen speculation that the limiting factor is the availability of GDDR7.

AMD is complicit in this shyte show!
AMD has pretty low market share in gaming dGPUs. Their datacenter products are making them virtually all of their profits. I'm just glad they seem committed to staying in this space.

I wish they hadn't cancelled their flagship RDNA4 GPU, but they really can't afford another underwhelming showing like they had with RDNA3. So, maybe that was the right call. We just have to wait and hope that UDNA is better.
 
The lackluster gains are mostly because it's on the same node as Ada, whose dies were already huge.
I don't know that we can actually give them a pass like that, since there appears to be virtually no IPC gain and this isn't the first time they've stayed on a node. Without knowing the ins and outs of the design, it seems like they predominantly emphasized what would benefit AI.

The way the 5080 scales with overclocking makes me wonder if they were just hedging their bets on what the market would look like. Of course, this also leaves them more room for refreshes down the road, which makes sense given the current time between architectures.
 
I don't know that we can actually give them a pass like that, since there appears to be virtually no IPC gain and this isn't the first time they've stayed on a node.
Eh, don't give them a pass, if you don't want to.

SER (Shader Execution Reordering) looks like it could deliver solid gains, but requires code modifications.

Some things that have benefited them in the past are tiled rendering and texture compression. The former is one reason for Maxwell's standout performance, in spite of it being their 3rd generation on 28 nm and not a lot bigger than its predecessor. Improved texture compression was a big reason for Pascal's gains over Maxwell.

This time, apart from ray tracing, there isn't low-hanging fruit like that. In terms of their CUDA cores, there aren't a lot of ways they can improve IPC without blowing up die area or power. GPUs are mainly about going wider, not deeper.

The way the 5080 scales with overclocking makes me wonder if they were just hedging their bets on what the market would look like. Of course, this also leaves them more room for refreshes down the road, which makes sense given the current time between architectures.
GDDR7 should give it bandwidth to spare. As for why it's not clocked higher, out of the box, I'd guess that probably had more to do with the TDP they were targeting for this market segment.
 
GDDR7 should give it bandwidth to spare. As for why it's not clocked higher, out of the box, I'd guess that probably had more to do with the TDP they were targeting for this market segment.
It can clock up a fair bit without increasing the power budget, which means they were just being conservative. Given the complete lack of competition, it makes sense.
Eh, don't give them a pass, if you don't want to.

SER (Shader Execution Reordering) looks like it could deliver solid gains, but requires code modifications.

Some things that have benefited them in the past are tiled rendering and texture compression. The former is one reason for Maxwell's standout performance, in spite of it being their 3rd generation on 28 nm and not a lot bigger than its predecessor. Improved texture compression was a big reason for Pascal's gains over Maxwell.

This time, apart from ray tracing, there isn't low-hanging fruit like that. In terms of their CUDA cores, there aren't a lot of ways they can improve IPC without blowing up die area or power. GPUs are mainly about going wider, not deeper.
I understand returns are harder to come by, but that hardly means they couldn't. The RT performance didn't scale particularly differently from raster either, despite the new RT cores.

It just seems pretty obvious that they made the decision to optimize the architecture for the market that has made them a multi-trillion-dollar company rather than for the consumer market. This absolutely makes sense from a company perspective, but it doesn't make it any less terrible from a consumer/tech-enthusiast angle.

I don't really see it changing either, unless AMD/Intel actually make a real market play. Even then, Nvidia just has to price aggressively, which is still a win for consumers.
 
I understand returns are harder to come by, but that hardly means they couldn't.
Really? Like we saw with Rocket Lake, a microarchitecture is pretty well optimized for a process node. Nvidia has the added problem that they can't afford to burn a lot more die area on a new microarchitecture, because that would mean fewer CUDA cores and there's no way IPC is going to increase enough to compensate for that.

IMO, they're pretty much stuck with Ada's microarchitecture, until they move to a smaller process node.
 
Really? Like we saw with Rocket Lake, a microarchitecture is pretty well optimized for a process node.
I understand the point you're trying to make, but backporting a design that was made for a more advanced node isn't comparable to a native design.
Nvidia has the added problem that they can't afford to burn a lot more die area on a new microarchitecture, because that would mean fewer CUDA cores and there's no way IPC is going to increase enough to compensate for that.
That all depends on the design and whether or not clock scaling can be maintained. I don't think there's a realistic ROI on spending the time and resources to make it work, though, because it wouldn't benefit the vast majority of their revenue stream.
IMO, they're pretty much stuck with Ada's microarchitecture, until they move to a smaller process node.
From a technical standpoint I completely disagree, but from a realistic standpoint it would be foolish for them to do otherwise.
 
Lol, this isn't looking good for Nvidia
View: https://youtu.be/J72Gfh5mfTk?si=wt19DjDh9ceR_Iwm

Also, the 5090 isn't worlds better in AI compute either.
NVIDIA GeForce RTX 5090 Founders Edition Review - The New Flagship - GPU Compute | TechPowerUp https://search.app/MeCEc2zuxnmkuHY89
If the rate-limiting step of supply is the GDDR7 VRAM, then why did they go with 32 GB and not 24 GB? For every one 32 GB 5090, they could have sold 1.5x 24 GB 5090s. Although that would only put the supply at around 500 units for the whole USA. Also, aren't there fabs out there begging for business with better nodes? With all that cash on hand they could have, IMO, diversified their fab suppliers. GDDR7 being the culprit does make some sense, though.
 
GDDR7 should give it bandwidth to spare. As for why it's not clocked higher, out of the box, I'd guess that probably had more to do with the TDP they were targeting for this market segment.
When you're paper-launching a product just to have a token MSRP on the books, and to make the AIB cards that much more attractive with their higher OCs and better acoustics, that's probably the reason they clocked it the way they did.
 
If the rate-limiting step of supply is the GDDR7 VRAM, then why did they go with 32 GB and not 24 GB? For every one 32 GB 5090, they could have sold 1.5x 24 GB 5090s.
These decisions were all probably made long before whatever issues occurred that are limiting the GDDR7 supply. I'm sure that situation will be rectified rather quickly, but (if it is responsible for the card shortage) it's certainly not good for the launch.

As for your arithmetic, the correct multiple is 1.333 24 GB cards instead of one 32 GB card. Remove 8 GB from a 32 GB card. Do it a total of three times and you have enough GDDR7 to make another 24 GB card.
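
Spelled out, the arithmetic behind that 1.333 figure:

$$3 \times 32\ \text{GB} = 96\ \text{GB} = 4 \times 24\ \text{GB} \quad\Longrightarrow\quad \frac{32\ \text{GB}}{24\ \text{GB}} = \frac{4}{3} \approx 1.333$$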

With all that cash on hand they could have, IMO, diversified their fab suppliers.
There are rumors that Nvidia is getting back with Samsung's Foundry. Won't affect RTX 5000 products, but future ones.
 
These decisions were all probably made long before whatever issues occurred that are limiting the GDDR7 supply. I'm sure that situation will be rectified rather quickly, but (if it is responsible for the card shortage) it's certainly not good for the launch.

As for your arithmetic, the correct multiple is 1.333 24 GB cards instead of one 32 GB card. Remove 8 GB from a 32 GB card. Do it a total of three times and you have enough GDDR7 to make another 24 GB card.


There are rumors that Nvidia is getting back with Samsung's Foundry. Won't affect RTX 5000 products, but future ones.
Lol without coffee Nvidia has me doing some multi frame generation math.
 