News Nvidia addresses significant Blackwell yield issues, production ramps in Q4

Status
Not open for further replies.
I think he means that if not all transistors are working, see if they can be used in GPUs.

Won't happen, two different use cases / architectures.
Blackwell is equivalent to Ada and is being used in both gaming and data center. They may do another data-center-only architecture like Hopper, but it’s not Blackwell. Data centers that aren’t focused on AI had no motivation to select Hopper over Ada, as it basically only had advantages in AI. If you have traditional computing to do, you need general-purpose ALUs, not fixed-function matrix math accelerators. The H100 roughly tripled the price of getting 17000-18000 CUDA cores compared to an a100 ada ($7000 vs $20,000). The AI fad chasers happily paid up for the extra matrix accelerators though.
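For anyone who wants to sanity-check the "tripled" figure, here is a quick back-of-the-envelope sketch. It only uses the rough numbers quoted in the post (the prices and the core count are the poster's estimates, not verified list prices), and the midpoint of 17,500 cores is my own assumption.

```python
# Rough figures quoted in the post above; assumptions, not verified prices.
CUDA_CORES = 17_500        # assumed midpoint of the quoted 17000-18000 range
PRICE_ADA_USD = 7_000      # quoted price for the Ada part
PRICE_H100_USD = 20_000    # quoted price for the H100


def usd_per_core(price_usd: float, cores: int) -> float:
    """Price paid per CUDA core."""
    return price_usd / cores


ada = usd_per_core(PRICE_ADA_USD, CUDA_CORES)
h100 = usd_per_core(PRICE_H100_USD, CUDA_CORES)
ratio = h100 / ada

print(f"Ada:   ${ada:.2f} per CUDA core")
print(f"H100:  ${h100:.2f} per CUDA core")
print(f"Ratio: {ratio:.2f}x")
```

With those inputs the ratio comes out a bit under 3x, so "roughly tripled" is a fair summary of the quoted prices.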
 
All I am hearing: low Blackwell yields -> cut-down Blackwell gaming GPUs (a man can hope!)
The low yields they are talking about are for the datacenter parts, which would not be downbinned as gaming parts. They are not the same. We are still waiting for any kind of news on the gaming parts.
 
Bingo. The chances of the Blackwell dies for consumer cards using TSMC's not-EMIB process at all are slim to none. They will be good ol' monolithic dies with offboard GDDR as usual; complex packaging would shove the price up far more than any potential savings from modular dies.

I'm guessing Nvidia are probably none too happy about being the guinea pigs for CoWoS-L rather than sticking with CoWoS-S, particularly as at the start of this year they were looking at Intel's fabs for packaging, and Intel have years of experience with EMIB. I bet there are some behind-closed-doors 'well, we could have told ya that would happen's going on.
 
They had to go to the newer CoWoS to get the capacity at all. The old CoWoS still hasn’t built out enough capacity because they totally switched focus to the new version instead of scaling out the old version. I HIGHLY doubt Intel’s solution offered competitive bandwidth or they would’ve gone for the cheaper prices and larger available capacity in Intel packaging.
 
Even the 4090 is not a full die; it is something like 12% cut down from the full Ada die.
What the original commenter meant is that if Blackwell is more cut down than Ada for the 70, 80 and 90 cards, that would hopefully mean more supply that generation and less inflated prices.
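The "something like 12%" figure is easy to check from the public core counts. A minimal sketch, assuming the commonly published numbers of 18,432 CUDA cores for a full AD102 die and 16,384 for the RTX 4090:

```python
FULL_AD102_CORES = 18_432   # 144 SMs x 128 CUDA cores per SM
RTX_4090_CORES = 16_384     # 128 SMs x 128 CUDA cores per SM


def cut_down_fraction(enabled: int, full: int) -> float:
    """Fraction of the full die's cores that are disabled."""
    return 1 - enabled / full


frac = cut_down_fraction(RTX_4090_CORES, FULL_AD102_CORES)
print(f"RTX 4090 is cut down by {frac:.1%} relative to full AD102")
```

That works out to about 11%, so the estimate above is in the right ballpark.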
 
CoWoS-L is not just a 'newer version' of CoWoS-S; they're two different packaging technologies that TSMC decided to give confusingly similar names. TSMC has built out FAR more CoWoS-S capacity than CoWoS-L.

CoWoS-S is multiple chips mounted to a monolithic silicon interposer. The 'limitation' is that the interposer can only be so large, but interposer sizes long ago surpassed the reticle size limit through multiple aligned patterning steps (double-size and quadruple-size, and TSMC have already announced 6x reticle size interposers). Because it's all-silicon, there is no differential thermal expansion: the CoTE is the same for the interposer and the dies. Fanout needs to be routed through the interposer, but that also means the hosted dies do not need to handle fanout.
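To put rough numbers on those reticle multiples, here is a small sketch. It assumes the standard 26 mm x 33 mm maximum exposure field and treats "2x", "4x" and "6x" as nominal multiples of that area; actual interposer dimensions on shipping products will differ.

```python
RETICLE_W_MM = 26.0   # standard maximum exposure field width
RETICLE_H_MM = 33.0   # standard maximum exposure field height


def interposer_area_mm2(reticle_multiple: float) -> float:
    """Nominal interposer area for a given multiple of the reticle field."""
    return reticle_multiple * RETICLE_W_MM * RETICLE_H_MM


areas = {m: interposer_area_mm2(m) for m in (1, 2, 4, 6)}
for multiple, area in areas.items():
    print(f"{multiple}x reticle: ~{area:,.0f} mm^2")
```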
CoWoS-L is TSMC's totally-not-EMIB technique. Multiple dies sit on a traditional substrate, but additional dies are embedded under the substrate to handle some-but-not-all inter-die links. These embedded dies are only large enough to handle the interconnects, far smaller than the overall chip. In theory this means they are cheaper than a monolithic interposer even with multiple link dies in one chip. The penalties are that hosted dies still need to handle fanout themselves, hosted dies need to handle two different types of bonding (to the embedded dies via microbumps or similar, and to the organic interposer via solder balls), and the chip has different CoTEs across it so thermal cycling wants to flex and buckle it.
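A rough feel for the CoTE mismatch, using the free linear expansion formula delta_L = alpha * L * delta_T. The silicon value of about 2.6 ppm/K is the commonly quoted room-temperature figure; the organic value, the package span and the temperature swing are all assumptions I picked for illustration. This ignores stiffness, underfill and everything else that decides real warpage, so treat it as an order-of-magnitude sketch only.

```python
SILICON_CTE_PPM_PER_K = 2.6    # commonly quoted value for silicon
ORGANIC_CTE_PPM_PER_K = 15.0   # assumed; organic laminates vary by material
SPAN_MM = 50.0                 # hypothetical package span
DELTA_T_K = 60.0               # hypothetical idle-to-load temperature swing


def free_expansion_um(cte_ppm_per_k: float, length_mm: float, delta_t_k: float) -> float:
    """Unconstrained linear expansion, alpha * L * delta_T, in micrometres."""
    return cte_ppm_per_k * 1e-6 * length_mm * delta_t_k * 1000.0


silicon_um = free_expansion_um(SILICON_CTE_PPM_PER_K, SPAN_MM, DELTA_T_K)
organic_um = free_expansion_um(ORGANIC_CTE_PPM_PER_K, SPAN_MM, DELTA_T_K)
mismatch_um = organic_um - silicon_um

print(f"Silicon grows ~{silicon_um:.1f} um, organic ~{organic_um:.1f} um")
print(f"Mismatch the joints have to absorb: ~{mismatch_um:.1f} um")
```

Tens of micrometres of mismatch is large compared with microbump dimensions, which is why an all-silicon stack sidesteps the problem and a mixed stack has to be engineered around it.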

Intel have been shipping EMIB since 2017, so have plenty of experience in how to work around these limitations to achieve those theoretical cost savings. TSMC have not, and in this case the potential cost savings have to be weighed against the cost penalties of a delay in shipping products, shipping a small number of low-yield (and thus low-margin) products, and the up-front costs of respins and fabbing new dies to deal with packaging issues.
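To illustrate how yield and one-off respin costs eat into a theoretically cheaper package, here is a toy model. Every number in it is made up for illustration; none of them are real TSMC, Intel or Nvidia figures.

```python
def cost_per_good_unit(unit_cost: float, yield_fraction: float,
                       one_off_cost: float = 0.0, good_units: int = 1) -> float:
    """Cost per sellable unit: scrap-adjusted build cost plus amortised one-off costs."""
    if not 0 < yield_fraction <= 1:
        raise ValueError("yield_fraction must be in (0, 1]")
    return unit_cost / yield_fraction + one_off_cost / good_units


# Hypothetical: same build cost, but one process yields 90% and the other
# yields 60% and also needed a respin costing 1,000,000 spread over 100,000 units.
mature = cost_per_good_unit(100.0, 0.90)
troubled = cost_per_good_unit(100.0, 0.60, one_off_cost=1_000_000.0, good_units=100_000)

print(f"Mature process:   {mature:.2f} per good unit")
print(f"Troubled process: {troubled:.2f} per good unit")
```

The point is only the shape of the relationship: build cost divides by yield, so a nominally cheaper package can end up more expensive per good unit until yields recover.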
 
Most of everything you said doesn’t disagree with what I said. I know how the two different versions of CoWoS work, but it’s absolutely true that they stopped scaling out CoWoS-S to focus on CoWoS-L. It’s also true that if Intel actually offered competitive bandwidth on their solution, Nvidia would’ve made a pretty horrid decision to turn their back on the packaging deal Intel was offering. You’re not going to convince me that Intel offered the same latency and bandwidth key performance indicators and Nvidia said “ehh we’d rather just pay more at TSMC without getting any benefit from paying more”. There has to be a reason they picked the TSMC packaging deal at much worse prices. I assume that reason is that TSMC’s solution offered better theoretical performance when the deal was being inked.
 
Most of everything you said doesn’t disagree with what I said. I know how the two different versions of CoWoS work, but it’s absolutely true that they stopped scaling out CoWoS-S to focus on CoWoS-L.
1) They're not "two different versions of CoWoS" any more than a Kaby Lake-S die and a Strix Point HX die are 'two different versions of a CPU'. Beyond both being packaging technologies they have little in common; TSMC just likes to slap 'CoWoS' in front of everything as their particular branding.
2) TSMC are continuing to expand their packaging operations. They doubled capacity in 2024, and are building out future sites to meet the continued growth in demand.
It’s also true that if Intel actually offered competitive bandwidth
There is zero evidence beyond your unsourced assertion that EMIB bandwidth is in any way different to CoWoS-L. Both will be limited by the SerDes on the host dies and the protocols used for inter-die communication.
There has to be a reason they picked the TSMC packaging deal at much worse prices.
1) Why do you think Intel's offering was a lower bid or TSMC's was a 'worse deal'?
2) When you're already fabbing and packaging with one supplier, it is generally desirable not to switch to another supplier with radically different design rules and tapeout tools to avoid incurring additional cost and time penalties.
 