News Nvidia explains the missing ROPs — defective silicon in 0.5% of RTX 5090 and 5070 Ti GPUs

So, they're going with incompetence instead of deceptive, malicious intent. This is as expected.

These cores are validated, under NVIDIA's guidance, before they leave the fab. The missing defective ROPs would definitely show up - just as easily as they do in GPU-Z. Someone made the decision to release them as 5090s anyway.

I'm wondering if any other models (other than 5090, 5090D, 5070Ti) are affected. What if this was also a rare issue with the RTX 4090??
 
Oops, looks like some of the stockpiled 5080 Ti refresh chips being saved for next year accidentally got mixed into the 5090 batch! Seriously, this was deliberate. A penny-pinching, control-freak Nvidia did not release these chips by accident.
 
So, they're going with incompetence instead of deceptive, malicious intent. This is as expected.

These cores are validated, under NVIDIA's guidance, before they leave the fab. The missing ROPs would definitely show up - just as easily as they do in GPU-Z. Someone made the decision to release them as 5090s anyway.

I'm wondering if any other models (other than 5090, 5090D, 5070Ti) are affected. What if this was also a rare issue with the RTX 4090??
They announced that only the 5090 and 5070 Ti are affected. It sounds like something went wrong during down-binning, and they knew that the 0.5% out there is slightly defective, but "hey, it has no effect on AI, plus most users won't notice, let's just sell them as if they're all good."
 
So, they're going with incompetence instead of deceptive, malicious intent. This is as expected.

These cores are validated, under NVIDIA's guidance, before they leave the fab. The missing ROPs would definitely show up - just as easily as they do in GPU-Z. Someone made the decision to release them as 5090s anyway.

I'm wondering if any other models (other than 5090, 5090D, 5070Ti) are affected. What if this was also a rare issue with the RTX 4090??
Exactly, they made the decision to either meet the launch date and hope it goes unnoticed, get the "oops, do-over, I’m sorry" … or at worst face the inevitable class action lawsuit … which they could easily quash before it got to that point. The fact is they did not intend to waste this silicon, as they diverted some of the more lucrative AI silicon to fulfill this launch. Let’s be real: the chance of them missing such an easy check is about as reasonable as someone on an auto assembly line not noticing an engine is missing a cylinder. I mean, this is one of the key differences between gaming Blackwells and AI-only data center chips … and you don’t notice it’s not complete? These are basic automated checks. They chose to deal with the issue afterward, period.
 
If they already know what percentage of the GPUs are missing the ROPs, then that means they were tracking how many lower specced GPUs were leaving the factory.
Did any reviewers get sent one? I bet not!

Usually Nvidia loves releasing differently performing GPUs as the same model, but typically they only do it for the low-end cards.
Although since Nvidia is an AI company that only cares about AI, to them a $2000 RTX 5090 is a low-end card, especially with its razor-thin 400% profit margins.
 
So, they're going with incompetence instead of deceptive, malicious intent. This is as expected.

These cores are validated, under NVIDIA's guidance, before they leave the fab. The missing defective ROPs would definitely show up - just as easily as they do in GPU-Z. Someone made the decision to release them as 5090s anyway.

I'm wondering if any other models (other than 5090, 5090D, 5070Ti) are affected. What if this was also a rare issue with the RTX 4090??
You had me scared there for a second. I just ran GPU-Z and my 4090 does have the correct number of ROPs, i.e. 176. Not that I would have been able to do anything about it at this point.
 
Given that Nvidia supposedly knows the quantity of incorrect chips (and that it affects the 5070 Ti too), that leads me to believe it's one of two things: someone didn't do their job in QA, or someone asked "who'd notice?" and shipped it anyway.
Did any reviewers get sent one? I bet not!
TPU actually had the same model as the person making the original report on their forums and found it had the same problem. This is likely what amplified the message about missing ROPs and caused an official statement to land already.
 
Will those missing ROPs even show up in software like GPU-Z, or does it simply show the pre-defined stats linked to the GPU model identifier? If not, how would consumers even know for certain that their card has missing ROPs, besides benchmarking it repeatedly? What proof will AIBs accept to do an exchange?
They should’ve just discounted the initial batch of cards by 4-5% from the MSRP to make up for their mistakes.
 
If you don't have NVIDIA drivers installed (just using the built-in Windows drivers), GPU-Z will list the expected ROPs for that model of card.
As soon as you install NVIDIA's drivers, GPU-Z can read the actual ROPs and will list them accordingly, so everyone who owns an RTX 5090 should check theirs.
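For anyone who wants to sanity-check the number GPU-Z reports, here is a minimal Python sketch. It does not query the GPU; you type in the ROP count you read from GPU-Z. The `EXPECTED_ROPS` table and the `check_rops` helper are hypothetical names made up for this illustration, and the expected counts for the 5090 and 5070 Ti are assumptions taken from public spec sheets (only the 4090 figure of 176 comes from this thread).

```python
# Expected ROP counts per model. ASSUMPTION: the 5090 and 5070 Ti values come
# from public spec sheets, not from this thread; the RTX 4090 value (176) is
# the one quoted by a poster above.
EXPECTED_ROPS = {
    "RTX 5090": 176,
    "RTX 5070 Ti": 96,
    "RTX 4090": 176,
}


def check_rops(model: str, reported: int) -> dict:
    """Compare a ROP count read from GPU-Z against the expected spec value."""
    expected = EXPECTED_ROPS[model]
    missing = expected - reported
    return {
        "expected": expected,
        "reported": reported,
        "missing": missing,
        # Share of the ROPs that is missing, as a percentage of the spec value.
        "missing_pct": round(100 * missing / expected, 1),
        "ok": missing == 0,
    }


if __name__ == "__main__":
    # Example: a 5090 that GPU-Z shows with one group of 8 ROPs disabled.
    print(check_rops("RTX 5090", 168))
```

Under those assumed spec values, a 5090 reporting 168 ROPs is short 8 of 176, which is about 4.5% of its ROPs. That is a count of hardware units, not a measured performance loss.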
 
Nvidia is lying. They clearly knew about the missing ROPs because during testing the group of 8 ROPs were deliberately turned off in the chip's firmware. The ROPs aren't malfunctioning on the GPU, that is what would happen if they were 'missed' in testing, they have been deliberately disabled.
Someone at Nvidia made the decision to allow GPUs with up to one cluster (8) of ROPs disabled to be shipped out to their AIB partners. The worst-case scenario would be that maybe 20-30% of the users receiving defective cards would actually RMA their card. They would make more money sending the defects out than discarding or down-binning the GPUs into lesser cards. Based upon how many of these cards have been spotted in the wild, I suspect the error rate was closer to 10-15%, not basis points.
 
I agree that they knew about the defective chips before they were released to AiBs.

The thing I'm not sure about is whether NVIDIA would have to make adjustments to the firmware/BIOS of these cards due to the missing/defective ROPs. Another question is whether they actually had to physically disable those defective ROPs before the chip is ready. If either (or both) of these things are true this shows malicious intent.

I still haven't seen a response from anyone I consider technically savvy enough on the chip fabrication process to know what they're talking about...yet.
 
If it doesn't have any effect on compute and AI it's easy to see how these chips could pass through QA if their QA was to measure compute and AI performance and these chips were intended to be AI processors.

Still going to be interesting to see how many stories of "I bought a card from a scalper now I'm screwed" pop up.
 
Question: if the 5070 Ti was affected, how would the 5080 not be affected? I was under the assumption that it shares more commonality with the 5070 than the 5090.

And I love how Nvidia was basically saying 'C'mon, guys, it wasn't that big of a deal... I mean, we're gonna fix it, but we're only talking about 4% here! Everything's fiiiiine...'