News Nvidia's RTX 5090 power cables may be doomed to burn

1) If it has a manufacturing defect and goes bad on day 1, there are no visual clues of that
2) If it wears out, there are no visual clues.
If you put something together and don't stress-test its known potential weak points before calling it good, that is your loss. You don't change brake lines on a car and never test the brake lines until you get into an emergency braking situation at 80mph. You test it parked on the ground with both feet on the brake pedal and brake boost going, the most pressure those lines are ever going to see.

So what do you mean by monitoring isn't necessary? It is bloody present in 3090 Tis, and AIBs like ASUS foresaw such issues coming, so they added circuitry and a software warning to try to keep users safer. Why on earth is Nvidia not obligated to do the same for consumers?
Users are perfectly safe as-is. It is only their $2000 GPUs and what may be in direct contact with the wires that might not.

As I have written before, Nvidia ditched the 3090's 3x2 balance likely because their own engineering reports say balance issues affect so few cards overall that it isn't worth bothering with. The vast majority of balance issues are caused by bad or defective cables. Nvidia likely gets PSU/cable manufacturers to cover the bulk of RMA costs unless Nvidia's own adapters are involved, not much skin off their own back.

If AIBs want to add extra hardware to make some buyers feel better, good for them and you. Ignore Nvidia's reference design if you disapprove of Nvidia's design choices and buy one of those instead.
 
As I have written before, Nvidia ditched the 3090's 3x2 balance likely because their own engineering reports say balance issues affect so few cards overall that it isn't worth bothering with. The vast majority of balance issues are caused by bad or defective cables. Nvidia likely gets PSU/cable manufacturers to cover the bulk of RMA costs unless Nvidia's own adapters are involved, not much skin off their own back.

If AIBs want to add extra hardware to make some buyers feel better, good for them and you. Ignore Nvidia's reference design if you disapprove of Nvidia's design choices and buy one of those instead.
I genuinely would love to know your reasoning for continuing to seemingly defend this clearly flawed design choice. It's a pure capitalist screw-the-consumer move, nothing more. It doesn't benefit anything but the bottom line of giant corporations.
 
LMAO, if a $2000 product has a known weak point that is left unguarded and unfixed by the vendor, it's the vendor's fault, not the user's. It's like collecting a brand new car from a dealer and having to test the brakes yourself before going on the road. I don't know about your country, but in mine, if a car required that, the vendor would be in serious trouble.
 
It's a pure capitalist screw-the-consumer move, nothing more. It doesn't benefit anything but the bottom line of giant corporations.
I think they tried to use the cheapest connector and push it so it would fail early, especially when changing the card on an upgrade. Like they wanted the end user to buy another power supply when changing the video card.

But looking at what they did really makes you scrutinise the overall quality of their power supply and question whether it's even good to install.
 
I genuinely would love to know your reasoning for continuing to seemingly defend this clearly flawed design choice.
I have already spelled it out something like 10 times in this thread. As far as the fundamental math and specs are concerned, the connector and cables are perfectly fine as-is. Hundreds of licensed engineers at Nvidia, PCI-SIG, Intel, cable, AIBs, PSU manufacturers etc. have gone over it and signed off on their respective parts of the ecosystem.

Unless there is incontrovertible smoking-gun evidence of a problem with the design, there is no inherent problem. And for all of the investigating derB, zoid, GN, Linus, etc. have done, none have positively identified any specific issue or combination thereof because all of their attempts at "reproducing" meltdowns have to go well beyond reasonable. One of GN's first successful attempts at melting down a connector had it plugged about half-way in.
 
I have already spelled it out something like 10 times in this thread. As far as the fundamental math and specs are concerned, the connector and cables are perfectly fine as-is. Hundreds of licensed engineers at Nvidia, PCI-SIG, Intel, cable, AIBs, PSU manufacturers etc. have gone over it and signed off on their respective parts of the ecosystem.

Just because a group of people decided one thing doesn't mean it's correct.

For example:
People bought into the WHO's weaponized medicine movement that ended up wrecking society and the bio terrorist put in jail and the press was too embarrassed at their compliance that they never said anything about it. Was it right for people to commit crimes against humanity? No it wasn't. But people agreed to commit the crime for profit.
 
I think they tried to use the cheapest connector
There are cheaper options than MiniFitJr for ~50A and some variant of XT-60 would likely have worked better.

The main reason for sticking to MiniFitJr-based connectors is that PSU and cable manufacturers already use MiniFitJr pins for ATX, EPS12, 12VO, 6/8-pin cables, proprietary power cables for OEMs like Dell, modular cables, etc. Spares them the trouble of setting up a completely different cable assembly line for a connector only one GPU brand currently uses and few others use for anything else.
 
I have already spelled it out something like 10 times in this thread. As far as the fundamental math and specs are concerned, the connector and cables are perfectly fine as-is. Hundreds of licensed engineers at Nvidia, PCI-SIG, Intel, cable, AIBs, PSU manufacturers etc. have gone over it and signed off on their respective parts of the ecosystem.

Unless there is incontrovertible smoking-gun evidence of a problem with the design, there is no inherent problem. And for all of the investigating derB, zoid, GN, Linus, etc. have done, none have positively identified any specific issue or combination thereof because all of their attempts at "reproducing" meltdowns have to go well beyond reasonable. One of GN's first successful attempts at melting down a connector had it plugged about half-way in.
Hundreds of engineers at Boeing signed off the MCAS also, but ended up...

Nvidia is THE dominant player in PCI-SIG, and while the initial math might work in their lab environment and on the pre-spec 3090 Ti with its balancing and sensing circuit, they failed to look at reasonable use cases in a domestic environment. AIBs are voicing that they don't have much freedom to deviate from the reference design for the power rail, and obviously some aren't quite sure and added more for their higher-tier products.

Intel has basically nothing on their side to verify the issue yet and IIRC, their own RPL disaster is of a similar nature, so it doesn't surprise me a bit that they failed on consumer protection.

Cable and PSU manufacturers don't have a say in the spec. They have to comply with that hard-pushed standard or they will be left out of business. Say even Seasonic or Corsair came out saying they have some skepticism about the standard and refused to provide any cable/PSU for it? They would lose a gen or two of sales since ppl trust Nvidia and the standard, until they are screwed and reports are floating out left and right. They are literally forced to board the ship or close up now.
 
Just because a group of people decided one thing doesn't mean it's correct.
It is multiple groups of people across multiple sectors of the industry and across multiple countries. Not one monolithic group.

People bought into the WHO's weaponized medicine movement that ended up wrecking society and the bio terrorist put in jail and the press was too embarrassed at their compliance that they never said anything about it.
If it is COVID you are talking about, the disaster in the USA was mainly caused by the idiot-in-chief and his followers, against the ~95% consensus of medical professionals within the USA.
 
I have already spelled it out something like 10 times in this thread. As far as the fundamental math and specs are concerned, the connector and cables are perfectly fine as-is. Hundreds of licensed engineers at Nvidia, PCI-SIG, Intel, cable, AIBs, PSU manufacturers etc. have gone over it and signed off on their respective parts of the ecosystem.
Did I say anything about the connector or cables? Nope I didn't.

The reality is that there is an easy and cheap way for nvidia to minimize the possibility of failure and they're simply not doing it. This is where the problem lies and it's obviously a problem or there wouldn't be anywhere near as many reports of failures as we're seeing. The why does not matter when there's again a cheap and easy way for nvidia to minimize the possibility.
 
Cable and PSU manufacturers don't have a say in the spec. They have to comply with that hard-pushed standard or they will be left out of business. Say even Seasonic or Corsair came out saying they have some skepticism about the standard and refused to provide any cable/PSU for it? They would lose a gen or two of sales since ppl trust Nvidia and the standard, until they are screwed and reports are floating out left and right. They are literally forced to board the ship or close up now.
What would be interesting to find out is whether the failed connectors had the lower-current contacts installed, since they are way cheaper than the higher-current ones.

But the black and red plastic ones get brittle when heat stressed compared to the white ones like the one used as the ATX connector on a lot of motherboards.
 
Hundreds of engineers at Boeing signed off the MCAS also, but ended up...
Out of those hundreds, only a handful worked on MCAS. A handful of employees at only one company operating under the same oversight pressure, often from the same assumption trap.

With industry-wide standards like PCIe 5.0 HPWR, every player with a stake in cables and connectors has their own people looking at it for their own internal reasons with a bias towards their own priorities and liabilities.

Cable and PSU manufacturers don't have a say in the spec. They have to comply with that hard-pushed standard or they will be left out of business. Say even Seasonic or Corsair came out saying they have some skepticism about the standard and refused to provide any cable/PSU for it?
In Canada, licensed engineers are required by law to notify relevant authorities when they see something potentially dangerous and I'd expect it to be similar in most civilized countries.

If the HPWR connector was intrinsically unsafe, engineers from all manufacturers would be raising hell about it instead of risking their license signing off on something that is known to not be fit for purpose. They aren't.
 
Circuit breakers protect against short-circuits and sustained overloads...
Exactly. Same principle. Your argument is "this feature against this thing going wrong isn't necessary because if nothing goes wrong this thing won't go wrong." You may as well argue that circuit breakers aren't needed because short-circuits and sustained overloads don't happen if things work properly. Short circuits happen when things are wrong, old, damaged, faulty. Overloads are user error.

Hundreds of licensed engineers at Nvidia, PCI-SIG, Intel, cable, AIBs, PSU manufacturers etc. have gone over it and signed off on their respective parts of the ecosystem.
Boeing and the FAA both signed off on the 737 Max single sensor, so that's hardly a killer blow. Meanwhile various non-affiliated electrical engineers keep looking at the 10% safety overhead of 12VHPWR and the assumption that the 6 power lines are always balanced so you can just tie them together, and saying it's totally nuts.

And don't say "hundreds have" like it's a fact unless you've evidence it was that many. For a start, Intel had nothing to do with it:
First off, PCIe 5.0 and the 12VHPWR connector are not Intel spec. It was developed by the PCI-SIG for a spec sponsored by Nvidia and Dell. It appears in the Intel spec after the fact because Intel had to make it part of the spec since the PCI-SIG were requiring consumer to use the connector for powering graphics cards.

http://jongerow.com/12VHPWR/
The screenshot of the 12VHPWR spec title page shows that the only sponsors were NVIDIA and Dell.

As I have written before, Nvidia ditched the 3090's 3x2 balance likely because their own engineering reports say balance issues affect so few cards overall that it isn't worth bothering with.
Right, it took them until after the 3090 to work that one out after all those years of making cards, and then when all the headlines came out saying 4090s melt connectors they added sense pins because all those engineering reports hadn't spotted that particular fault, and then just kept right on at it so when the 5090 is instantly doing the same thing even with the sense pins it's just bad luck.

You say they worked out they didn't really need it. I say they discovered that if they kept it the cards would have kept crashing at a rate where telling users it's their cable and they're doing it wrong just wouldn't cut it.

12VHPWR can do it on paper. The trouble is the leeway and demands are so unreasonably tight that it's not proving practical. If an experienced builder can reconnect a cable four times to a 4090 without a fault and then connect it to a 5090 and burn out both ends, it's not good enough for the real world.
 
With industry-wide standards like PCIe 5.0 HPWR, every player with a stake in cables and connectors has their own people looking at it for their own internal reasons with a bias towards their own priorities and liabilities.

If the HPWR connector was intrinsically unsafe, engineers from all manufacturers would be raising hell about it instead of risking their license signing off on something that is known to not be fit for purpose. They aren't.
Since this appeared just while I posted my reply, and before I go to bed:

1) Standards don't work like that. I've worked and contributed towards standards, akin to this field, involving big names that have been bandied around here. Definitely not everybody who is going to be affected by a standard gets involved. Definitely the big names get their way and if the smaller players have 'concerns' they get 'considered'.

2) 12VHPWR isn't intrinsically unsafe. As above, it works on paper, it just turns out to be falling over a lot more after contact with reality. Funny how it's about the only electrical standard out there with a 10% safety margin.
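
For anyone who wants to check the margin arithmetic themselves, here is a minimal Python sketch. The 600 W limit, 12 V rail and 6 power pins are the usual 12VHPWR figures; the per-pin rating of 9.2 A is an assumed placeholder chosen because it lands near the ~10% figure quoted in this thread, not a value taken from any datasheet, so swap in whatever your terminal's datasheet actually says. It also assumes the current splits perfectly evenly across the pins, which is exactly the assumption being argued about here.

```python
# Assumption: placeholder per-pin rating, not taken from a datasheet.
ASSUMED_PIN_RATING_AMPS = 9.2


def per_pin_margin(total_watts: float, volts: float, power_pins: int,
                   pin_rating_amps: float) -> tuple[float, float]:
    """Return (amps per pin, fractional headroom) for a perfectly balanced load."""
    if volts <= 0 or power_pins <= 0:
        raise ValueError("volts and power_pins must be positive")
    total_amps = total_watts / volts          # 600 W / 12 V = 50 A
    amps_per_pin = total_amps / power_pins    # even split across the power pins
    margin = pin_rating_amps / amps_per_pin - 1.0
    return amps_per_pin, margin


amps, margin = per_pin_margin(600.0, 12.0, 6, ASSUMED_PIN_RATING_AMPS)
print(f"{amps:.2f} A per pin, {margin:.1%} headroom over the assumed rating")
```

With a different assumed rating the headroom changes accordingly, so the result is only as good as the rating you feed in.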
 
Exactly. Same principle. Your argument is "this feature against this thing going wrong isn't necessary because if nothing goes wrong this thing won't go wrong." You may as well argue that circuit breakers aren't needed because short-circuits and sustained overloads don't happen if things work properly. Short circuits happen when things are wrong, old, damaged, faulty. Overloads are user error.


Boeing and the FAA both signed off on the 737 Max single sensor, so that's hardly a killer blow. Meanwhile various non-affiliated electrical engineers keep looking at the 10% safety overhead of 12VHPWR and the assumption that the 6 power lines are always balanced so you can just tie them together, and saying it's totally nuts.

And don't say "hundreds have" like it's a fact unless you've evidence it was that many. For a start, Intel had nothing to do with it:

The screenshot of the 12VHPWR spec title page shows that the only sponsors were NVIDIA and Dell.


Right, it took them until after the 3090 to work that one out after all those years of making cards, and then when all the headlines came out saying 4090s melt connectors they added sense pins because all those engineering reports hadn't spotted that particular fault, and then just kept right on at it so when the 5090 is instantly doing the same thing even with the sense pins it's just bad luck.

You say they worked out they didn't really need it. I say they discovered that if they kept it the cards would have kept crashing at a rate where telling users it's their cable and they're doing it wrong just wouldn't cut it.

12VHPWR can do it on paper. The trouble is the leeway and demands are so unreasonably tight that it's not proving practical. If an experienced builder can reconnect a cable four times to a 4090 without a fault and then connect it to a 5090 and burn out both ends, it's not good enough for the real world.
Exactly. And for those worried about connectors burning out: engineers at Nvidia are likely under the same pressure from top management as those at Boeing, if not more. After all, Boeing is one half of the aerospace duopoly with roughly 50% market share, while Nvidia is THE monopoly in the GPU space. It gets worse: Boeing's failure literally cost hundreds of lives, while for Nvidia the consequence is some hefty consumer money going down the drain, which isn't remotely comparable to MCAS. Nobody will go to jail even if they sign off on an incompetent design.

So why they would not face management pressure to just sign off and keep their job at the most valuable company in the world is beyond my understanding.
 
This is where the problem lies and it's obviously a problem or there wouldn't be anywhere near as many reports of failures as we're seeing.
"Many reports" means nothing without knowing exactly how many of those cards are out there. A thousand sounds like a lot... out of a million, that would be a perfectly acceptable 0.1% failure rate.

A thousand people who can afford $2000 GPUs are likely to generate a disproportionate amount of noise and tech sites love good drama.
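
To put numbers on that, a quick Python sketch. The installed-base figures are purely hypothetical (nobody in this thread has actual sales or failure counts); the point is only that the same report count gives very different rates depending on the denominator.

```python
def failure_rate(failures: int, units_in_field: int) -> float:
    """Return the fraction of fielded units that failed."""
    if units_in_field <= 0:
        raise ValueError("units_in_field must be positive")
    return failures / units_in_field


# Hypothetical installed bases: same 1000 reports, very different rates.
for units in (10_000, 100_000, 1_000_000):
    rate = failure_rate(1_000, units)
    print(f"1000 failures out of {units:>9,} cards = {rate:.1%}")
```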
 
Right, it took them until after the 3090 to work that one out after all those years of making cards, and then when all the headlines came out saying 4090s melt connectors they added sense pins because all those engineering reports hadn't spotted that particular fault, and then just kept right on at it so when the 5090 is instantly doing the same thing even with the sense pins it's just bad luck.
The sense pins don't really do a whole lot with regards to protecting anything. Two of the sense pins are to dictate how much power the add in card is allowed to request from the PSU. The other two are detecting a cable is present and an optional detection that the rails are operating as expected.
 
"Many reports" means nothing without knowing exactly how many of those cards are out there. A thousand sounds like a lot... out of a million, that would be a perfectly acceptable 0.1% failure rate.

A thousand people who can afford $2000 GPUs are likely to generate a disproportionate amount of noise and tech sites love good drama.
It's hilariously sad that you think a bad design decision directly leading to any failures is acceptable.
 
Since this appeared just while I posted my reply, and before I go to bed:

1) Standards don't work like that. I've worked and contributed towards standards, akin to this field, involving big names that have been bandied around here. Definitely not everybody who is going to be affected by a standard gets involved. Definitely the big names get their way and if the smaller players have 'concerns' they get 'considered'.
Your role in standards-setting doesn't matter.

In countries like Canada that take licensed engineers' role in preserving public safety seriously, licensed engineers have to notify authorities about any obvious safety concerns regardless of their relationship with any standards or companies.

If HPWR was deemed unsafe by licensed engineers, then products using the connector couldn't legally be made or imported into any countries with licensed engineers agreeing that the connector is unsafe unless the engineers involved in it are all willing to risk losing their professional license if something goes wrong.

2) 12VHPWR isn't intrinsically unsafe. As above, it works on paper, it just turns out to be falling over a lot more after contact with reality. Funny how it's about the only electrical standard out there with a 10% safety margin.
If safety margins were only 10%, cable and PSU manufacturers wouldn't have demonstrated their cables pushing 1000+W. The connector's own nominal ratings have margins to accommodate manufacturing tolerances and wear.
 
Your role in standards-setting doesn't matter.

In countries like Canada that take licensed engineers' role in preserving public safety seriously, licensed engineers have to notify authorities about any obvious safety concerns regardless of their relationship with any standards or companies.

If HPWR was deemed unsafe by licensed engineers, then products using the connector couldn't legally be made or imported into any countries with licensed engineers agreeing that the connector is unsafe unless the engineers involved in it are all willing to risk losing their professional license if something goes wrong.


If safety margins were only 10%, cable and PSU manufacturers wouldn't have demonstrated their cables pushing 1000+W. The connector's own nominal ratings have margins to accommodate manufacturing tolerances and wear.
ON PAPER, that is. Engineers at Boeing are obligated by similar if not stricter laws, and guess what happened to those who spoke out? Silenced, "suicide", sacked and broke, as Boeing literally pressured its partners against hiring said engineers/managers, and MCAS still got pushed through and killed 300+ innocent ppl. Remember the reaction when the drama first rolled out? "The pilots are poorly trained, the plane is not at fault".

Engineers are human. They have mortgages to pay off and a life/career to risk by speaking out. What happens if they don't speak out and, say, it eventually burns like hell? For 99% of the engineers: NOTHING. It's oversight, you can't prove they deliberately ignored the risk. Maybe you can revoke a few licenses and they go to work at McDonald's afterwards. But if they decide to screw the trillion-dollar company and speak out publicly? Others who decide to keep silent will act like what you just demoed, and it's an open case without conclusion for some time at least, and the guy who spoke out will be instantly jobless if not mysteriously "suicided".
 
It's hilariously sad that you think a bad design decision directly leading to any failures is acceptable.
Everything is a compromise between the bare minimum to make something work and how it should ideally be done.

That is why most PCBs are missing tons of components especially in power distribution and bypass. Engineers calculated the conservative if not ideal amount of capacitors, phases, chokes, etc. based on projected CPU/GPU chip properties, then removed most of what they could get away with to cut costs after they got production candidate silicon to tune the final configuration with.

Every month, someone asks me to repair a device (or I have to fix one of my own) and it turns out to be the manufacturer cheaping out on the 10-47uF 35-50V ~$0.05 bootstrap capacitor in the AC-DC converter that isn't even a low-ESR type and gets fried by ripple current from the flyback transformer. Those are deliberately engineered to fail. So predictable that there is a formula to estimate capacitor lifespan as a function of operating conditions and capacitor specs.
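
The post does not say which formula is meant. The one usually quoted for aluminium electrolytics is the "10-degree rule" (life roughly doubles for every 10 °C below the rated temperature), so here is a minimal Python sketch of that rule. The 2000 h / 105 °C rating and the operating temperatures are hypothetical example values, and ripple current is only accounted for indirectly, through whatever it adds to the core temperature you plug in.

```python
def estimated_cap_life_hours(rated_life_hours: float, rated_temp_c: float,
                             core_temp_c: float) -> float:
    """10-degree rule of thumb: life doubles per 10 C below the rated temperature."""
    return rated_life_hours * 2 ** ((rated_temp_c - core_temp_c) / 10.0)


# Hypothetical 2000 h @ 105 C part at a few core temperatures.
# Ripple-current self-heating pushes the core temperature up and the life down.
for core_temp in (65.0, 85.0, 105.0, 115.0):
    hours = estimated_cap_life_hours(2000.0, 105.0, core_temp)
    print(f"core at {core_temp:5.1f} C -> about {hours:,.0f} h")
```

It is a rule of thumb, not a datasheet guarantee; manufacturers publish their own lifetime multipliers for ripple current and voltage derating.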
 
Engineers are human. They have mortgages to pay off and a life/career to risk by speaking out. What happens if they don't speak out and, say, it eventually burns like hell? For 99% of the engineers: NOTHING. It's oversight, you can't prove they deliberately ignored the risk
When something serious happens, investigations get launched and those usually do a fairly good job combing through records to find out how long ago the company should have first heard about the problem.

In Canada, engineers can report severe issues directly to their union if they fear retaliation by their employer and let the union pick it up from there.
 
When something serious happens, investigations get launched and those usually do a fairly good job combing through records to find out how long ago the company should have first heard about the problem.

In Canada, engineers can report severe issues directly to their union if they fear retaliation by their employer and let the union pick it up from there.


Pls stop your FUD in this and other threads and show how you fearlessly left your 5090, sent to you as a gift from NV, to render all night under full load without supervision.

We can even enhance the effect - I will also do this experiment by disconnecting two of the three 8pin cables from the GPU. You will cut off (with all necessary safety) 4 of the 6 +12V power cables on your connector😉
 
We can even enhance the effect - I will also do this experiment by disconnecting two of the three 8pin cables from the GPU. You will cut off (with all necessary safety) 4 of the 6 +12V power cables on your connector😉
Just because planes can technically take off with half of their engines disabled doesn't mean they should since that leaves no reserve power in case of bird strike or other engine failure during take-off.

Just because a good HPWR cable should be fine (stay within thermal specs) with half as many wires/pins doesn't mean you should intentionally run it that way either since that leaves no room for externalities.
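
Rough arithmetic behind that, as a Python sketch. It assumes a 50 A total load (the ~50 A figure mentioned earlier in the thread), that the current shares equally among whichever wires are still connected, and that every wire/contact has the same resistance, so heating per wire scales with the square of its current (P = I²R). Real cables will not share this neatly.

```python
def per_wire_load(total_amps: float, wires_carrying: int,
                  wires_designed: int = 6) -> tuple[float, float]:
    """Return (amps per remaining wire, heating per wire relative to the full set)."""
    if not 0 < wires_carrying <= wires_designed:
        raise ValueError("wires_carrying must be between 1 and wires_designed")
    amps = total_amps / wires_carrying
    baseline = total_amps / wires_designed
    relative_heat = (amps / baseline) ** 2    # P = I^2 * R with equal R per wire
    return amps, relative_heat


# Assumed 50 A total; 6 wires is the intact cable, 3 and 2 are the cut-down cases.
for wires in (6, 3, 2):
    amps, heat = per_wire_load(50.0, wires)
    print(f"{wires} wires: {amps:.1f} A each, {heat:.1f}x the heat per wire")
```

Halving the wire count doubles the current per wire but quadruples the heat each wire and contact has to shed, which is where the reserve goes.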