News Reviewer reports RTX 5080 FE instability — PCIe 5.0 signal integrity likely the culprit

Yeah, but if they upgraded the chipset connection, then you could get plenty of PCIe slots via that route and wouldn't even need to bifurcate the x16 slot.
Basically there is no chipset on the BD790i; it's the SoC only, with all of its 24 lanes.

But the SoC's IOD and Infinity Fabric are essentially a switch with tons of flexibility, which gets turned into fixed allocations via slots.

That's why bifurcation gives you that flexibility, if you also allow the aggregation (should have mentioned that, perhaps).

So if you can customize the lanes by connecting 1-4 M.2 x4 cables to such a dGPU, that allows for both variable form factors (smaller mainboards as well as bigger GPUs) and bandwidth flexibility.
As for people running dual GPUs, that's not really a mainstream thing to do. Games don't support it, so AI is the main reason one might do it. That's sufficiently niche that I think mainstream desktops don't need to cater for it.
Yeah, and even with AI, multi-GPU fails to deliver unless you really tune for it (like DeepSeek). But there is more than just dGPUs you might want to add (I/O, other accelerators).

Mainstream used to be equivalent to x86 and PC. It was quite literally the defining platform.

But these days it's the segment that can only shrink. From servers, from workstations, from laptops, from consoles, from portable consoles, from IoT and appliances, and, last but most numerous, mobile: everyone 'steals' from desktop, and the list just goes on and on.

Yet the other defining element of PC has always been flexibility. So if that is helped, even against what vendors wish for, it has a rather higher chance to succeed in evolution. In my book, PCIe v5 for the RTX 5xxx series helps that, while being backward compatible with PCIe v4 x16 (and PCIe v3 x16, which I am still using on my big Xeons).
They're already integrated into motherboard chipsets.
And I've been fantasizing about having lowest-end Ryzen CPUs act mostly as PCIe switches on PCIe cards (or M.2 slots), because they are actually cheaper than mere PCIe switches, after Marvell/Broadcom bought up all that IP and did the Broadcom price jump.

But it's exactly the point I am trying to make: the IOD is super flexible in terms of lane allocations and protocols. It can speak PCIe, but also USB, SATA, RAM and native Infinity Fabric, with dynamic functional configuration and resource allocation.

But that's of less use if slots mean static lane allocations. That's why bifurcation is so crucial: it at least allows one path out of that. Currently that means clunky extra cards; consolidating on M.2 connectors and cables makes a lot more sense IMHO.
 
  • Like
Reactions: Li Ken-un
This article doesn't really make sense to me. It speculates that there could be a systemic issue related to PCIe 5.0 and multi-PCB cards, but it also says the RTX 5090, which uses the same multi-PCB design and PCIe 5.0, doesn't have an issue. Maybe the der8auer video explains things better, but I kinda hate watching techtuber videos.

The linked igorslab article is kinda confusing too. It makes some vague, passing mention of RTX 5090 boot issues, but doesn't seem to actually describe what those issues are. Then Igor says he fixed the issue with some soldering. What issues was he having, and what did he solder? Admittedly I haven't read the entire review; maybe he put those details on some otherwise unrelated page of the article.
Sense depends on information, which is still emerging. Neither Igor nor Roman have hard facts yet, but they (like tons of others) make money on your impatient clicks. Put the blame where you want it, but do not expect everyone to agree.
 
  • Like
Reactions: evdjj3j
You'd be paying double, for the vapor chamber design and the water block. One of the main motivations previously for using a water block might have been to free up some slots the 4+ slot previous cards blocked. If you run EPYCs with tons of single width x16 slots, water might still be attractive for filling them all.

If you like water in your computer, that's your privilege. I try to avoid it.

Not sure they'll have one. But holding out can be a pleasure on its own... in measure, I'd say.
Well aware, it is a case of already having the components. I've been using the same CPU block for nearly 10 years now. I used to do SLI and the large air coolers of the day were not helpful when you had two stacked.

On the Intel front, signs point to yes, but maybe skipping Battlemage and doing a full launch with Celestial.
 
I didn't say the standard was bad. I meant including it in a consumer desktop machine was not only a pointless waste of money, but now it's causing actual problems.


Yeah, which isn't an issue at PCIe 4.0 speeds. So, the fact that Intel decided to reach for PCIe 5.0 (and AMD followed) just created a pitfall and Nvidia walked right into it.


So, PCIe 5.0 ended up being worse than a pointless waste of money - it's downright harmful!

Today there isn't a large number of peripherals that require PCIe 5, so arguably it is a waste of money, but peripherals will be produced that require PCIe 5 to function at their designed speeds.

Saying it's harmful is inaccurate; peripherals need to be correctly designed and implemented to work correctly with any standard at the bleeding edge. If der8auer's hypothesis is correct then there is a weakness in the 5080. Perhaps partner cards will avoid this, perhaps it's just a bad card and a non-issue.

That the 40x0 series on PCIe 4 don't have an issue leads to the suggestion to switch to v4 with a 5080. That is understandable: the data rate drops by half, still a high number, but the signalling margins relax considerably. As the frequency increases, capacitive reactance falls, lowering the impedance between conductors and increasing crosstalk, while inductive reactance rises, causing attenuation and timing issues. Signal lengths have to be matched to tighter tolerances to keep coherence across lanes, and board designs must maintain this all the way to the GPU core. Timing and attenuation, combined with crosstalk and noise, are a pain to engineer out; fixes are best made at design time and during thorough prototyping. The increasingly common use of PCIe risers is an added complication, potentially extending signal paths and exacerbating any inherent signalling problems.
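
For a rough feel of that frequency scaling, here's a back-of-the-envelope sketch (the 1 pF / 1 nH parasitics are made-up illustrative values; ~8 GHz and ~16 GHz are the approximate PCIe 4.0/5.0 Nyquist frequencies):

```python
import math

def reactances(f_hz, c_farads=1e-12, l_henries=1e-9):
    """Capacitive and inductive reactance at frequency f.
    X_C = 1/(2*pi*f*C) falls as f rises (easier coupling/crosstalk),
    X_L = 2*pi*f*L rises with f (more attenuation along the trace)."""
    x_c = 1.0 / (2 * math.pi * f_hz * c_farads)
    x_l = 2 * math.pi * f_hz * l_henries
    return x_c, x_l

# Approximate Nyquist frequencies: PCIe 4.0 ~8 GHz, PCIe 5.0 ~16 GHz
for gen, f in (("PCIe 4.0", 8e9), ("PCIe 5.0", 16e9)):
    x_c, x_l = reactances(f)
    print(f"{gen}: X_C ~ {x_c:5.1f} ohm, X_L ~ {x_l:6.1f} ohm")
```

Halving the capacitive reactance while doubling the inductive reactance on the jump from 4.0 to 5.0 is exactly why the crosstalk and attenuation budgets get so tight.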

Gimme a 286 on an 8MHz ISA bus… reliable!
 
Last edited:
  • Like
Reactions: JarredWaltonGPU
The 40 series doesn't use a multi-PCB design for the PCIe connector.

With the 50 series, the slot component (the part that fits into the motherboard) is not on the same PCB that houses the GPU core and VRAM. They are two different pieces with a type of FFC connecting them. Currently, it is theorized that this FFC allows for more noise introduction in the signaling. Too much to handle at times, apparently.

If this turns out to be true, this is a VERY big deal as it is part of the core design for the Founders Edition card.
I didn't mean the 40 series used it.

My point was that the design is good in theory, but it's flawed in execution (e.g. due to the signal noise).
 
PCIe signals via PCB are pretty poor. In the enterprise, they’ve had cabling standards for PCIe for a while, and cabling might not be optional with later PCIe generations. Research has even leapt ahead to optical connections. PCIe 5.0 is just hitting up against the limits of what can be done with PCB traces.
 
  • Like
Reactions: bit_user
PCIe signals via PCB are pretty poor. In the enterprise, they’ve had cabling standards for PCIe for a while, and cabling might not be optional with later PCIe generations. Research has even leapt ahead to optical connections. PCIe 5.0 is just hitting up against the limits of what can be done with PCB traces.
I don't think that's the case, as even PCIe 7.0, with 4x the data rate, is still being designed to accommodate PCB traces. Not that I expect to see that on consumer hardware any time soon (if ever).
 
  • Like
Reactions: George³
Note that this is an issue with one particular card and not endemic to all 5080 Founders Edition cards. Also, I'm not saying this will be the only card with an issue, just that it's probably more of a QA and testing thing rather than bad design. I guess we wait and see.

My cards have been working fine (knock on wood), and there's certainly more potential for problems with three PCBs. Well, really it's just the two PCBs and the ribbon cable between them: the PCIe 5.0 slot connector, ribbon to the main PCB, and the GPU PCB. A crimped or damaged cable would obviously be one potential culprit.

And naturally, the melting 16-pin 12VHPWR connectors on the 4090 started with just one instance. LOL. Would be very interesting if, over time, there are a bunch of failures or issues with the Founders Edition cards and PCIe 5.0 that don't crop up on the custom AIB designs!
So if there is almost no difference in performance between PCIe 4 and 5, then more bandwidth isn't being used... or not much more. So if that is true, then something in the PCIe 5 protocol is likely what is to blame, right? Or did it just rev up the clock for no real gain and cause instability?

Seems, in short, like PCIe 5 is bad for anything that doesn't max out PCIe 4, simply because it lowers the threshold for noise before you have crashes or instability. Wonder if PCIe 4/5 is the breaking point for this type of connector... or if it is simply something that gets even worse with the new setup from Nvidia.

Either way, it seems like you can set your board to PCIe 4 if possible and not lose anything, but I wonder what it does to a 5090, where you might actually see a difference?

Sounds like all kinds of a quagmire to me... one that I wish I could experience with a card, but alas, I was scalper-blocked 🙁
 
So, PCIe 5.0 ended up being worse than a pointless waste of money - it's downright harmful!
? From what I understood it is an Nvidia problem, not a PCIe 5 problem, and while the standard may have little to no benefit for gaming (as long as the GPU has enough VRAM), that doesn't mean other types of gear don't benefit.
 
So if there is almost no difference in performance between PCIe 4 and 5, then more bandwidth isn't being used... or not much more. So if that is true, then something in the PCIe 5 protocol is likely what is to blame, right? Or did it just rev up the clock for no real gain and cause instability?

Seems, in short, like PCIe 5 is bad for anything that doesn't max out PCIe 4, simply because it lowers the threshold for noise before you have crashes or instability. Wonder if PCIe 4/5 is the breaking point for this type of connector... or if it is simply something that gets even worse with the new setup from Nvidia.

Either way, it seems like you can set your board to PCIe 4 if possible and not lose anything, but I wonder what it does to a 5090, where you might actually see a difference?

Sounds like all kinds of a quagmire to me... one that I wish I could experience with a card, but alas, I was scalper-blocked 🙁
PCIe doesn't dynamically change clocks, AFAIK. Phison has a patent for SSDs that can renegotiate PCIe link speed to save power. I'm not sure if that's related. Anyway, for a GPU like the 5090 or 5080, I'd expect it would register as a 5.0 speed and stay there, whether the additional bandwidth is needed or not.

And the general consensus of people that have taken the time to test it is that PCIe 4.0 x16 offers the same performance as PCIe 5.0 x16 on the 5090. Maybe when I have time (next month, or maybe March?) I'll try to do my full test suite in PCIe 4.0 and 3.0 mode. But right now I have a lot of other GPUs to test!
 
PCIe doesn't dynamically change clocks, AFAIK. Phison has a patent for SSDs that can renegotiate PCIe link speed to save power. I'm not sure if that's related. Anyway, for a GPU like the 5090 or 5080, I'd expect it would register as a 5.0 speed and stay there, whether the additional bandwidth is needed or not.

And the general consensus of people that have taken the time to test it is that PCIe 4.0 x16 offers the same performance as PCIe 5.0 x16 on the 5090. Maybe when I have time (next month, or maybe March?) I'll try to do my full test suite in PCIe 4.0 and 3.0 mode. But right now I have a lot of other GPUs to test!
Just saying, it seems odd for it to be a noise issue only when PCIe 5 is on if there is no extra bandwidth being used. So I would think it is some protocol, or clock speed, or something. Sounds more protocol-based than anything, i.e. a new protocol requiring a lower level of noise even though, in this case, it gives no gains, or very little gain. So PCIe 5 might have something in it that is in fact not worth it... at least for now.

Then Nvidia did something that increased noise, and boom, issues. Just spitballing, but that sounds like what is happening. Since you confirmed no increase in clock, the protocol really sounds like the issue.
 
Guess it doesn't really matter because scalpers seem to have scooped them all up anyway. Once again, Best Buy and those like them didn't do enough, it seems.
I don’t know that I’d blame the retailers with all the rumors of EXTREMELY limited stock. Large retail outlets were talking about receiving single digits of 5090s. They probably got 2 5080s.
 
  • Like
Reactions: bit_user
Just saying, it seems odd for it to be a noise issue only when PCIe 5 is on if there is no extra bandwidth being used. So I would think it is some protocol, or clock speed, or something. Sounds more protocol-based than anything, i.e. a new protocol requiring a lower level of noise even though, in this case, it gives no gains, or very little gain. So PCIe 5 might have something in it that is in fact not worth it... at least for now.

Then Nvidia did something that increased noise, and boom, issues. Just spitballing, but that sounds like what is happening. Since you confirmed no increase in clock, the protocol really sounds like the issue.
PCIe 5 does run higher clock speeds than PCIe 4. That's where the bandwidth increase comes from.
 
  • Like
Reactions: bit_user
PCIe doesn't dynamically change clocks, AFAIK
If that's what you're referring to: the driver can change the bus link speed dynamically (but not every driver does this)! Enable native ASPM and its L0s/L1 states in the BIOS. This is what Intel recommended to make Arc switch the PCIe link dynamically for power saving.
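
If you want to watch it happen, here's a minimal sketch (assuming a Linux box, which exposes the current_link_speed/max_link_speed sysfs attributes; the PCI address 0000:01:00.0 is just a placeholder for wherever your GPU sits):

```python
import time
from pathlib import Path

# Placeholder PCI address of the GPU; find yours with `lspci | grep VGA`.
DEV = Path("/sys/bus/pci/devices/0000:01:00.0")

def link_speed():
    """Return (current, max) link speed strings as reported by the kernel."""
    cur = (DEV / "current_link_speed").read_text().strip()
    mx = (DEV / "max_link_speed").read_text().strip()
    return cur, mx

if __name__ == "__main__":
    # Poll for a while: with ASPM enabled you may see the link drop to a
    # lower rate at idle and snap back to full speed under load.
    for _ in range(10):
        cur, mx = link_speed()
        print(f"current: {cur:<16} max: {mx}")
        time.sleep(1)
```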
 
  • Like
Reactions: bit_user
I don't think that's the case, as even PCIe 7.0, with 4x the data rate, is still being designed to accommodate PCB traces. Not that I expect to see that on consumer hardware any time soon (if ever).
I'm on the fence about this, having been burnt by plenty of PCIe connectivity problems (starting with PCIe 4.0). Though, PCIe 7.0 is only double the frequency of PCIe 5.0 (plus PAM4), so if PCB traces work for PCIe 5.0 with some retimers/redrivers sprinkled all about, it's not implausible that PCIe 7.0 would work the same way.
 
I don't think that's the case, as even PCIe 7.0, with 4x the data rate, is still being designed to accommodate PCB traces. Not that I expect to see that on consumer hardware any time soon (if ever).
I think you both have points. PCIe 5.0 effectively requires retimers on motherboards. A lot of PCIe 4.0 boards have retimers, but I'm not sure how strictly necessary those were.

PCIe 6.0 maintains the same symbol rate (i.e. clock frequency), but instead switches to PAM4 signalling. This isn't as challenging as doubling frequencies again, but it does mean you need a better signal-to-noise ratio for reliable transmission. IIRC, it also adds FEC to help improve reliability.

PCIe 7.0 surprised me, by doubling clock speeds yet again. It's still in draft status, though likely to be finalized this year. As such, we don't yet know what form PCIe 7.0 hardware will take.
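
To put rough numbers on that, here's a quick sketch using the published per-lane transfer rates (the x16 figures are raw link rates, ignoring protocol overhead; the 7.0 line reflects the current draft):

```python
# Per-lane transfer rate (GT/s), signalling, and encoding for recent PCIe generations.
GENS = {
    "4.0": (16,  "NRZ",  "128b/130b"),
    "5.0": (32,  "NRZ",  "128b/130b"),
    "6.0": (64,  "PAM4", "FLIT + FEC"),  # same symbol rate as 5.0, 2 bits per symbol
    "7.0": (128, "PAM4", "FLIT + FEC"),  # symbol rate doubled again (draft)
}

for gen, (gts, sig, enc) in GENS.items():
    x16_raw_gbps = gts * 16 / 8  # GT/s * 16 lanes / 8 bits -> GB/s, raw
    print(f"PCIe {gen}: {gts:3d} GT/s/lane, {sig}, {enc:11s} ~{x16_raw_gbps:5.0f} GB/s x16 raw")
```

So 6.0 gets its doubling from PAM4 at the same symbol rate, while 7.0 has to double the symbol rate on top of that, which is what makes it surprising from a signal-integrity standpoint.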
 
So if there is almost no difference in performance between PCIe 4 and 5, then more bandwidth isn't being used... or not much more. So if that is true, then something in the PCIe 5 protocol is likely what is to blame, right?
No, the protocol didn't change, AFAIK. Certainly not in ways that would be performance-relevant.

Or did it just rev up the clock for no real gain and cause instability?
This. It's not needed for games, because graphics APIs use asynchronous messaging to decouple the CPU thread & GPU from bus bottlenecks. Furthermore, assets being sent to the GPU are sized to run well at PCIe 3.0 or 4.0 speeds, so we're just not at a point where PCIe 4.0 x16 is a bottleneck.

Plus, if you just do the math: a high-end GPU has like 16 to 32 GB of GDDR memory. PCIe 4.0 x16 can send about 32 GB/s per direction. The rate of turnover in GPU memory isn't going to be anywhere close to 1 Hz, so it stands to reason that PCIe 4.0 x16 ought to be plenty fast for gaming.
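
Here's that back-of-the-envelope math as a tiny sketch (the 60 fps frame budget and full-VRAM turnover are just illustrative assumptions):

```python
VRAM_GB = 32           # high-end card with 32 GB of GDDR
PCIE4_X16_GBPS = 32    # ~32 GB/s per direction for PCIe 4.0 x16

# Time to push an entire VRAM's worth of data across the bus:
full_refill_s = VRAM_GB / PCIE4_X16_GBPS
print(f"Full VRAM refill over PCIe 4.0 x16: ~{full_refill_s:.1f} s")

# What you could stream per frame at a hypothetical 60 fps:
frame_time_s = 1 / 60
per_frame_gb = PCIE4_X16_GBPS * frame_time_s
print(f"Transferable per 60 fps frame: ~{per_frame_gb * 1024:.0f} MB")
```

Streaming half a gigabyte of fresh assets every single frame would already be extreme, which is why 4.0 x16 so rarely shows up as the bottleneck in games.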

The main use case for these cards having PCIe 5.0 is AI ...and maybe a few other professional apps. Gaming benchmarks on the RTX 4090 showed very little gain, going from PCIe 3.0 x16 to 4.0 x16, but I think I saw some Davinci Resolve benchmark that showed a > 50% gain.


Seems, in short, like PCIe 5 is bad for anything that doesn't max out PCIe 4, simply because it lowers the threshold for noise before you have crashes or instability.
Yes, that's my thinking.

Wonder if PCIe 4/5 is the breaking point for this type of connector... or if it is simply something that gets even worse with the new setup from Nvidia.
Servers have been using PCIe 5.0 literally for years. It's probably just Nvidia's exotic design, with internal cabling, that was too ambitious.

Just saying, it seems odd for it to be a noise issue only when PCIe 5 is on if there is no extra bandwidth being used.
It doesn't matter how much bandwidth is being used, because whenever data is sent, it's sent at double the clockspeed. The higher the clockspeed, the tighter the tolerances become on all the electrical parameters. So, it totally makes sense that their board can work at PCIe 4.0 speeds, but not PCIe 5.0.
 
Last edited:
PCIe doesn't dynamically change clocks, AFAIK. Phison has a patent for SSDs that can renegotiate PCIe link speed to save power. I'm not sure if that's related.
Oh, I'm pretty sure PCIe link speed is dynamically renegotiable for power saving. I think that's what LSPM does.

I could swear I've even seen it. You probably don't see it, because gaming drivers and BIOS on gaming boards would disable it, since there's some latency when switching between different link speeds.
 
And right out of the gate, Nvidia bolts through and stumbles at the first hurdle.

Damn, you'd think that, considering what they charge for these GPUs, they would have done a lot of this testing beforehand. This is why you don't buy day-one junk from any vendor, be it AMD or Nvidia.
 
  • Like
Reactions: bit_user