News If you think PCIe 5.0 runs hot, wait till you see PCIe 6.0's new thermal throttling technique


Notton

Prominent
Dec 29, 2023
466
412
560
If things keep going the way they seem to be going, then PCIe implementations are going to be an increasing problem.
Again, an RTX 4090 only suffers a 2% performance loss when used on a PCIe 3.0 x16 slot.

What possible consumer-oriented product is going to saturate a PCIe 5.0 x16 slot?
Just because it exists in the server market doesn't mean it has to be used at the consumer level.

It's like SAS, RDIMM, etc.
 

MobileJAD

Prominent
May 15, 2023
11
5
515
Within the limits of an M.2 SSD? They have little wiggle room.

And they are a company whose #1 rule is... profit... so they push the newest product.

About the only way they'd get around the thermal limit (without lowering something spec-wise) is to redesign motherboards so they could fit active coolers on them (and currently they can't, due to the layout of the motherboard and the GPU being so close to all the slots, blocking any actual active cooling that would have a meaningful impact).



That's not the answer in every situation.
If your boot OS drive got too hot because you installed/copied a file and it shut off... that's going to be an issue, right?

And they already do slow down to prevent cooking themselves, as they have thermal limits at which they throttle.

Yes, but that would require cases to support it... and with how they are moving some connections to the back of the board, that would interfere as well, plus it would increase production cost.

And then you run the risk of it not being good enough. Heatsinks/spreaders can only move so much heat in a given time (similar to why an IHS can't keep CPUs cold: even the best cooling gets overcome by the amount of heat generated in a small area).
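To put rough numbers on that, here's a minimal one-dimensional thermal-resistance sketch in Python; every figure in it is an illustrative assumption, not a measurement of any real drive:

```python
# Temperature rise = power x thermal resistance (a simple 1-D model).
# Every number here is an illustrative assumption, not a real drive spec.
AMBIENT_C = 30.0
POWER_W = 11.0           # an M.2 drive near its ~12 W ceiling
R_BARE_C_PER_W = 9.0     # assumed: bare drive, no spreader
R_SINK_C_PER_W = 4.0     # assumed: with a passive heatsink/spreader

print(AMBIENT_C + POWER_W * R_BARE_C_PER_W)  # 129.0 C -> throttles instantly
print(AMBIENT_C + POWER_W * R_SINK_C_PER_W)  # 74.0 C -> better, but headroom is finite
```

Even halving the thermal resistance only buys so much headroom when all the power is concentrated in one small controller.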

I recall reading an article in the past about testing actual water channels inside CPUs to try to keep them cooler... no idea what happened to that idea (it might still be in the works, or dropped).




They will.

PCIe 5.0 already has thermal issues (which is why you see so much active cooling for those drives on the market, and why they all come with some form of heatsink; that wasn't the case with 3.0, and only tail-end Gen 4 drives could hit speeds that might need one).


Because it's costly to do so, and GPU makers have no reason to do so for consumer GPUs (they will slowly increase over time).

Now, server side? Nvidia's H800 and H100 are Gen 5 GPUs (primarily AI-focused, which can actually hit the need for Gen 5).
Maybe I misread the article, but I thought it was the PCIe subsystem itself that was overheating, and not the device itself?
 

razor512

Distinguished
Jun 16, 2007
2,143
76
19,890
Yes, but that would require cases to support it... and with how they are moving some connections to the back of the board, that would interfere as well, plus it would increase production cost.

And then you run the risk of it not being good enough. Heatsinks/spreaders can only move so much heat in a given time (similar to why an IHS can't keep CPUs cold: even the best cooling gets overcome by the amount of heat generated in a small area).

I was thinking something on the back of a motherboard that looks a bit like this, but spanning the entire back of the board.

[Image: rear-mounted heatsink on a router (RAXE500)]


Imagine something like the heatsink on that router (RAXE500), but without the slim fan. In the case of a common desktop motherboard, that would effectively mean a heatsink with low-height fins measuring nearly 11.9 in x 9.6 in.
 

bit_user

Polypheme
Ambassador
I was thinking something on the back of a motherboard that looks a bit like this, but spanning the entire back of the board.
The main hot spots are going to be under the CPU, the chipset, and the VRM. Maybe a warm spot under the RAM, SSD, and GPU.

I could see putting a small heatsink under each of the big three, but spanning the entire under/back side seems like overkill and not very cost-effective.
 

Colif

Win 11 Master
Moderator
We need a different motherboard layout now to give enough space for massive heatsinks on the NVMe drives without the first one interfering with the GPU. That is a problem now; PCIe 6 just makes it worse. We almost need a breakout box just for storage. To keep the drives cool and away from the thermonuclear heat generator in the PC.
I don't even have PCIe 4 yet; I see the diminishing returns earned by going faster... is it worth it? I realise it's for servers now, but people want faster even if they won't really notice it anywhere meaningful.
GPUs don't need it yet..

So it's only for people who must have the best... provided there is a choice of the old standard for everyone else, I don't have a problem with it.
 

Li Ken-un

Distinguished
May 25, 2014
102
74
18,660
Again, an RTX 4090 only suffers a 2% performance loss when used on a PCIe 3.0 x16 slot.

What possible consumer-oriented product is going to saturate a PCIe 5.0 x16 slot?
Just because it exists in the server market doesn't mean it has to be used at the consumer level.
I don’t know which drives temperature up more: the number of lanes or the amount of data transmitted by each one. As the Samsung SSD 990 EVO shows, PCIe 5.0 can run cool, and that product happens to use a mere two lanes in PCIe 5.0 mode.

Maybe Samsung and Intel are onto something: devices which operate with fewer lanes, each carrying an insane amount of traffic, resulting in far fewer lanes being active overall to keep temperatures down. And then those who are happy with lots of lanes operating at PCIe 3.0 speeds can have what they want too.

If the new standards can't run as rated, then why bother upgrading to them when you can use the older standard and spend less money?
It’s speculation, but perhaps operating fewer lanes at higher speeds actually results in less heat being generated, so long as the unused lanes remain unused (i.e., no connecting 4 PCIe 6.0 lanes to 4 SSDs). We’ll know if the trade-off is real in a few years.
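A quick back-of-the-envelope sketch of that trade-off (Python; the per-lane figures are approximate usable throughput and gloss over encoding/FLIT overhead differences):

```python
# Approximate one-direction throughput per lane, in GB/s, by PCIe generation.
PER_LANE_GBS = {"3.0": 0.985, "4.0": 1.97, "5.0": 3.94, "6.0": 7.88}

def link_bandwidth(gen: str, lanes: int) -> float:
    """Approximate one-direction bandwidth of a PCIe link in GB/s."""
    return PER_LANE_GBS[gen] * lanes

# A 990 EVO-style x2 link at 5.0 speed vs. a wide link at 3.0 speed:
print(link_bandwidth("5.0", 2))  # ~7.9 GB/s with only two lanes active
print(link_bandwidth("3.0", 8))  # ~7.9 GB/s, but with eight lanes active
```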
 

bit_user

Polypheme
Ambassador
We need a different motherboard layout now to give enough space for massive heatsinks on the NVMe drives without the first one interfering with the GPU. That is a problem now; PCIe 6 just makes it worse. We almost need a breakout box just for storage.
You mean like U.2?

People seem to keep forgetting that this is a solved problem. Many cases still have a spot to mount a couple of 2.5" drives, and it tends to be right in the intake airflow path.

To keep the drives cool and away from the thermonuclear heat generator in the PC.
Isn't M.2 limited to just 12 W? That's really not very much, especially when you consider they only use that much under heavy loads. A hard drive can spike above that, not to mention some of the U.2 drives out there, which I've seen specified at a max dissipation of over 20 W.

I don't even have PCIe 4 yet; I see the diminishing returns earned by going faster... is it worth it?
If you copy around VM images and snapshots or do lots of video editing, then you can actually see the difference in sequential speed. For more basic stuff, even a decent SATA SSD is still fine.
 
  • Like
Reactions: thestryker
Yeah, I won't be buying one of these (or a PCIe5 NVMe for that matter) because clearly, we don't have the technology yet to handle speeds like this safely.
And yet in your signature you have an X3D CPU that, wait for it, runs at lower clocks to keep temps in check. Clearly we don't have the technology yet to handle cache like that, but it didn't keep you from buying that CPU.
What makes PCIe any different?!
 

_Shatta_AD_

Reputable
Jan 27, 2020
42
30
4,560
Another ploy for mobo manufacturers to market their shiny new products… ’PCIe GEN 6 CAPABLE!!!** 4X MORE BANDWIDTH THAN PCIe GEN 4!!!
**Gen 6 speed under certain limited scenarios***
***Scenarios include, but are not limited to: when only x8 links are utilized on the entire mobo with only SATA drives connected, OR an x4 link if one or more NVMe Gen 6 drive(s) @ x2 lanes are installed (max two); the CPU->chipset link runs at Gen 4 speed at all times, …..”
A.K.A. your PC can only utilize Gen 6 speed in 2% of scenarios, and with only 4-6 lanes available even then.
 

M0rtis

Distinguished
Dec 31, 2007
36
22
18,535
You guys remember the Asetek VapoChill case from the early '00s? We need an updated version of that.
I have no idea if it was even good though.

Maybe use those crossflow fans to control how the air moves through the case to the evaporator coils.
 

Pierce2623

Upstanding
Dec 3, 2023
173
156
260
Even improvements in the encoding can burn more power. I'm sure PCIe's PAM4 does, not to mention its FEC computation.
In some use cases PAM4 encoding allows reduced clock speeds because it transmits more data per clock cycle. For example, the Nvidia Ampere GPUs that used GDDR6X memory actually ran it at lower clock speeds but used PAM4 signalling to double the data transmitted per cycle, ending up at higher data rates than standard GDDR6.
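The arithmetic behind that, as a rough sketch (PAM4 signals four voltage levels, i.e. two bits per symbol, versus NRZ's one):

```python
def data_rate_gbps(symbol_rate_gbaud: float, bits_per_symbol: int) -> float:
    """Per-lane data rate = symbol (clock) rate x bits per symbol."""
    return symbol_rate_gbaud * bits_per_symbol

print(data_rate_gbps(32, 1))  # NRZ at 32 GBd   -> 32 Gb/s per lane (PCIe 5.0-class)
print(data_rate_gbps(32, 2))  # PAM4, same clock -> 64 Gb/s per lane (PCIe 6.0-class)
print(data_rate_gbps(16, 2))  # or hold the data rate and halve the clock (GDDR6X-style)
```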
 

Pierce2623

Upstanding
Dec 3, 2023
173
156
260
That power is burned at the PCIe controllers in the endpoints and any intervening switches. Perhaps also PCIe retimers, but I'm not sure they're concerned about those.

While I suppose the copper traces in the PCB might heat up a little bit, I'm pretty sure that's not what they're worried about.
You’re right. It’s 100% the PCIe controllers that heat up rather than the traces themselves. Also, a lot of the heat is coming from device controllers running at higher speeds to match the PCIe controllers. If one actually reads the article, it’s pretty clear that’s the sort of thing they’re talking about.
 

PEnns

Reputable
Apr 25, 2020
668
720
5,770
Maybe Intel, considering its vast experience with generating heat, should add another profitable business to its business models: Creating really hot blast furnaces!
 

slightnitpick

Proper
Nov 2, 2023
144
89
160
I'm definitely not even close to being an expert but it seems there are three realistic options to address this:
  1. Move parts closer so they lose less electricity as heat in transmission, and thus need less electricity. Possibly fragmenting things, such as breaking apart a single graphics card into a handful of cards nearer to the CPU/memory (or vice versa).
  2. Sincerely start transitioning to optical interconnects.
  3. Crap, I sincerely forgot the third one.
 
Maybe Intel, considering its vast experience with generating heat, should add another profitable business to its business models: Creating really hot blast furnaces!
Still better than AMD, which is slowly turning into an explosives company in its attempt to keep up.
Hot but not exploding is still better than hot but also exploding.
 

bit_user

Polypheme
Ambassador
Maybe Intel, considering its vast experience with generating heat, should add another profitable business to its business models: Creating really hot blast furnaces!
Well...

Also, I saw this covered on this site, but am having trouble finding the article about their work on 2 kW cooling solutions:
 

bit_user

Polypheme
Ambassador
I'm definitely not even close to being an expert but it seems there are three realistic options to address this:
  1. Move parts closer so they lose less electricity as heat in transmission, and thus need less electricity. Possibly fragmenting things, such as breaking apart a single graphics card into a handful of cards nearer to the CPU/memory (or vice versa).
Servers are already about as dense as they can be, and I think the PCIe specs stipulate what distances they need to support. Anyway, if you're an SSD or CPU maker, it's somewhat out of your hands how closely together a server manufacturer decides to place them, as long as it's within spec.

Furthermore, I don't know (but maybe someone does?) just how much of the power is being consumed by driving the I/O vs. implementing the signalling & protocol. In the analog domain, PCIe 6.0 shouldn't really differ from PCIe 5.0, since they both run at the same clock speed. At that level, the main difference would just be any added SNR needed to support the PAM4 encoding.

2. Sincerely start transitioning to optical interconnects.
Yeah, there's already a PCIe working group that's looking at optical. At this point, we know PCIe 7.0 will still support copper, but I haven't heard anything about 8.0.
 
  • Like
Reactions: slightnitpick
Definitely seems like a lot of people posting in this thread have no clue what the use case would be. It's definitely only for server applications and, just a guess on my part, it seems like it would be geared mostly towards high-end NIC applications. This type of thing would be perfectly fine to temporarily limit max performance when needed.

As for PCIe 6.0, I'd be surprised if we saw client adoption anywhere near as quickly as PCIe 5.0.
 
  • Like
Reactions: bit_user

Notton

Prominent
Dec 29, 2023
466
412
560
I'm definitely not even close to being an expert but it seems there are three realistic options to address this:
  1. Move parts closer so they lose less electricity as heat in transmission, and thus need less electricity. Possibly fragmenting things, such as breaking apart a single graphics card into a handful of cards nearer to the CPU/memory (or vice versa).
Sooo.... UCIe?

UCIe is great for making a chiplet SoC, but I'm not sure it is ideal for high-speed storage.
 
  • Like
Reactions: slightnitpick

Alex Atkin UK

Distinguished
Jun 11, 2012
52
2
18,545
Still better than AMD, which is slowly turning into an explosives company in its attempt to keep up.
Hot but not exploding is still better than hot but also exploding.
A broken CPU is a broken CPU. Intel CPUs are literally killing themselves due to too much power and heat, as Intel deliberately allowed motherboard vendors to use an uncapped power limit and just let the CPU thermal throttle.

AMD had a firmware bug at launch that pushed too much power to the I/O chip; it didn't improve performance, and it was never necessary in the first place.

I own Intel and AMD, and it's Intel that more often than not lets their CPUs run hot, especially in laptops and other small-form-factor devices. In fact, the problem I've had on AMD is them being too aggressive about cutting back power consumption on their ultra-low-voltage SoCs.
 

jlake3

Distinguished
Jul 9, 2014
56
74
18,610
Again, an RTX 4090 only suffers a 2% performance loss when used on a PCIe 3.0 x16 slot.

What possible consumer-oriented product is going to saturate a PCIe 5.0 x16 slot?
Just because it exists in the server market doesn't mean it has to be used at the consumer level.

It's like SAS, RDIMM, etc.
I'm not saying it's going to be needed now, or next year, or the year after, but it's the next generation of a standard that is widely, widely used at the consumer level, and there's also been a trend towards running smaller links at higher speeds rather than increasing the number of lanes that CPUs make available and that devices can utilize. One product totally saturating a 5.0 x16 link is unlikely, but if the link is split x8/x4/x4, drops to an older version, and also cuts the width shared among those three devices to x8, how well future devices handle that might be something to watch. Not an immediate concern, but a trend to possibly keep an eye on (rough numbers sketched below).
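A rough sketch of that scenario (approximate per-lane numbers, one direction, protocol overheads glossed over; the bifurcation itself is hypothetical):

```python
# Approximate one-direction throughput per lane, in GB/s.
PER_LANE_GBS = {"4.0": 1.97, "5.0": 3.94}

def bw(gen: str, lanes: int) -> float:
    return PER_LANE_GBS[gen] * lanes

print(bw("5.0", 16))  # one device owning the full x16 link: ~63 GB/s
# Hypothetical x8/x4/x4 bifurcation that also falls back a generation:
print(bw("4.0", 8), bw("4.0", 4), bw("4.0", 4))  # ~15.8 / ~7.9 / ~7.9 GB/s
```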

SAS existed more in parallel to SATA, and RDIMM has been a variant on each version of DDR instead of a direct successor.

Saying that PCIe 6.0 and beyond won't be used on the consumer level means we've either reached the end state for client computing, or we're gonna have to brace ourselves for a compatibility-breaking change.
 
  • Like
Reactions: bit_user

bit_user

Polypheme
Ambassador
Saying that PCIe 6.0 and beyond won't be used on the consumer level means we've either reached the end state for client computing, or we're gonna have to brace ourselves for a compatibility-breaking change.
I didn't think PCIe 5.0 would reach consumers so early. Since it did, I wouldn't be too surprised if 6.0 also gets here. Relatively speaking, I'd guess 6.0 doesn't necessarily use that much more power, but the issue for using it in consumer products might be the additional PCB costs (more layers, retimers, materials, etc.) needed to support the higher SNR requirements.

PCIe 7.0... that one I'm really not sure consumerland will see. It returns to doubling the frequency, for one thing. So, if client devices still need bandwidth beyond PCIe 6.0, maybe photonics will be sufficiently developed by then. PCIe has had a great run, but it'll end eventually.
 

Notton

Prominent
Dec 29, 2023
466
412
560
Saying that PCIe 6.0 and beyond won't be used on the consumer level means we've either reached the end state for client computing, or we're gonna have to brace ourselves for a compatibility-breaking change.
I think we have.
PCIe 4.0 wasn't too difficult or costly to implement,
whereas PCIe 5.0 has knocked mobo pricing out of the ballpark.

If cost wasn't enough, there is also the aspect of power efficiency.
There has been a long push towards better efficiency, and the custom PC market only barely avoided switching to ATX 12VO.
Making the mobo power-hungry and having it produce more heat, when everyone hates whiny little chipset fans? That is a bold move.

If PCIe 6.0 does arrive for consumers, I don't expect the mobo to cost any less than $800.
 

bit_user

Polypheme
Ambassador
I think we have.
PCIe 4.0 wasn't too difficult or costly to implement,
whereas PCIe 5.0 has knocked mobo pricing out of the ballpark.

If cost wasn't enough, there is also the aspect of power efficiency.
There has been a long push towards better efficiency,
Consider that PCIe 6.0 runs at the same frequency as PCIe 5.0 and the main source of additional power is its encoding. Unlike I/O, the logic implementing that can potentially improve in energy efficiency with more advanced process nodes. It should absolutely be an efficiency win to replace a PCIe 5.0 link with a PCIe 6.0 link that's half the width.
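On paper, the bandwidth side of that claim checks out; a minimal sketch, assuming the usual approximate per-lane figures and ignoring FLIT/FEC overhead:

```python
GEN5_LANE_GBS = 3.94               # approx. usable GB/s per PCIe 5.0 lane
GEN6_LANE_GBS = 2 * GEN5_LANE_GBS  # same symbol clock, 2 bits/symbol via PAM4

# A half-width 6.0 link matches a full-width 5.0 link:
print(GEN5_LANE_GBS * 4)  # PCIe 5.0 x4: ~15.8 GB/s
print(GEN6_LANE_GBS * 2)  # PCIe 6.0 x2: ~15.8 GB/s with half the lanes driven
```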

Another argument for PCIe 6.0 to reach consumers is its support by CXL 3.0. If CXL ever trickles down (and signs are that it will), then I'd say PCIe 6.0 is likely to follow.

Also, to the extent that PCIe 6.0 controllers are burning lots of power, consider that they're potentially talking about server CPUs implementing 128+ lanes, which is on a very different scale from what any consumer CPU implements.

If PCIe 6.0 does arrive for consumers, I don't expect the mobo to cost any less than $800.
I'm not sure what that's based on, but let's wait and see how much cost it adds to server boards.