Discussion On troubleshooting temperature issues for CPUs since Intel 12th gen and AMD Zen 4

Since Intel 12th gen and AMD Zen 4, understanding temperature values for the CPU has gotten pretty confusing. Of note, AMD has officially said that Zen 4 is meant to operate near its Tj_max value. Intel, as far as I know, hasn't said anything officially, but the ARK website defines the maximum temperature like this:
This is the maximum operating temperature allowed as reported by temperature sensors. Instantaneous temperature may exceed this value for short durations.

Between the two of them, they list the maximum temperature of the CPU at 95-100C. I wouldn't say this is a problem, as semiconductor chips have another 50-60C to go before reaching the thermal runaway condition. That's the point where the chip's electrical resistance starts to go down, which causes higher current, which causes higher temperature. And the CPUs will start doing what they can to keep the temperature from going above that maximum anyway (to the point where they can survive the worst-case failure: no CPU cooler at all).

I also understand that a lot of people want their CPUs to perform at their best, but sometimes the CPU can't reach the higher end of its boost clock speeds. A lot of people call it thermal throttling when the CPU can't maintain boost speeds at higher temperatures. While yes, that's technically thermal throttling, it may not mean the CPU is in a condition where it's a problem. The minimum requirement for the cooling system, I feel, hasn't changed since the dawn of time: can it keep the CPU's temperature from climbing while the CPU runs at its base/minimum guaranteed specifications? I think people have forgotten that boost speeds are meant to be a bonus, not a promise.

Although annoyingly, Intel got rid of the base specifications for P-cores, stating that CPUs are supposed to swing between a variety of clock speeds so it doesn't make sense to have one.

So in any case, I think the following should be checked before declaring that there's a problem with temperature or the cooling system.
  • What's being done while this temperature is reached?
    • Obviously sitting around doing nothing at 95C is a problem, but running Cinebench may not be.

    • Idle temperatures, I feel, are something that's constantly under discussion. I've seen people want nothing over 40C. I've seen people who are fine at somewhere between 40-50C. And I'm pretty sure I've heard someone say 50-55C is fine because temperature only really matters under load.

      I'm starting to think temperature under load is more important than temperature when idling, because that'll definitely tell us if the CPU and/or cooling system is functioning properly. If the CPU can't perform under load, then the idle temperature doesn't really matter anymore.
  • What's the power consumption of the processor?
    • In the absence of power consumption measuring hardware (like, say, a Kill-A-Watt), the next best thing we have is software that can pull data from sensors, like HWiNFO.

    • If power consumption is near the rated TDP at 95-100C, technically this is still not a problem, because this is roughly how the CPUs were designed to run.

      We could argue that if consistent performance can't be maintained in this condition then the cooling system is at fault. But we could also be chasing an endless number of rabbits here.

      Also there are Intel's processors to consider, since they have two power levels. I side with the view that the cooling system needs to meet PL1 requirements rather than PL2, despite the fact that PL2 tau is infinite by default on some models. This is only to keep some level of sanity, because I don't think any cooler short of phase change cooling keeps anyone satisfied with an i9 running full bore.

    • If power consumption is less than the rated TDP but something is at 95-100C, then we'd have to figure out what else is going on. A 1-2 core workload can spike temperatures for that part of the CPU, while the rest is fine. But if it's an all core workload, then it's looking like a problem with the cooling system.
So for a quick and dirty test to see if things are within the expected range, run Cinebench with HWiNFO (or better, a Kill-A-Watt) to monitor power consumption and temperature. If you really want to be damned sure things are being pushed, Cinebench can be swapped for Prime95, but I think Cinebench is good enough.

The only time I'd say there's a problem with the cooling system somewhere is if power consumption is drastically lower than the rated TDP but the temperature is closer to Tj_max.
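
To make the checklist above concrete, here is a rough Python sketch of the same reasoning. Everything in it is my own illustration: the `Reading` fields, the `triage` function name, and especially the thresholds (within 5C of Tj_max counts as "near", 90% of rated power counts as "near TDP", 60% or less counts as "drastically lower") are assumptions, not numbers from Intel or AMD. Plug in whatever you read off HWiNFO or a Kill-A-Watt during a Cinebench run.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    temp_c: float            # hottest CPU temperature seen during the run
    package_power_w: float   # average CPU power during the run
    all_core_load: bool      # True for Cinebench/Prime95-style all-core runs
    idle: bool = False       # True if nothing was running


def triage(reading, tj_max_c, rated_power_w,
           near_tjmax_margin_c=5.0,    # assumed: what counts as "near Tj_max"
           near_power_fraction=0.9,    # assumed: what counts as "near TDP"
           low_power_fraction=0.6):    # assumed: "drastically lower" than TDP
    """Return a short verdict string following the checklist in the post."""
    near_tjmax = reading.temp_c >= tj_max_c - near_tjmax_margin_c
    if not near_tjmax:
        return "ok: not near Tj_max"
    if reading.idle:
        return "problem: near Tj_max while idle"

    power_ratio = reading.package_power_w / rated_power_w
    if power_ratio >= near_power_fraction:
        return "expected: near rated power at Tj_max"
    if not reading.all_core_load:
        return "inconclusive: a 1-2 core load can spike part of the CPU"
    if power_ratio <= low_power_fraction:
        return "problem: check the cooling system"
    return "suspect: below rated power on an all-core load"


if __name__ == "__main__":
    # Example numbers are made up for illustration.
    print(triage(Reading(97, 250, all_core_load=True), tj_max_c=100, rated_power_w=253))
    print(triage(Reading(96, 120, all_core_load=True), tj_max_c=100, rated_power_w=253))
```

The "suspect" branch is the gray area between the last bullet and the sentence above: below rated power on an all-core load looks like a cooling issue, but I'd only call it a definite problem once the power is drastically lower.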
 
Intel shows the base clocks for P-cores. What other "base specifications" did they use to show that they don't now? If you mean the turbo table, since that's the only thing that comes to mind that they don't show anymore, that's hardly "base" anything.

Also there are plenty of cheap coolers that will get close enough to the 253W TDP that nobody will care.
https://www.tomshardware.com/features/intel-core-13900k-cooling-tested/2
"The $20 Assassin 120 R SE sustained 5055MHz (an increase of 333MHz) with the CPU consuming an average of 245W."
https://www.tomshardware.com/reviews/amazon-basics-cpu-cooler-review/2
This one shows the AG400 as well as the Assassin. Both are fine for "basic" usage of the 13900K/14900K, being able to run it at close enough to its full 253W.
 
