[SOLVED] Dell Precision 3660 water cooled I9 12900K overheating - any solutions / fixes?

May 9, 2022
We purchased two identical Dell Precision 3660 machines with the i9-12900K, and we can push both of them into the dreaded 0.78 GHz "freeze mode" by compiling a fairly large C++ program on them (about 6 minutes of compile time).

Initially Intel Processor Diagnostic Tool (4.1.5.37.W.MP) was enough to trigger the presumed thermal issue, but a cooler replacement by Dell seems to have solved that on at least one of the machines.

Disabling the turbo frequency in the BIOS solves the issue, but it costs ~20% in compile time (and I don't want to give up the performance).

Is there anything that I as a layman can do? Are there any special Dell throttling programs that I could use?


9fwM43O.png

Max temps and power after a "fatal" run.

hHwmazx.png

Post run - power cycle needed to recover.

Cheers
Laro
 
Solution
The red 100% numbers indicate that the processor has throttled.
A 12900K depends on good cooling to get the max out of the processor.
In Windows power options, try changing the maximum processor state from 100% to 99% and use the Balanced power plan.
There may be other options in the bios, but I am not familiar with them.
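If you would rather script the 99% setting than click through the dialogs, here is a rough Python sketch. It only builds the `powercfg` command lines and prints them; it does not change anything by itself. It assumes the usual powercfg aliases `SCHEME_BALANCED`, `SUB_PROCESSOR` and `PROCTHROTTLEMAX` are present on your install (check with `powercfg /aliases`), and the function name is made up for illustration.

```python
def powercfg_commands(max_cpu_percent=99):
    """Build powercfg argument lists that cap the maximum processor state.

    Assumption: the aliases SCHEME_BALANCED, SUB_PROCESSOR and
    PROCTHROTTLEMAX exist on this Windows install.
    """
    if not 1 <= max_cpu_percent <= 100:
        raise ValueError("max_cpu_percent must be between 1 and 100")
    scheme = "SCHEME_BALANCED"
    return [
        # Switch to the Balanced plan.
        ["powercfg", "/setactive", scheme],
        # Cap the maximum processor state while on AC power.
        ["powercfg", "/setacvalueindex", scheme,
         "SUB_PROCESSOR", "PROCTHROTTLEMAX", str(max_cpu_percent)],
        # Re-activate the plan so the new value takes effect.
        ["powercfg", "/setactive", scheme],
    ]


if __name__ == "__main__":
    for cmd in powercfg_commands(99):
        print(" ".join(cmd))
```

Run the printed commands from an elevated prompt, and use 100 instead of 99 to put things back.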

You may be able to help the situation by replacing the radiator fan with one that runs at considerably higher rpm.
If the case has front intake fans, see if their performance can't also be improved.

It looks like the case and PSU may be proprietary, so replacing the case may not be a good option.

If Dell cannot fix your issue, I would try to replace the units.

Lutfij

Titan
Moderator
workstations-precision-3660-gallery-6.psd

If this is the rear of the prebuilt system you're working with (and I'm assuming they came equipped with a measly 120mm AIO), then my best suggestion would be to return them...not kidding! That processor needs at least a 360mm AIO to keep it in check, or perhaps an air conditioner blowing at it at 18 °C.

Gamers Nexus did a review of a similarly designed chassis, which effectively crippled the processor's performance:
https://www.youtube.com/watch?v=UnvxSkqJ8ic
 
May 9, 2022
Hi @Lutfij & all.

Yup - that is the chassis - I actually ordered the water cooled version - however this is not enough to keep it alive.

I expect Dell to contact Intel, and then there will be a BIOS update that slows everything down further. We have two identical systems and some internal C++ compile time benchmarks to compare against. The cooling solution in one of our identical computers actually resulted in compile times being ~10% slower...

Pictures for reference, an Asetek solution it seems, and a labeled pump.

oPnghrm.jpg


VCRhxvv.jpg
 

KyaraM

Admirable
A 120mm radiator cannot tame that CPU...

How good is the BIOS on that thing? Can you, for example, limit power consumption? That should help a bit even if it costs some performance, but your issue certainly costs more. If not, download Intel XTU and adjust things there. You can also use it to undervolt the CPU, which helps a lot with temps, too.
 

Eximo

Titan
Ambassador
Disabling turbo is probably the best thing with the stock system.

You could take it out of the chassis and run it as an open bench with a different cooler. Since the board is non-standard, that is not going to be straightforward.

The proper answer would be to return these and get something from the real workstation line. They won't have Alder Lake, but they will have room for larger coolers and even multi-socket motherboards.
 
May 9, 2022
Short followup.
The machines (two identical) are brand new Dell units, built to compile faster than our present (fallback) machines.
Dell is being quite supportive and has replaced the motherboard, which did not solve the issue, and will now replace the CPU before involving the "next level" of support.
A strange thing is that the water cooler fan is connected to the chassis fan connector on the mobo and not the CPU fan connector. Dell confirmed that this is the right placement (and the color coding corresponds to it); however, it might be a hiccup by the design department.

Story continues after the CPU replacement :)
 
Quick update. CPU replaced with identical hardware by DELL, did not resolve the problem :-(
If the CPU is overheating, why are you replacing the motherboard and CPU? This doesn't make any sense.
Cooling needs to be improved.

Have the CPU cooler replaced. An AIO with a larger radiator (240mm, 280mm, 360mm) is the best option there.
The PC case will need to be replaced too. Choose one with high airflow and a mesh front.

You may also try a high-end air cooler instead, such as the Noctua NH-D15. Not sure if that fits in your current PC case though.
 

KyaraM

Admirable
Wasn't likely to since it's clearly a cooling issue, not a faulty CPU. Not sure what made you think it is. If you don't get rid of the problem, the problem won't be fixed, naturally. I made some suggestions for how to actually try and reduce temps above by changing settings in the BIOS/using XTU, you may want to look into it. But as others said, the best thing to do is getting rid of that cooler and getting something decent.
 
May 9, 2022
Hi @KyaraM and @SkyNetRising.

I agree with you that it most likely is a cooling issue; however, these are off-the-shelf machines from DELL that were purchased to compile, so I won't tamper with them myself but will let DELL sort it out under warranty.

I have been thinking about the thermal situation for a bit. My best guess is that the heat from the CPU is heating the motherboard too much, which is the root cause of the change to 0.78GHz and of requiring a full power cycle.

Any chance anyone knows which peripheral chip is responsible for the 0.78-0.8 GHz lowest clock mode?
 
Sounds like support does not understand the problem.
It looks to me like a case design that is not capable of sufficient cooling for a high end processor.
If the motherboard is a standard MATX format, then possibly changing out the case could fix the issue.
Since Dell is willing to swap parts, see if they can't replace your processor with an i9-12900KS.
The KS version is a better binned processor that will operate at lower voltages and generate less heat.
On your current setup, with a 120mm size AIO, about the only thing you can do is to replace the AIO fan with one that pushes more air at higher rpm.
That will come with more noise, and still may not fully let the processor run to its potential.

What is the processor you are currently using?
How do compile times on the new machines compare to the old?

If you are stuck with that case, there are external cooling solutions.
Here are two links to get you going:
https://modmymods.com/alphacool-eiswand-360-cpu-all-in-one-external-cooling-system-11026.html

Can you supply a snapshot like in your initial post while the compile job is actually running?

If it shows that the compile is largely single-threaded, you can perhaps reduce heat and optimize the process by reducing the number of cores used.
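As a rough illustration of running the build on fewer cores, here is a small Python sketch that picks a reduced parallel job count. The helper names are hypothetical, and `make -j` is only a stand-in since the actual build tool is not known; substitute whatever parallelism switch your build system uses.

```python
import os


def reduced_job_count(logical_cpus, fraction=0.5):
    """Return a parallel job count that uses only part of the machine."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Never go below one job, even on very small machines.
    return max(1, int(logical_cpus * fraction))


def build_command(jobs, tool="make"):
    """Assumption: the build tool accepts a make-style -jN switch."""
    return [tool, f"-j{jobs}"]


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    jobs = reduced_job_count(cpus, fraction=0.5)
    print(" ".join(build_command(jobs)))
```

Fewer jobs means lower package power and less heat, at the cost of a longer build, so it is worth timing a few values to find the point where the machine stays stable.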
 

KyaraM

Admirable
You don't need to tamper with anything if you use XTU. It's a program that runs under Windows, doesn't permanently change any BIOS settings and can be uninstalled at any time, but it can be a big help. Considering that support apparently hasn't caught on yet that the issue isn't faulty components but insufficient cooling, I wouldn't exactly trust them to handle the issue, honestly... that's only wasting time.

What would cause the mainboard to throttle would be the VRMs around the CPU. They might be overheating, but that's usually due to the CPU drawing too much power, not excess heat from it.
 
Jun 25, 2022
I currently have a Dell Alienware R13 machine with the 12900KF, an Nvidia GeForce RTX 3090 graphics card, 64 GB DDR5, a 2 TB hard drive, a 1000 W PSU and the infamous 120mm Cryo Cooled System. Basically the machine is fully loaded. The reason I've ended up in this forum is that I have been seriously thinking of replacing / abandoning my Alienware R13 for the Dell Precision 3660, in the hope that Dell has correctly addressed the engineering flaws inherent in the design of the R13. Based on the dialog above, I can clearly see that the issue I'm experiencing on the R13 is NO different than what's actually happening with the Dell Precision 3660.

As we all know, the new Intel i9-12900 processor runs HOT. Very hot: it consumes 240 watts when fully loaded. To address the heat issue, Dell in its infinite wisdom (sarcasm inserted here) paired the hottest chip with the smallest radiator/cooler on the market! According to Intel's guidance, under load the processor will run all 16 cores at 240 watts (maximum power) for as long as there is a sustained load on the CPU. Dell must have known they had an engineering issue in not being able to provide sufficient temperature management (insufficient cooler), so at 56 seconds the PL2 window kicks in and the power limit is reduced from 240 watts to 210 watts. This PL2 limit is built into the BIOS of the machine to effectively "power" throttle the CPU so it doesn't thermally overheat! If the load continues for longer, the PL1 limit kicks in, again reducing the power limit, this time from 210 watts to 160 watts! By simply NOT providing proper cooling, Dell takes an amazing i9-12900KF processor and limits its performance by as much as 16%, making it about as effective as an Intel 12700K, and that is not what I paid for!

Based on my initial research, a 240mm cooler/fan could keep the processor up to 7 °C cooler than the current infamous 120mm cooler. Going by the manufacturer's guidance and assuming the load is sustained, my question is: how hot will the processor get, and could a 240mm or 280mm cooler/radiator really solve this problem and manage the temps without power throttling the CPU, potentially bypassing the PL2 and PL1 limits imposed by Dell?

Thank you,
Speed racer
 

KyaraM

Admirable
You got it backwards. PL1 and PL2 are Intel-defined limits, not mainboard manufacturer limits. Whether the motherboard manufacturer uses them as default settings, as they should, is up in the air, but that's not Intel's doing. So except for the very weird and, according to Intel specs, nonsensical stepping, the power limits, at least, work as intended. Cooling not so much, but the CPU throttles according to temperature anyway, and that won't change no matter your settings. Get a decent cooler and you won't have temperature issues anymore, but if this power limit behaviour is hardcoded, that won't change.
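To give a feel for how a PL1/PL2 scheme behaves in general, here is a simplified toy model in Python. It is not Dell's or Intel's actual firmware logic: it just assumes the chip may draw up to PL2 while a moving average of package power (time constant tau) is still below PL1, and is held to PL1 once the average catches up. The 160 W, 240 W and 56 s figures are taken from the post above purely as illustration, and the function name is made up.

```python
def simulate_power_limit(pl1_w, pl2_w, tau_s, duration_s, dt_s=1.0):
    """Toy model: allowed package power per time step under sustained load.

    Assumption: the limit is PL2 while an exponentially weighted average
    of drawn power is below PL1, and PL1 afterwards. Real firmware is
    more involved and also reacts to temperature.
    """
    avg_w = 0.0  # start from an idle machine
    trace = []
    for _ in range(int(duration_s / dt_s)):
        allowed_w = pl2_w if avg_w < pl1_w else pl1_w
        trace.append(allowed_w)
        # Sustained load: the CPU draws everything it is allowed to.
        avg_w += (allowed_w - avg_w) * (dt_s / tau_s)
    return trace


if __name__ == "__main__":
    trace = simulate_power_limit(pl1_w=160, pl2_w=240, tau_s=56, duration_s=200)
    boost_seconds = trace.count(240)
    print(f"ran at PL2 for about {boost_seconds} s, then dropped to PL1")
```

In this model the boost window ends on its own after roughly tau seconds of full load regardless of cooling, which is why a bigger radiator helps with temperature throttling but not with a hardcoded power limit.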
 
May 9, 2022
Dell has released a couple of BIOS updates, and the machine can now finally run my C++ build without failing.
I believe that compile performance has degraded by roughly 10-20% as a result of the BIOS upgrades. My guess is that the clock controller now slows down more aggressively to prevent the heat buildup that the water cooling system apparently could not handle properly.
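For anyone wanting to put a number on a slowdown like this, a minimal Python timing sketch might look like the following. The helper names are hypothetical and the build command is whatever you normally run; nothing here is specific to Dell or to a particular compiler.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def percent_slower(baseline_s, new_s):
    """How much slower new_s is than baseline_s, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (new_s - baseline_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Placeholder command; replace with your real build invocation.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"build took {elapsed:.2f} s")
```

Timing the same clean build before and after a BIOS update, a few runs each, gives a more reliable comparison than a single run.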