[SOLVED] CPU pegging 99C, water cooler fail?

txbdan

Distinguished
Feb 2, 2012
18
0
18,510
Hey all,

I've had a build that's been running for about four years and doing well. It's an i7 6700k, not overclocked, with a Corsair H100i v2 water cooler setup.

I recently noticed that under load, temperature spikes almost instantly to 85C and within 3 seconds hits 99C on a core or two, and then the CPU throttles back. Under normal browsing-type use the cores stay around 40C and it doesn't seem to escalate. I'm not even sure how long it's been in this state as I haven't been gaming a lot lately. I was actually playing iRacing when I noticed [strike]sxxxx[/strike] performance, which got me looking into things. I guess I was playing iRacing for 30-minute or so sessions at 99C bouncing off the CPU throttle?... yikes.

I assumed my water pump had died, so I connected the USB cable to the cooler and installed iCue. It reports that both the fans and the water pump are alive and well.

So I guess maybe the water block has loosened from the CPU? I ordered some new thermal paste, so I'll reseat it tomorrow and see how it fares.

In the meantime, could anything else be the culprit here? Can the liquid leak, or maybe air get into the system?

Thanks!

<MODERATOR EDIT>

TOM'S IS A FAMILY FRIENDLY WEBSITE!
PROFANITY IS STRICTLY FORBIDDEN AND IS A BANNING OFFENSE!

WATCH YOUR LANGUAGE!
 
Solution



Many pumps get contaminated with hard water deposits or algae-like growth after a few years, so at 3-4 years such a failure (an ever-increasing drop-off in cooling effectiveness) would actually be semi-expected...

(Some folks can clean their pumps and fluid blocks with assorted solvents, a hard brush and elbow grease, giving them a new lease on life.)
 

txbdan

Using iCue, I forget exactly, but I want to say the fans and pump were both around 2000rpm. Maybe 2040, 2070. One of the two also dropped to the 1800ish range. There was definitely nothing in the 3000s, even when the CPU temps were in the 90s. I remember coolant temp being around 42C. But this was only running a stress test for like 10 seconds. And again, the CPU temp spikes were near instant.

Radiator and the two radiator fans looked pretty clean.
 

CompuTronix

Intel Master
Moderator



txbdan,

There's a flow problem due to a partial blockage, which is typical for a unit that old. Most AIO's fail well before that time frame.

Apart from the potential for a coolant leak, there are four problems common to AIO's:

(1) All AIO's will eventually fail. It’s not a question of if; it’s a question of when. Pumps can have component failures in the electronics that drive the impeller, which is a moving part that can wear out, so those which run 24/7/365 are prone to premature failure. AIO units are notorious for failures due to inferior pump quality, whereas custom loops typically use high-end pumps which have greater longevity.

(2) Coolant can slowly evaporate over time due to "permeation". However unlikely it may seem, coolant can actually dissipate directly through the tubes. This can introduce bubbles into the impeller chamber which can cause it to audibly "cavitate", thereby reducing or interrupting coolant flow.

(3) AIO's are sealed Closed Loop Coolers (CLC), which unlike custom loops, are not designed to be disassembled for the water block and impeller to be flushed, cleaned of bio-contaminants and radiator sediments, then refilled. Bio-contaminants gradually accumulate over time and will eventually clog the water block and radiator tubes, thereby reducing flow and thermal efficiency. This is evidenced by Core temperatures that slowly increase over periods of several months to a few years.

(4) Also unlike custom loops, AIO's use dissimilar metals (aluminum radiator / copper water block). This causes galvanic corrosion which produces sediments that accumulate over time, resulting in blockages and flow problems. Even new AIO's may contain radiator sediments due to inadequate flushing after manufacturing. Experienced builders of custom loops will always thoroughly flush brand-new radiators. Flux, solder and metal fragments are typically found in the flush water when it's poured through a strainer.

For no apparent reason ... OR ... by simply installing, moving, tipping, handling or otherwise disturbing a NEW OR OLD unit, particles can become dislodged, whereupon at the next power-up, the now free-floating particles can circulate into the impeller chamber and restrict or completely stop the impeller. Since the impeller is magnetically coupled to the stator (no direct shaft), the unit may "appear" to be running while the impeller is restricted or stalled. "Hearing" the unit running or "feeling" vibration can be deceiving as it does not necessarily indicate flow, nor does Pump RPM in BIOS or various software utilities. Pump vibration is often mistaken for fan vibration, which can "telegraph" throughout the entire AIO unit.

Without an actual in-line sensor, proper flow is difficult to confirm. Under normal operation, even with the CPU at 100% workload, there should be only a minimal temperature differential between the tubes. However, if one tube is hot while the other is cool, or the water block is hot while the radiator is cool, it indicates little to no flow. Surface temperatures can be verified with an infrared (IR) thermometer.
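
As a rough sketch of that IR check (not a Corsair procedure): the function below takes four surface readings and flags a large gap between the two tubes, or between the water block and the radiator. The function name and the 10°C cutoff are assumptions picked for illustration; the only guidance given above is that the difference should be minimal.

```python
def flow_looks_restricted(tube_a_c, tube_b_c, block_c, radiator_c, max_delta_c=10.0):
    """Return True if IR surface readings hint at little or no coolant flow.

    max_delta_c is an assumed cutoff for illustration, not a manufacturer figure.
    """
    # One tube hot while the other is cool
    tube_delta = abs(tube_a_c - tube_b_c)
    # Water block hot while the radiator stays cool
    block_to_radiator = block_c - radiator_c
    return tube_delta > max_delta_c or block_to_radiator > max_delta_c


if __name__ == "__main__":
    # Hypothetical readings in °C, not measurements from this thread
    print(flow_looks_restricted(38.0, 36.5, 40.0, 37.0))
    print(flow_looks_restricted(55.0, 30.0, 60.0, 31.0))
```

Readings that sit close together on all four surfaces would not be flagged; a hot block next to a cool radiator would be.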

Since your AIO is about 4 years old, it's highly likely that bio-contaminants and particles from galvanic corrosion have combined to impede coolant flow throughout the unit.

Fortunately, Corsair has a five year warranty, which much to their credit, they honor without difficulty. I suggest that you submit an RMA request for a warranty replacement cooler. In the interim, a backup air cooler should suffice.

CT :sol:
 

txbdan

Thanks for the info.

I got some Cool Laboratory Ultra in and reapplied the water cooler block. No dice, exact same performance. I then found an old 1151 Celeron stock Intel fan that I had, so I thought I'd try that. I knew it'd run toasty, but it pretty much performs the same as well. Idles around 33C and jumps to the high 90s in a few seconds. I'm using the RealBench stress test.

I slightly question my application of the thermal paste. This metallic stuff was weird and seemed sort of lumpy. When I took off the water block after that test failed, it looked like it didn't spread all that nicely. Hard to say, I guess.

Would you have expected the Celeron fan to perform that poorly? I thought it'd at least perform differently.

Next up is a new HSF or water cooler? hrmm
 

CompuTronix

txbdan,

(1) You put a 65 Watt TDP cooler on a 91 Watt TDP processor for which Intel specifies a 130 Watt TDP stock cooler (sold separately), so that's a problem.

Intel BXTS15A stock cooler 130 Watts TDP for 6th & 7th Gen 91 Watt processors, and for 8th & 9th Gen 95 Watt processors - https://www.newegg.com/Product/Product.aspx?Item=9SIAH417YP0354&Description=Intel%20BXTS15A&cm_re=Intel_BXTS15A-_-35-203-026-_-Product

(2) In Real Temp, if you ran the Sensor Test, it requires that you download Prime95 ... however ... if you downloaded and ran a version later than 26.6 (such as 29.4), then that's also a problem.

Prime95 versions from 27.7 through 29.4 run "AVX" code on your processor, which has AVX "Instruction Sets". Version 26.6 runs a true 100% workload, but later versions with AVX impose an absolutely brutal workload at about 130%. This is a more extreme workload than any other utility, which is completely unrealistic, and will cause Core temperatures to run 20°C higher than non-AVX version 26.6. As per Intel’s Datasheets, TDP and Thermal Specifications are validated “without AVX”.

Intel tests their processors at a steady 100% TDP workload to validate Thermal Specifications. Utilities that don't overload or underload your processor will give you a valid thermal baseline. Here’s a comparison of utilities grouped as thermal and stability tests according to % of TDP, averaged across six processor Generations at stock settings rounded to the nearest 5%:

All tests will show 100% CPU Utilization in Windows Task Manager, which indicates processor resource activity, not % TDP workload. Core temperatures respond directly to Power dissipation (Watts), which is driven by workload. Prime95 v26.6 Small FFT’s provides a true and steady 100% workload, so if Core temperatures don't exceed 85°C, then your processor should run the most demanding real-world workloads without overheating.

You might want to read this: Intel Temperature Guide - http://www.tomshardware.com/forum/id-1800828/intel-temperature-guide.html

If you did indeed run version 29.4, then of course your Core temperatures will immediately reach throttle temperature at 100°C. But even if you ran version 26.6, your 91 Watt TDP 6700K will still throttle on a 65 Watt TDP cooler, which unfortunately, doesn't much help you to prove that your AIO is failing. The only revealing indication is your lower idle temperature around 33°C, which is certainly lower than the 40°C you originally posted with your AIO.

Instead of Real Temp, you need to run "Core Temp" so you can monitor Power consumption "Watts" to observe thermal behavior on light to moderate workloads below the cooler's 65 Watt cooling capacity, such as just surfing the Internet.

• Core Temp - http://www.alcpu.com/CoreTemp

So which version of Prime95 did you run?

CT :sol:
 

txbdan

Right, I know the Celeron HSF is undersized. I was looking for another data point to hopefully nail down that the water cooler was dead.

The stress test I ran was ROG RealBench here:
https://rog.asus.com/articles/news/realbench-v2-43-new-version-available-now/

I put it in Stress Test mode.

What stress tool do you recommend? Just for checking system health.


Thanks again for all the info.
 

Karadjgne

Titan
Ambassador
With that cooler, I don't recommend ANY stress testing, for health or otherwise. The last thing you actually want to do is push that cpu TDP to its limits. Or even close.

It's the difference between standing on a skateboard to check balance, and standing on that same skateboard as it's hurtling down a massive hill and you are wearing nothing but shorts.

RMA the H100i, and do nothing stressful on the pc until the cooler is replaced.
 

CompuTronix

RealBench is too much workload for the cooler.

Any workload you run on that 65 Watt cooler will need to be light to only moderate, at most. If you look at the scale above, you can see that AIDA64's individual CPU test is only about 70% workload, which may have a Power consumption of around 64 Watts, so I would expect Core temperatures in the 90's.

You can also try CPU-Z ... click on the "Bench" tab then run "Stress CPU" which is about 80% workload. I would expect Power consumption to be about 73 Watts, which will still overwhelm the cooler. If you disable Hyperthreading, Power consumption will decrease.

As I mentioned, instead of using Real Temp, you need to run "Core Temp" to monitor "Power" consumption (Watts) to observe thermal behavior on light to moderate workloads below the cooler's 65 Watt cooling capacity.

• Core Temp - http://www.alcpu.com/CoreTemp

The key here is to closely monitor Power consumption (Watts) relative to workload and Core temperatures. If thermal behavior appears normal such as 70's to 80's under a moderate workload of about 50 Watts, then you will have proven the AIO failure.
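
To make the arithmetic behind those Watt figures explicit, here is a small sketch. It assumes Power scales linearly with the "% of TDP" workload figures quoted in this thread, which is only an approximation; the dictionary and function names are made up for illustration, and Core Temp's measured Watts should take priority over these estimates.

```python
CPU_TDP_W = 91.0          # i7-6700K TDP, as quoted in this thread
COOLER_CAPACITY_W = 65.0  # rating of the Celeron stock cooler, as quoted in this thread

# Approximate workload as % of TDP, taken from the posts above
WORKLOADS = {
    "AIDA64 CPU": 70,
    "CPU-Z Stress CPU": 80,
    "Prime95 v26.6 Small FFT": 100,
    "Prime95 with AVX": 130,
}


def estimated_watts(percent_of_tdp, tdp_w=CPU_TDP_W):
    """Estimate Power in Watts, assuming it scales linearly with % of TDP."""
    return tdp_w * percent_of_tdp / 100.0


def within_cooler(percent_of_tdp, capacity_w=COOLER_CAPACITY_W):
    """True if the estimated Power does not exceed the cooler's rating."""
    return estimated_watts(percent_of_tdp) <= capacity_w


if __name__ == "__main__":
    for name, pct in WORKLOADS.items():
        verdict = "within" if within_cooler(pct) else "over"
        print(f"{name}: ~{estimated_watts(pct):.1f} W ({verdict} {COOLER_CAPACITY_W:.0f} W)")
```

Under that assumption AIDA64 lands at about 63.7 W, right at the edge of the 65 Watt cooler, while CPU-Z at about 72.8 W and both Prime95 cases exceed it.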
 

txbdan

Hi all,

Today I went and bought a Corsair H100i Pro water cooler and installed it using Noctua NT-H1 paste. It seems to be working better. When I first turned it on the water gurgled a bit, but it stopped.

I'm using HWiNFO64 to monitor temperatures.

Idle is around 27-32C across the cores. Little things like even opening a tab and website will bump them 5-10C for a second.


CPU-Z Bench "Stress CPU", after running several minutes, settles at a steady state of around 64-77C across the four cores. Cores 0/1 are always hotter than 2/3, it seems, by around 10C.

According to the max/peak temps, 76,80,66,67 across the cores.


Then I ran ROG RealBench 2.56. It also went better than before. After several minutes temps settled out around 81,83,72,70 across the cores, with peaks of 83,88,72,73C.
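
For a quick way to compare those runs, here is a minimal sketch that reports the hottest core, the core-to-core spread, and the headroom below the 100°C throttle temperature mentioned earlier in the thread. The function and variable names are made up for illustration; the numbers are the ones posted above.

```python
THROTTLE_C = 100  # throttle temperature quoted earlier in this thread

# Per-core temperatures in °C, as posted above
READINGS = {
    "CPU-Z Stress CPU (peak)": [76, 80, 66, 67],
    "RealBench 2.56 (settled)": [81, 83, 72, 70],
    "RealBench 2.56 (peak)": [83, 88, 72, 73],
}


def summarize(temps_c, throttle_c=THROTTLE_C):
    """Return hottest core, core-to-core spread, and headroom to throttle."""
    hottest = max(temps_c)
    return {
        "hottest": hottest,
        "spread": hottest - min(temps_c),
        "headroom": throttle_c - hottest,
    }


if __name__ == "__main__":
    for run, temps in READINGS.items():
        print(run, summarize(temps))
```

On the RealBench peaks this gives a hottest core of 88C, a 16C spread between cores, and 12C of headroom before throttling.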

That seems a bit more reasonable. Does it seem a little warm for a water cooled setup, though? The back plate was loose around the motherboard with the standoffs tightened. I.e., the motherboard was too thin for the standoffs to clamp on it. I've read of people saying it doesn't matter, and some that say it does and add plastic/rubber washers. It should matter: since the standoffs tighten all the way to the bracket and the thumbscrews tighten all the way to the standoff body, the distance between the front of the back bracket and the back of the thumbscrew is fixed. The thickness of the motherboard then positions the CPU within that fixed distance, so less pressure is applied to the CPU if the mobo is thinner. Thoughts?

Thanks again!
 

Karadjgne

Cooling is cooling. It doesn't matter if it's an AIO or an air cooler, they both do the same thing in the same way: remove heat energy from the cpu and dissipate it via a fan blowing air through an array of fins. The only real difference is the way that heat is transmitted between the cooler block and the fins. With an air cooler, it's heatpipes filled with a non-moving liquid attached directly to the cpu; with AIOs it's heatpipes filled with a moving liquid attached elsewhere. This is significant only in that liquids have a far greater ability to absorb heat energy than metals, and the dissipation surface area of the radiator is far greater than that of the largest air cooler. This doesn't mean AIOs are better and will get better temps (often an air cooler has better temps, in fact); what it means is that air cooling is limited. The largest air coolers can handle just over 250w of heat energy, while the largest AIOs are closer to 400w. This affects only the top end, the maximum ability of the coolers. For anything less than about 250w output, it's a guessing game dependent on design, fans, fan speeds, cfm, sp, ambient temps, case temps etc as to exactly what the end result will be.

I7-3770K @4.9GHz, 8 threads, Prime95 v26.6 small fft, Nzxt Kraken X61, fans 900rpm = 70°C average.
I5-3570K @4.3GHz, 4 threads, Prime95 v26.6 small fft, Corsair H55, Noctua NF-F12 900rpm = 70°C average. In performance, the H55 is comparable to a CM Hyper212 and the X61 is comparable to a NH-D15. The difference is only that the X61 could handle a larger OC and more heat output without moving up much in temps, whereas a NH-D15 would be at its limits and temps would shoot up drastically with a larger OC. If I dropped the OC, they'd still have roughly the same temp.

So to me, no, your temps are still high, but my ambient is 23°C, case temps rarely ever go beyond 36°C and I have good airflow with 4x 140mm fans. Your situation is probably quite different, so your results will be different. If airflow in your case is stunted, ambient temps are higher, gpu heat isn't being removed, case temps are higher, or the aio rad is placed on exhaust, any of these can reduce the radiator's ability to dissipate heat, so cpu temps are higher. Also my i7 is at 1.32v; if your vcore is higher, cpu temps are higher. So there are multiple factors that can make you see higher overall temps. That makes what's normal or good for you different from what's normal or good for me, or anyone else.

Either way, with those temps under stress limits, you are still well within cpu performance tolerance limits, so effectively exact temps don't matter to the cpu, only to you. Question is are you able to live with them?
 

txbdan

Yeh. I wish I had paid more attention to what they were before.

CPU is at stock settings right now. I'm running a small ITX case (http://www.phanteks.com/Enthoo-Evolv-ITX.html). It has a 200mm front intake fan, a 140mm rear exhaust fan, and I just set the two 140mm CPU cooler fans to exhaust out the top. (There is a metal grill vent at the top rear and smaller mesh vents along the top edges.) So, I am pulling warm case air through the radiator, but I like the idea of exhausting the heat upward and directly. Debatable, I guess. The computer was previously set up to intake air from the top through the CPU cooler radiator. (I didn't build this PC originally, which is why I don't know much about it.)

Also in there is a GTX 1070 Ti that's overclocked. It's a tight package, but that also seems like a good bit of fan to me?