i7-4790K Overheating Issues, Tried Everything?

crsaura

Honorable
May 17, 2012
7
0
10,510
Hello All,

My trusty 4790K has been giving me issues lately, or so I think. I started having PC problems and ended up with a corrupted OS. I ran a full panel of hardware tests (Memtest, chkdsk, antivirus, PSU test, mobo test, etc.). I work for an IT company, so I've got a pretty full suite of diagnostic and repair tools at my disposal. Everything came back clean except for CPU temps, which spiked all the way to 80C during the diags. I reinstalled Win10 and everything seemed stable again.

Time to figure out these crazy high temps. I ran the Prime95 Small FFTs test, and after 3 seconds I was already at Tj Max. Time to clean out the cooler and apply new thermal paste, I thought. I did that about 4 times, with the same result every time. Faulty AIO, I thought, so I bought a new one and installed it. Nothing changed: a pseudo-stable system, but the Prime95 Small FFTs test still reaches 100C within seconds. I reinstalled the cooler and reapplied paste 3-4 more times, again with the same results.

I'm at a loss here, guys. Vcore temps are normal and voltages from the PSU are stable. I really don't know what's going on and need some help figuring it out. Here are the system specs and test screenshots.

The processor has had a mild overclock since this rig was put together (4.0GHz to 4.5GHz). Temps never went above 65C after 8hrs of Prime95 Small FFTs for heat, or during 48hrs of Prime95 Blend for stability.

CPU - Intel i7-4790k
Cooler - Corsair H80i replaced by Deepcool's Captain 120EX
Core speed - 4.0 Current (4.5 in the past)
Core voltage at load - Vcore 1.072
Load test software - prime95
Temperature software - HWMonitor
Load & idle Core temperatures - Load 100C - Idle 35C
Memory - Corsair Vengeance DDR3
Motherboard - MSI M87 (MS-7815)
Graphics Card - XFX AMD R9 290
Ambient temperature
Edit: 74F/23C

https://imgur.com/a/fTJEVyj - taken during idle moments

In that screenshot, the Max numbers from HWMonitor are after 3 seconds of the Prime95 Small FFTs test. Thanks in advance for all the help.
 
CompuTronix

Intel Master
Moderator
crsaura,

There's a known issue when running Prime95 versions later than 26.6.

2nd through 8th Generation i3, i5 and i7 CPUs have AVX (Advanced Vector Extensions) Instruction Sets. Prime95 versions later than 26.6 run AVX code on the CPU's Floating Point Unit (FPU), which causes unrealistic temperatures up to 20°C higher due to excessively high TDP workloads. Other high TDP utilities have similar results.

Not all loads are created equal. Utilities that don't overload or underload your processor will give you a valid thermal baseline. Here’s a comparison of utilities grouped as thermal and stability tests according to approximate % of TDP, averaged across six processor Generations at stock settings:

All tests will show 100% CPU Utilization in Windows Task Manager, which seldom coincides with %TDP workload. Higher TDP tests produce higher Core temperatures. Although actual Power dissipation (Watts) varies with Core speed, Core voltage, cooling and workload, Prime95 v26.6 Small FFTs always provides a steady 100% workload, whether you’re running stock or overclocked.

AVX can be disabled in Prime95 versions later than 26.6 by inserting "CpuSupportsAVX=0" into the "local.txt" file in Prime95's folder. However, since Core temperatures will be the same as 26.6, it's easier to just use 26.6. AVX doesn't affect Core i 1st Generation, Core 2, Pentium or Celeron processors since they don't have AVX Instruction Sets.
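
For anyone who would rather script the "local.txt" change than edit the file by hand, here is a minimal Python sketch. The function name and the choice to put the setting on the first line (so it cannot end up under a section header further down the file) are my own assumptions, not anything prescribed by Prime95. Close Prime95 before running it so the file isn't rewritten on exit.

```python
from pathlib import Path


def disable_avx(prime95_dir):
    """Make sure local.txt in the given Prime95 folder contains CpuSupportsAVX=0.

    Hypothetical helper for illustration; returns the path to local.txt.
    """
    local_txt = Path(prime95_dir) / "local.txt"
    lines = local_txt.read_text().splitlines() if local_txt.exists() else []

    # Drop any existing CpuSupportsAVX entry so the setting appears only once.
    kept = [ln for ln in lines
            if not ln.strip().lower().startswith("cpusupportsavx=")]

    # Put the setting on the first line, ahead of any [section] headers.
    kept.insert(0, "CpuSupportsAVX=0")
    local_txt.write_text("\n".join(kept) + "\n")
    return local_txt
```

Usage would look like disable_avx(r"C:\Prime95"), where the folder path is only a placeholder for wherever Prime95 was unzipped.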

If you’re overclocked and run AVX apps such as for rendering or transcoding, you may need to reduce Vcore and Core speed or upgrade your cooler and case fans so Core temperatures don’t reach 85°C. Many 6th, 7th and 8th Generation motherboards address the AVX problem by providing offset adjustments in BIOS, however, motherboards for 4th Generation processors do not have AVX offset adjustments.

From Intel: Troubleshooting Intel® Core™ i7-4790K / i5-4690K overheating - https://communities.intel.com/docs/DOC-23517

"Overheating occurs very quickly on motherboard products with certain BIOS settings when running the Small FFT test that is part of the Prime95* application, versions (later than 26.6) ... Overheating is not as severe with earlier versions of Prime95 ... "

• Prime95 v26.6 - http://www.mersenneforum.org/showthread.php?t=15504

Asus RealBench runs a realistic AVX workload, and is an excellent utility for testing overall system stability, whether you're running stock or overclocked.

• Asus RealBench - http://rog.asus.com/rog-pro/realbench-v2-leaderboard/

Check out our Guide: Intel Temperature Guide - http://www.tomshardware.com/forum/id-1800828/intel-temperature-guide.html

CT :sol:
 

crsaura

Computronix,

I can definitely see what you're talking about with the misreporting. Unfortunately it only delayed my problem. I only made it to test #2 in Prime95 v26.6 before I shut it down. I'm pretty sure that if I had let it finish test 2 or maybe test 3 I would've reached Tj Max; there were no signs of temps stabilizing at any point during the test.

Here's a screenshot with min/max cleared in HWMonitor as soon as the test started.
https://imgur.com/a/TPFaZxz
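
One rough way to put a number on "no signs of temps stabilizing" is to log the hottest Core temperature at a fixed interval and check whether the last few samples have flattened out. The sketch below is only an illustration: the function name, the window of 5 samples, the 2°C tolerance and both sample series are made-up assumptions, not values taken from the screenshots.

```python
def has_plateaued(temps_c, window=5, tolerance_c=2):
    """Return True if the last `window` samples stay within `tolerance_c` of each other."""
    if len(temps_c) < window:
        return False  # not enough data to judge yet
    recent = temps_c[-window:]
    return max(recent) - min(recent) <= tolerance_c


# Made-up example series (°C), one sample per logging interval.
climbing = [45, 58, 66, 72, 77, 81, 85, 88]     # still rising at the end
settled = [45, 58, 64, 67, 68, 68, 69, 68, 69]  # levels off near 68-69

print(has_plateaued(climbing))  # False
print(has_plateaued(settled))   # True
```

A series that never plateaus before Throttle temperature points at heat not leaving the Die or the cooler, rather than at a test that is simply running hot.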

Edit: I just saw the bit you added about Asus RealBench. I'm going to give that a try and report back.
 

CompuTronix

If your new replacement cooler also has a faulty pump out of the box, which happens all too often, then it would certainly raise more troubleshooting questions than it would otherwise answer. I suggest that if you still have your stock cooler, or a better aftermarket air cooler, that you test your rig at stock settings on air, just to rule out variables.

CT :sol:
 

crsaura

RealBench won't even go 2 seconds into the Stress Test. It shows instability detected and shuts down :( I make it to about 4.08%.

https://imgur.com/a/ZUBhTuI

Edit: Saw your reply a little too late. Knowing my luck, that would be the case. I'm gonna get a replacement and try again. Otherwise, I've never experienced this, but could an on-its-way-to-die processor be causing this? I've certainly given it hell with photo editing, video rendering and gaming over the last few years.
 

CompuTronix

That's certainly worth mentioning. As you were formerly running a mild overclock at 4.5GHz with low Vcore and temperatures, although possible, it's not likely to be a degradation problem.

Instability due to degradation is typically caused by Vcore above 1.300V (which is high for 22nm processors), and is accompanied by sustained Core temperatures near Throttle temperature over extended periods of time. It begins with an occasional BSOD which becomes increasingly frequent relative to uptime and sustained heavy loads. Eventually the only way to temporarily halt the instabilities is downclocking, rather than increasing Vcore, which only accelerates the process.

From your screenshots, although all Cores appear to be running at their base clock of 4.0GHz, what I find suspicious is that your Core temperatures are out of spec concerning deviation between Cores, which typically shouldn't exceed 10°C from hottest to coolest. Intel's spec for DTS sensor accuracy is +/- 5°C, which accounts for that 10°C allowance, but since your deviation is 14°C, this suggests Intel's TIM between the Die and IHS is failing, which can adversely affect stability.
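
The deviation check described above is simple arithmetic, sketched here in Python. The function names are hypothetical, and the four readings are made-up values chosen only to reproduce a 14°C spread; they are not the actual numbers from the screenshots. The 10°C limit is the hottest-to-coolest allowance mentioned above.

```python
def core_spread(core_temps_c):
    """Difference between the hottest and coolest Core, in °C."""
    return max(core_temps_c) - min(core_temps_c)


def spread_is_suspicious(core_temps_c, limit_c=10):
    """True when the hottest-to-coolest deviation exceeds the allowance."""
    return core_spread(core_temps_c) > limit_c


# Made-up load readings for four Cores (°C), for illustration only.
example = [86, 100, 93, 97]

print(core_spread(example))           # 14
print(spread_is_suspicious(example))  # True
```

Compare readings taken at the same moment under a steady load; HWMonitor's Max column can mix peaks that occurred at different times.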

Assuming the TIM between the IHS and cooler is properly applied and the cooler is securely mounted, the fix is to delid and apply liquid metal TIM.

Have you ever observed these high deviations prior to the recent overheating behavior?

Regardless, before jumping to premature conclusions, I would first test it on an air cooler. If using Intel's stock cooler, make absolutely certain those problematic push-pins are inserted completely through the motherboard and securely latched, or you could throw yet another variable into the equation, and really end up chasing your tail.

CT :sol:
 

crsaura



Originally when I built the rig it was overclocked to 5.0GHz with a Vcore of 1.125ish. Temps would be around 68-70C during Prime95 testing for about 2 years running the H80i I mentioned in the OP.
I tested it for around 8hrs with Prime95 Small FFTs and 48hrs with Prime95 Blend, which is what I usually test my builds with. About a year ago I messed around with OC'ing the RAM and whatnot, and ended up not doing it, but I lowered the overclock to about 4.5GHz and set the Vcore back to stock because I knew it could handle it from the initial overclocking. I didn't stress test then, mainly because I had previously stress tested these settings during the build and everything was stable then.
I've had maybe 4-5 BSODs in the past year, spaced very far apart; after a reboot everything would be fine for quite some time. Didn't think much of it. They never escalated in frequency until recently, when my OS got completely corrupted after a couple of back-to-back BSODs. This was about 2 weeks ago. It's been fine ever since the fresh OS install: high temps during heavy load, but no BSODs.


Edit: I'm gonna give a replacement cooler a shot here shortly. However, if that doesn't help, welp, I'll probably just run this build until it dies and then I'll get my answer as to what was wrong all along lol
 

CompuTronix

I was editing while you were posting, so I updated your Quote.

5.0GHz is an excellent overclock for Haswell / Devil's Canyon processors, and 1.125'ish Vcore is marvelously low ... it seems you hit the silicon lottery, so let's hope the problem is being caused by a component other than your processor.

Corrupt software is commonly caused by a faulty memory module, or by the Integrated Memory Controller (IMC) in the Die, which unfortunately means a failing processor. PSU rails / power delivery can also cause instabilities, as well as the Fully Integrated Voltage Regulator (FIVR) on the package, which is unique to 4th Generation processors, unlike earlier or later Generations.

Looks like you have some serious troubleshooting ahead of you. Hopefully you have access to a stash of known-good used parts to help the process flow more smoothly.

CT :sol:
 
Solution

crsaura

I do have a couple of other coolers lying around; the problem is finding the brackets. However, I'm just gonna exchange the cooler, since I ordered it online less than a week ago. If that doesn't solve my problems, I will unfortunately just use my system as is until it completely fries or I break down and build another one... the dishwasher doesn't like to wash dishes once he is home, you know haha

Edit: thanks for all of your help btw, 10/10 on the response time and knowledge provided. That temp sticky is awesome, I've learned quite a bit tonight from multiple sources.