Unstable / Random Shutdowns and Lockups - Thermal Paste Fail?

coachjc

Nov 29, 2014
Would this cause instability issues?
(i7-5930 and a Cooler Master on a Rampage V Extreme)

(I've not reseated this for at least 12 months)

My Windows 7 will randomly lock up and behave strangely... And recently it started to continuously cycle up and down without ever even getting to POST.

Here are pics of the thermal paste:

https://drive.google.com/file/d/0BzcWGbndCRC4d2ozc1dmc3VjV0E/view?usp=drivesdk

https://drive.google.com/file/d/0BzcWGbndCRC4UVZSOWVJRDhWcEU/view?usp=drivesdk

Thank you for your help!

 
I have honestly forgotten... But I believe I went for a very minimal overclock.
I know with 100% certainty I stayed far away from the "extreme" range as I wanted to be careful with my hardware and it was already running pretty fast.
 
With that cooler, I'd use a credit card to work a small amount of paste into the base first, scraping it well so the paste gets down into the tiny gaps created by the heatpipes. You only need enough to fill the gaps; then scrape the surface clean sideways and apply paste as usual. Other than that, it looks like a good application and spread.

Monitor your temps, but I'm guessing that's not the issue: your cpu would need to be hitting around 100°C, and you'd see some serious throttling before the pc shut down. Instability under duress is quite often associated with one of two things: either a lack of vcore, due to tight voltage control for temps under OC, or the psu exceeding the LLC settings and giving you some terrible ripple. Returning to factory default bios settings is quite often a band-aid cure, as Intel is well known for running high stock voltage settings; just watch the temps, as that's a mediocre cooler on a high-wattage cpu. A mild OC done right can actually show lower temps, as user-set voltages will be lower than stock, but that Hyper 212 is still a 140w cooler on a 140w cpu.

Using a good paste also helps. Avoid AS5; it does dry out and needs a repaste after about 200 heat cycles. MX4, Noctua, Phanteks, Gelid, etc. will last longer than the pc and never need a repaste with a correct application.
 
Here's what I've done since I first posted:
- Cleaned & reseated the CPU.
- Reseated the CPU cooler, RAM, and GPU.
- Reseated all power cables.
- Turned off all OC and set everything to just boot normally. (The overclocking was on the lower end)

Here's what's been happening:
After letting it sit for 30 minutes to 2 hours... it'll boot up again... only to shut off again after roughly 5-20 minutes. I have one exception after an experiment last night... will share below.

The problem seems to be due to "heat"... or something heating up...
(and turning all overclocking off didn't impact its desire to shut down on its own)

So last night, I decided to boot it up... but just leave it at the Windows login screen. It stayed stable... and 8 hours later... is still up.
(and I'm on it right now... will test with some basic computer use... and then try out a game (which I rarely do now-a-days) to push the hardware.)

Here is some background on it... hopefully you guys can continue to help me debug this.

So this exact issue happened about a year ago.
(shutting down, then turning on for 1 second, then instantly back off... and I'm 99% positive I wrote a post on here to ask for your help then too)
:)

I recall it needing a "warm up" period last year... where over several attempts... it just finally started working again. (and all I did last year was the same as this time: reseated the CPU, etc.)

FWIW - I first built this machine 2 years ago... it's my first ever "full build". It's had random lockups and random behavior over the 2 years... more so than a normal Windows build.

This is my work machine and I'm desperate to get a quick fix... even if that means putting money into it immediately. (I was about to take it into - gulp - Geek Squad - to have them fully test more.)

It absolutely seems to be related to "heat"... especially build up over time.

Here are more specs so you can recommend any new hardware I may need:
- CPU: i7-5930k
- CPU Cooler: Cooler Master Hyper 212 EVO
- Motherboard: Rampage V Extreme
- RAM: 64GB Ripjaws DDR4 2166
- GPU: Single GTX 1080
- Power: Corsair AX1200i
- HD: 6 HD's, 5 are SSD's and one 4TB backup (video editing machine)
- Case: Corsair Graphite Series 780T
- Fans: 5 140mm case fans

Whatever advice you have... I am all ears and GREATLY appreciate all of your expert help!
(I'm fine with putting $$ into this build... need it back up, and hopefully can find the stability issues)
 
This is just idle? And it only seems to happen coming back from long periods of idle? If that's the case, it doesn't seem to be a heat issue in my opinion. Try setting your computer to High Performance in Windows:
https://www.howtogeek.com/240840/should-you-use-the-balanced-power-saver-or-high-performance-power-plan-on-windows/
Have you monitored the temperature to determine there is a heat issue?
The next thing I would check is the power supply. Yes, you have more than enough power to run three of those identical systems, but the PSU may be faulty. If you can, unplug all of those extra hard drives and run off just the boot drive to see if the problem persists (this may not be possible if you have RAID). Even better, get your hands on an extra HD and install a fresh copy of Windows on it, just to rule out any software issues. If the issues persist on that drive with only that drive installed, replace the power supply and see if the problem persists. If it does, I'd be inclined to suspect your motherboard is the issue. Is its BIOS up to date?
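The order of elimination above can be written down as a tiny decision helper. This is only a sketch of the reasoning in this post, not a diagnostic tool; the function and argument names are made up for illustration, and `None` means "not tested yet".

```python
def next_step(fails_with_boot_drive_only=None, fails_with_other_psu=None):
    """Return the next thing to try, following the elimination order above.

    Both arguments are hypothetical observations:
    True  = the shutdowns still happen in that configuration
    False = the machine stays up in that configuration
    None  = that configuration has not been tested yet
    """
    if fails_with_boot_drive_only is None:
        return "run with only the boot drive (ideally a fresh Windows install)"
    if not fails_with_boot_drive_only:
        return "suspect the extra drives or the existing Windows install"
    if fails_with_other_psu is None:
        return "swap in a different power supply"
    if not fails_with_other_psu:
        return "suspect the original power supply"
    return "suspect the motherboard"


# Example: still failing on a single clean drive, no PSU swap tried yet.
print(next_step(fails_with_boot_drive_only=True))
```

Each test only rules something out, so it's worth repeating a result a couple of times before buying parts.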
 


When it shuts down and starts cycling up/down...
It will only boot up after I flip the main power switch off and let it sit there for 30-120 minutes.
(and if I try to boot it up too early... it just cycles up/down again)

The "cycles" I'm referring to are that it will turn on for ~1 second, then immediately fully power off... then it'll retry (on its own) to power back up... only to last ~1 second and go right back down.

I've had my Windows set to high performance since the beginning.

I've monitored the CPU temp gauge on the outside of my machine... which right now is sitting at 36-38C.
It seems to be at higher temps (games, etc.) when it was shutting down.

But, here's what's very strange and leads me to think it's related to thermal paste...

My machine never did this behavior the first year. (although it did have random lockups... but it would never continuously cycle up/down. If it did lock up... I'd use the reset switch on the Rampage V and it would come back up fine.)

Then, about a year ago... this exact same "cycle up/down" happened... I freaked out because I couldn't keep my machine up for more than minutes at a time.

I did the CPU re-setting, etc... and then over this past year... it's been the same as the first year. (random lockups but never to where it couldn't cycle up)

and 2 days ago... it started doing that again.
(and up until yesterday... I've had slight overclocking turned on from the very start... but I know my board & hardware should easily be able to handle what I was throwing at it... although if it's more stable now with all of it fully off... I can probably live with that as so far I haven't noticed any differences aside from a longer loading Windows logo upon bootup.)

So far... my machine has now been up since last night. (For 8 hours it was at the Windows login screen... and now for ~2 hours it's been in Windows... with just 2-3 Chrome windows open and minimal load... I'm going to go try out a game right now and will report back soon. If it does lock up, I will also unplug the drives and test that out a bit.)

Is there a better way I should be measuring temps? (because I know eyeballing it and quickly reading it in the 1 second the machine is trying to re-boot isn't the most scientific method.)
(and there are also 3 little "temp gauges" that came with my motherboard... they're not installed and I would be happy to get them into place if that will help debug all of this... although I don't know where they send data to yet.)

PS SgtScream - I want to especially thank you because you were the same guy helping me with this issue a year ago, and I believe also helping me design my build 2 years ago. THANK YOU!

PPS - I don't think the BIOS is up to date... I may have messed with it a year ago... but not sure.
 
Yeah, no problem man. Run MSI Afterburner in the background; you can configure it to show the cpu/gpu load and temps while you're gaming. In my humble opinion, a Hyper 212 EVO wasn't designed (even though it fits) to dissipate heat effectively from a CPU that size. You can check whether heat is an issue with Intel Extreme Tuning Utility:
https://downloadcenter.intel.com/download/24075/Intel-Extreme-Tuning-Utility-Intel-XTU-
This will allow you to monitor the temps while you place the cpu under 100% load.
If the issue happens while gaming and your temperatures seem in check (under 100°C), then the issue is most likely an unstable overclock. If your settings are 100% confirmed to be at defaults and the issue persists, I'd be inclined to say there's a problem with either the power supply or the motherboard. Keeping your motherboard on the latest BIOS version is sometimes important, especially with newer hardware, because manufacturers are often still working out issues from releasing their hardware earlier than they should have.
 
As I said, that Hyper 212 is roughly a 140w cooler on a 140w cpu: barely enough. I'd not be surprised to see temps in the 70+ range during normal usage, and it probably sounds like a freight train under heavy loads. If it doesn't, then the fan really isn't doing its job right, and the fan curves should be set in Asus Fan Xpert to a higher duty cycle at a lower temp. The OC probably actually helped: most times, even though you're running slightly faster speeds, that's offset by a significant vcore drop, which drops overall wattage output and lowers temps. Considering the usage, I'd have opted for a larger cooler; the Noctua NH-D15 2011 comes to mind.

Use CoreTemp or RealTemp to monitor cpu temps; for Intel cpus they are about as accurate as it gets, and they're simple to use.
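If you log temps to a file while you work, a few lines of Python can tell you how hot things got before a shutdown. This is a rough sketch only: the column names `time` and `temp_c` are assumptions for illustration (not the actual export format of CoreTemp or RealTemp), the sample readings are made up, and the 100°C limit is just the throttling figure mentioned earlier in the thread.

```python
import csv
import io


def summarize_temps(csv_text, limit_c=100.0):
    """Summarize a temperature log with hypothetical 'time' and 'temp_c' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    temps = [float(row["temp_c"]) for row in reader]
    if not temps:
        raise ValueError("log contains no samples")
    return {
        "samples": len(temps),
        "max_c": max(temps),
        "mean_c": sum(temps) / len(temps),
        # Number of readings at or above the limit.
        "over_limit": sum(1 for t in temps if t >= limit_c),
    }


# Made-up sample readings, for illustration only.
sample_log = """time,temp_c
12:00:00,38
12:05:00,47
12:10:00,50
12:15:00,49
"""

summary = summarize_temps(sample_log)
print(summary)
```

If the maximum stays far below the limit right up to the shutdown, heat at the cpu is an unlikely culprit.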

And as Sgt said, do NOT under any circumstances attempt a bios upgrade at this time. You need to get the power situation stable first and foremost. The only things I would make sure are current are audio and Lan drivers.

I'd also inspect the way power is being supplied to the psu. With a heavy, constant 500w draw from the wall, the outlet is going to see some heat, as will any power bricks, etc. It might be that the pc isn't even the issue, but rather wear on the house circuit, which gets a tweak whenever you unplug to service the machine.
 
I did get to run that game, it went for about 20 minutes... then shutdown and started cycling up/down.
The temps were about 48-50C when it happened. (On the Rampage V readout on the case). For the past year... my temps have been in the 45-60C range. (The higher range when rendering video)

So I left it off for about an hour... and turned it back on and logged into Windows. (I was going to let it sit in Windows to "burn in" more... because that's what worked a year ago)

That's where I screwed up...

because I left the machine on, unattended...
And when I went back to it... now it will turn on for a fraction of a second, then just turns back off. (And no longer does the "cycling" behavior)

It now won't come up at all... even when sitting for many hours.

My guess is that it shut down when I was away from it... did that "cycle up/down" behavior nonstop for an hour or more... and something fried/burned out/died.

There is one difference now though... when I turn it on... it's just the fans (& fan lights) that come on for a split second and then stop...
the lights on the motherboard and the front panel display readout actually stay on now. (which didn't used to happen... when it was "cycling"... the front panel readout would only show up for a second.)

So I'm guessing it would be a dead PSU or CPU... but I don't know what to test next.
(And being this is the 4th day of no machine... I may need to take it into a computer shop for a quicker diagnosis... but as I know that's basically "selling out" as a self builder... I'm open to next steps...)

PS - I don't think it's the wall power as this all runs through a CyberPower 1500va and all looks to be perfect on it.
 
Doubtful it's the cpu. Barring any user abuse, cpus are notoriously hard to kill. There's just too many more fragile components on the motherboard that get in the way first. I'd suspect that it's the mobo that's toast, the only way to figure out whether it's mobo or psu is to swap one out temporarily or take it to a shop who can bench test with their own supply. I'd check the psu first, any will do for test purposes, you'll only be running windows so full on gaming/production power isn't needed.
 
Solution


Thank you Karadjgne!
I don't have any other modular type of power supply around that will fit this build...
but I was able to do some more testing.

I disconnected everything from the PSU, did the paperclip in the "green/black" wire... and then added in one of the 8 pin connectors into the PSU...

When I turned it on... not much happened... but then when I held down the self test switch on the PSU... the PSU ran and the fan on the CPU also ran.

I additionally unplugged everything from the PSU, held down the self test switch, and the PSU fan runs and the self test light turns green.

Does this mostly narrow it down to the motherboard? (I'm thinking of heading down to a "Micro Center" and having them check it out as well... as long as they'll do it on the spot while I wait.)

or should I just go buy a modular PSU locally and test that out real quick to verify it's not the PSU?
 
Don't rely on modularity: to reuse modular cables you'd have to use the same psu, since the pin placements and connectors on the psu side are basically proprietary to each psu. Any borrowed psu will work for the test, as long as you use its own cables. It doesn't have to be permanently installed; you just need the plugs, and you can sit it on the frame.

But if your psu is giving a green light, that's a rough call, since it's green with no real load. It's looking more and more like a motherboard issue, though, which can happen to even the best, most reliable boards. When you crunch the numbers, if there's a solid million of those boards in circulation, even with a 99.9% reliability factor, that's still 1,000 that went belly up for whatever reason. So it does happen, unfortunately.
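For what it's worth, the arithmetic is just units in circulation times the failure rate. A quick sketch (the function name is made up for illustration, and the million-board figure is the hypothetical one from this post, not real sales data):

```python
def expected_failures(units, reliability):
    """Expected number of failed units, given a reliability fraction between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    # Round to the nearest whole board to absorb floating-point noise.
    return round(units * (1.0 - reliability))


print(expected_failures(1_000_000, 0.999))  # 1000
print(expected_failures(1_000_000, 0.99))   # 10000
```

So 99.9% reliability on a million boards leaves about a thousand dead ones, and it takes a drop to 99% to reach ten thousand.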
 
I took it into the local shop... they were awesome with testing it right there with me.

We first tried a new PSU... and it booted up and started doing a Windows chkdsk...

So then I asked them to try it with my PSU... and it booted up just fine...
So they started thinking it was the PSU cable or even my UPS or wall outlet.

When I brought it home and just tried it with the default PSU cable and everything as it's always been... it booted fine. (just like in the store)

In the middle of the chkdsk... or shortly after... it shut down on its own.

It's now back to the power cycling behavior.

They gave me a new PSU cable and I've tried it... as well as trying all combinations of my own PSU cable and their new one in my UPS, directly to the outlet, and also running through a completely separate circuit in my house... none of which has changed the behavior. (it's still cycling up/down when I flip the PSU switch, then power it on)

I did a video showing all of the above, and showing the cycling... I'll post it here once it's rendered in YouTube.
https://youtu.be/JmEhnIiUzLA

So now should I try a brand new PSU before resorting to trying to replace the more intensive motherboard?
(mobo and PSU are both under warranty... although I'm losing massive work productivity and need this up asap... I'm likely ordering a new PSU or whatever the shortest path to getting this back up & running 100%.)

or maybe I need to think about checking cables, etc.? (all of my cables came with the hardware... so do I have to go back to the manufacturer's sites to order the correct ones? Which cable(s) could be the culprit? lol!)

This is my first ever build... and I was super detailed putting everything together... I'm disappointed I've only got 2 years out of this build so far (1 year since this first happened... but re-seating the CPU had this problem go away after 1-2 days last year)

but I'm hopeful to get stuff fixed quickly and be back up & crankin' soon. (with your amazing help... or even my local computer shop which I may bring it into here in a bit because I've now lost about 6 days of productivity in my business.)

Thanks for any help you can provide!

PS - So what is it... that after sitting powered down for hours at a time... or car rides... that make it "work again"...
and what is happening to make it cycle power up/down? (Something is "tripped" or heated up too high?)

and what's weird is that yesterday at this time... not even sitting there powered off for 10 hours got it to turn back on... it was mostly "dead" aside from a few mobo lights...

but after the 15 minute car ride to the shop yesterday... it then was working again... but not for more than ~10-15 minutes...
 
If you can get into the bios, check the temperature. It should be on the main bios page.
Next, try each of your RAM sticks individually in the leftmost slot. If the computer fails to boot with every stick tested individually in the first DIMM slot, repeat the process in the next DIMM slot to the right, and so on. Going through this process will help you determine whether one of the memory modules or DIMM slots is defective.
Try a different power supply. You don't have to take the old one out of the case. Rest the computer on its side, then unplug all the old psu cables. Take the test power supply and set it on a secure part of the case so it cannot fall onto your internal components. Then plug the cables of the test power supply into the internal components while leaving the old power supply in its original installed position. If the issue replicates, you won't have to reinstall your old power supply. All you'll have to do is plug its connections back in.
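The stick-by-stick RAM test above is easy to lose track of, so here is a small sketch that lists the combinations in the order described (every stick alone in the first slot, then the next slot over) and flags a stick or slot that never booted. The stick labels, slot numbers, and results below are placeholders, not readings from this machine.

```python
from itertools import product


def ram_test_plan(sticks, slots):
    """List (slot, stick) pairs: all sticks in the first slot, then the next slot, etc."""
    return list(product(slots, sticks))


def suspects(results):
    """results maps (slot, stick) -> True if the machine booted with that combination.

    Returns (sticks that never booted, slots that never booted). If everything
    is flagged, the problem is probably not the RAM at all.
    """
    sticks = {stick for _, stick in results}
    slots = {slot for slot, _ in results}
    bad_sticks = sorted(s for s in sticks
                        if not any(ok for (_, st), ok in results.items() if st == s))
    bad_slots = sorted(sl for sl in slots
                       if not any(ok for (slt, _), ok in results.items() if slt == sl))
    return bad_sticks, bad_slots


plan = ram_test_plan(["A", "B"], [1, 2])
print(plan)

# Placeholder results: stick B never boots, stick A boots in both slots.
example = {(1, "A"): True, (1, "B"): False, (2, "A"): True, (2, "B"): False}
print(suspects(example))
```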
 


That "sounds" like overheating.
Heats up during the initial checkdisk, and then stays too hot until later when it's been unused/unplugged for a while.
Those original pics show almost no thermal paste.
And the cooler is a bit inadequate.

PCPartPicker part list / Price breakdown by merchant

CPU Cooler: CRYORIG - H5 Ultimate 76.0 CFM CPU Cooler ($46.89 @ OutletPC)
Thermal Compound: Thermal Grizzly - Kryonaut 1g 1g Thermal Paste ($11.99 @ Amazon)
Total: $58.88
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2017-09-06 17:04 EDT-0400
 


HUGE progress!

I started doing the above... and then realized I couldn't complete the whole RAM process because of my CPU fan. (my soon to be "old" EVO 212... I have a Noctua nh-d15 sitting here... but won't be installing it until all is back to normal... which is very soon due to the next paragraph...)

So I finally found a local store (Best Buy) that had a mediocre PSU that I could pick up immediately. (Corsair TX850M... it's the best one any store within 45 minutes of my house had in stock)

I bring that 850M home... get it all connected (minus one peripheral/sata cable because there's not enough 6 pin slots)... and my machine has now been up for the past 12 hours!

I'm going in now and will push the hardware a bit more with a game. (although not too hard as all OC is turned off, my airflow is down from normal from the extra PSU in place, and I don't trust it as it's not the same standards of my AX1200i)

So I want to thank ALL of you for all of the help... if this game test works and it stays solid for today... I'll be RMA'ing my AX1200i as it's definitely been narrowed down to be the culprit.

I'll also share that while this was the most frustrating computer experience of my life (aside from when I was 12 years old, staring at a C prompt on our new computer, and wondering how I get into program basic like my "old" machines from the early 80's)...

this was HUGELY SATISFYING to now not only have built my own full build a few years ago...

but also to now have troubleshot this similar issue two separate times, have reseated the CPU 3 times now... and learned even more troubleshooting protocol...
I may even be ready for water cooling on my next build!
(although I'm in no rush for a build right now... pumped up that my "baby" is back up & running!)

THANK you to all of you for your help!
(and I'll come back in to finalize this thread and let you know how the RMA process goes...)
😉
 


You bet Sgt... I'm an open book.

I run a few businesses... The main reason I have these higher end components & 6 hard drives is for video editing & rendering. (Adobe Premiere Pro wants a drive for the program, source file, output file, and cache all separate. The others are to separate data/programs, and a big non SSD back up drive)

I also wanted to be able to play an occasional game... but I love what I do and don't play much. (lately maybe once every month or two)

My machine is overkill in some areas. (I originally planned to put up to 4 cards in SLI... but there are no games I play that even need more than my 1080GTX.)

My main biz/incomes are from internet marketing (both affiliate, as well as my own products), and helping others learn how to leverage the internet to create their own income. (One of the recent newcomers in my community did $850K in the past 12 months... with an ad budget well under $50K... and he had zero previous successes. He's not even technical... just a very hard worker and follows directions when needed... paves a path when needed.)
On the side I do speaking/coaching/consulting. (all marketing/coaching/entrepreneurship related...)

On a given day... my machine typically does:
- Video editing & rendering
- Has windows open in at least 2/3 monitors. (Lots of copy/pasting data between screens, checking stats on 1 screen and editing in another, etc.)
- Runs live webinars for hundreds+ attendees.

Most colleagues & friends run their biz from a laptop and travel all around... but I'm married with 3 kids and love my home and home office. (I travel 1-2x/month... but don't "live around the world" like many of them... so I really enjoy an uber-high productivity machine/environment.)
:)
 


LOL!

1) I don't play many shooters... haven't much since the original Quake.
(I'm a strategy game kinda guy... especially StarCraft. Used to like RPG's a lot... but they're such huge time sinks I haven't played them for 3-4 years.)

2) I haven't worked for a boss since 2008... far too many limitations on time, income, and energy.
😉

 
