PC turns off

gatsurai

Had this issue in the past whereby I'd be on a game like Far Cry 4 or SCP: Secret Lab for a while & the computer would just up & restart.

I changed the thermal paste about 4 months ago & that fixed it but it happened again today after about 3 hours on game.

Specs are:
CASE:- NZXT H440
CPU:- Intel i7-6700k 4.00ghz OC to 4.40ghz
CPU vcore:- 1.320v
MOBO:- MSI Z170a Gaming M7
MEM:- 32 GIG Corsair DDR4
GPU:- Asus Strix GTX1070 @ 10% OC
AIO:- Deepcool Captain EX 240
M.2 SSD:- Samsung 951, 256 GIG
SSD:- Kingston Hyper X Savage 240 GIG
SSD:- Kingston Hyper X Savage 480 GIG
HD:- 1 TB Seagate
PSU:- Corsair RM850
FANS:- Corsair AF120
Windows 10 pro

Looked at event viewer & the errors are:

Windows logs - system
Error - Event ID 35 Task category (2)
Performance power management features on processor 0 in group 0 are disabled due to a firmware problem. Check with the computer manufacturer for updated firmware.

This is repeated, 8 in all going from 0 to 7.

Performance power management features on processor 7 in group 0 are disabled due to a firmware problem. Check with the computer manufacturer for updated firmware.

Accompanying each instance is also:

-===-
Information, Event ID - 55 Task category (47)
Processor 0 in group 0 exposes the following power management capabilities:

Idle state type: ACPI Idle (C) States (1 state(s))

Performance state type: None
Nominal Frequency (MHz): 4008
Maximum performance percentage: 100
Minimum performance percentage: 100
Minimum throttle percentage: 100

So again processor 0 - 7 (8 instances)

-===-
Error, event ID 6008
The previous system shutdown at 02:00:26 on 11/10/2018 was unexpected.

-===-
Critical Event ID 41
The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

If I had to guess what is happening (could be way off the mark): the cpu's active cooling function is kicking in, but the cpu gets so hot after a while that the system tries to drop the OC back to stock cpu speeds to lower the temp, & the moment that happens it crashes the system.

Any way to fix this without resorting to removing the oc?

Edit - Been using this pc since 08/2016 (was new).

If any other information is required let me know & I'll do my best to provide it.

Hopefully someone knows what to do, thanks.
 
Mobo bios is 6 revisions behind:


Version: 7976v1L
Release Date: 2018-07-30
File Size: 7.26 MB
Description:
- Improved USB compatibility.
- Update Intel Micro code for security vulnerabilities

Version: 7976v1K
Release Date: 2018-05-04
File Size: 7.28 MB
Description:
- Fix MSI Clutch GM60/GM70 mouse cursor jump in windows and BIOS when DPI set over 1000Mhz.
- Update Intel Micro code for security vulnerabilities

Version: 7976v1J
Release Date: 2017-12-19
File Size: 7.23 MB
Description:
- Improve NVME compatibility.
- Update micro code.
- Improve USB 2.0 device can not work properly after resume from S3 mode.
- Improve memory compatibility.
- Update Intel ME for security vulnerabilities

Version: 7976v1I
Release Date: 2017-05-08
File Size: 7.27 MB
Description:
- Improved NVME device compatibility.
- Improved memory compatibility problem.

Version: 1.H
Release Date: 2016-12-19
File Size: 7.25 MB
Description:
- Enhanced M.2 Genie function.
- Improved Board Explorer function.
- Improved NVME device compatibility.
- Updated VBIOS and GOP.
- Support multi graphic cards.

Version: 1.G
Release Date: 2016-10-18
File Size: 6.97 MB
Description:
- Supported next generation new Intel CPU.

Version (Using): 1.E
Release Date: 2016-07-22
File Size: 6.91 MB
Description:
- Improved PCIE compatibility.
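
For anyone wanting to double-check the "6 revisions behind" count, here is a rough Python sketch. The version names and release dates are copied from the list above; the tuple layout and the `newer_releases` helper are only illustrative, not anything MSI provides.

```python
from datetime import date

# Version names and release dates copied from the BIOS list above.
releases = [
    ("7976v1L", date(2018, 7, 30)),
    ("7976v1K", date(2018, 5, 4)),
    ("7976v1J", date(2017, 12, 19)),
    ("7976v1I", date(2017, 5, 8)),
    ("1.H", date(2016, 12, 19)),
    ("1.G", date(2016, 10, 18)),
    ("1.E", date(2016, 7, 22)),
]
installed = "1.E"  # the version marked "(Using)"


def newer_releases(releases, installed):
    """Return releases dated after the installed one, newest first."""
    installed_date = dict(releases)[installed]
    newer = [r for r in releases if r[1] > installed_date]
    return sorted(newer, key=lambda r: r[1], reverse=True)


newer = newer_releases(releases, installed)
latest = newer[0][0] if newer else installed
print(f"{len(newer)} newer releases; latest is {latest}")
```

Comparing by release date avoids having to make sense of the two different version naming schemes in the list.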

====================================

The installed Intel chipset driver is not the latest (unsure which version is running).

Memory stuff:
https://imgur.com/a/qMXjwPa
https://imgur.com/a/Kw9v5jT

XMP profile 2 I believe.

If updating the mobo firmware & cpu chipset drivers, can that be done without having to re-install the whole system? i.e. it will not affect anything on the software/hdd storage side? Does one have to be done before the other in a particular sequence?
 
Yes, it can be. You don't need to re-install anything to update the bios, and new chipset drivers can just be "installed".

I would update to the latest bios first. You do not need to install each version in turn; just update to the latest version, which contains all previous updates except those that didn't work out or were buggy and got removed later on. So, latest bios only.

Then, install the latest chipset drivers. These can be found on the product page for your motherboard as well and will be called Intel .inf files.

The bios update must be done outside Windows, from within the BIOS itself, so find a good tutorial on updating the bios on your motherboard model and watch it. Also, there are likely tutorials on updating the bios on the MSI website. Usually there are a ton of Youtube videos for any given motherboard model showing how to do this.

For chipset drivers, install only the latest version as shown on the product page for your motherboard on the support>downloads tab, or the latest 100 series motherboard chipset drivers as available from the Intel website. Actually, those found on the Intel website are usually more up to date if you can locate them.
 
https://www.msi.com/Motherboard/support/Z170A-GAMING-M7.html#down-driver&Win10%2064

I'll have a go at the bios & chipset updates tomorrow; bit late here. The bios settings will reset, correct? So I'd better write down all the settings. And it's bios first, then chipset?

I believe the memory is 2 kits, or maybe even 1, but I cannot confirm as this pc was built by someone else & purchased.
In cpu-z the listing for each of the 4 slots is identical so I'm inclined to think it is 1 set.

Ram clock speed is 2900mhz, I remember the guy said he could not get things stable at 3200 so he downclocked the ram to 2900.

Set might be this, well something in this range anyway - https://www.scan.co.uk/products/32gb-(2x16gb)-corsair-ddr4-vengeance-lpx-black-pc4-25600-(3200)-non-ecc-unbuffered-cas-16-18-18-36-x?gclid=EAIaIQobChMIr_vA_rD93QIVjpPtCh1CEwtrEAQYAiABEgK2cfD_BwE
 
That's very possible considering the use of four modules. Let me know when you are ready to bother with the memory. It's also very possible that bios updates will give the memory better compatibility, and if not, we might be able to get the modules to run at their XMP value anyhow with a slight increase in DRAM voltage, which is almost always required when using quad modules that are much higher frequency than the SPD default values.

No promises on when, I have jury duty this morning. Boo.
 


Hiya, ok, downloaded the latest bios version & Intel chipset driver. I have never done a bios update before so will obviously watch a video first, but there is one thing I am unsure of; opinions on it are mixed or the posts are nearly a decade old:

Do I need to reset my bios options to safe defaults or optimized defaults before I update the bios version? I read that doing a bios update with overclocks on your processor speeds & voltage can cause problems. Is there an easy way to save the settings somehow or do I need to just go through every section one at a time writing down what is currently in place/taking screenshots to then put them back after the bios update? Do I need to reset the bios settings once more after the update is complete?

Is there a good program to show cpu temps/usage on each core while running games? Then maybe I can see what the temps were just before it crashes again in the future (I have a feeling it might be due to 1 core being constantly at 100% for a while). Also, is there a utility I can use to find out the exact ram models I am using?

Cheers.

P.S - Never been on jury duty; hopefully it's not a mafia case eh.

Once I've done the updates we can look into the memory stuff if you're willing.
 
I've never bothered resetting to default settings before doing a bios update, but it couldn't hurt. My configurations have always already been validated as stable prior to moving on, so it hasn't been a problem. For your setup, that MIGHT not be true. So I'd probably reset to default, save settings, exit bios, go back into bios and THEN do the update.

After, you can reconfigure your bios settings and yes I would be more than willing to try and help get your memory configured afterwards. If you are overclocking, or plan to overclock the CPU though, you will want to completely validate that process BEFORE moving on to memory. Badly configured memory, or anything other than the default JEDEC SPD memory settings can seem like an unstable overclock, so do memory LAST, if you are going to overclock the CPU. OR the GPU for that matter.

HWinfo.

HWmonitor, Open hardware monitor, Realtemp, Speccy, Speedfan, Windows utilities, CPU-Z and most of the bundled motherboard utilities are often not the best choice as they are not always accurate. Some are actually grossly inaccurate, especially with certain chipsets or specific sensors that for whatever reason they tend to not like or work well with. I've found HWinfo or CoreTemp to be the MOST accurate with the broadest range of chipsets and sensors. They are also almost religiously kept up to date.

CoreTemp is great for just CPU thermals including core temps or distance to TJmax on AMD platforms.

HWinfo is great for pretty much EVERYTHING, including CPU thermals, core loads, core temps, package temps, GPU sensors, HDD and SSD sensors, motherboard chipset and VRM sensor, all of it. Always select the "Sensors only" option when running HWinfo.

In cases where it is relevant and you are seeking help, then in order to help you, it's often necessary to SEE what's going on, in the event one of us can pick something out that seems out of place, or other indicators that just can't be communicated via a text only post. In these cases, posting an image of the HWinfo sensors or something else can be extremely helpful. That may not be the case in YOUR thread, but if it is then the information at the following link will show you how to do that:

*How to post images in Tom's hardware forums



Run HWinfo and look at system voltages and other sensor readings.

Monitoring temperatures, core speeds, voltages, clock ratios and other reported sensor data can often help to pick out an issue right off the bat. HWinfo is a good way to get that data and in my experience tends to be more accurate than some of the other utilities available. CPU-Z, GPU-Z and Core Temp all have their uses but HWinfo tends to have it all laid out in a more convenient fashion so you can usually see what one sensor is reporting while looking at another instead of having to flip through various tabs that have specific groupings.

After installation, run the utility and when asked, choose "sensors only". The other window options have some use but in most cases everything you need will be located in the sensors window. If you're taking screenshots to post for troubleshooting, it will most likely require taking three screenshots and scrolling down the sensors window between screenshots in order to capture them all.

It is most helpful if you can take a series of HWinfo screenshots at idle, after a cold boot to the desktop. Open HWinfo and wait for all of the Windows startup processes to complete. Usually about four or five minutes should be plenty. Take screenshots of all the HWinfo sensors.

Next, run something demanding like Prime95 version 26.6 or Heaven benchmark. Take another set of screenshots while either of those is running so we can see what the hardware is doing while under a load.
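
If screenshots are awkward, sensor readings that have been logged to a CSV file can be summarised instead. The Python sketch below is only a rough illustration: the column names and sample values are made up, and a real log will have different headers and far more columns, so adjust the names to match your own file.

```python
import csv
import io


def summarize_log(csv_text, columns):
    """Return {column: (min, max)} for the requested numeric columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            try:
                values[name].append(float(row[name]))
            except (KeyError, TypeError, ValueError):
                continue  # skip missing, blank or non-numeric cells
    return {name: (min(v), max(v)) for name, v in values.items() if v}


# Made-up sample data with hypothetical column names.
sample = """Time,CPU Package [C],+12V [V]
00:00:01,45,12.096
00:00:02,61,12.000
00:00:03,58,12.048
"""

summary = summarize_log(sample, ["CPU Package [C]", "+12V [V]"])
for name, (low, high) in summary.items():
    print(f"{name}: min {low}, max {high}")
```

The min and max per sensor are usually the first things worth comparing between the idle run and the load run.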

*Download HWinfo


For temperature monitoring only, I feel Core Temp is the most accurate and also offers a quick visual reference for core speed, load and CPU voltage:

*Download Core Temp
 
Updated bios & the intel chipset thing.

I'll take the screenshots using hwinfo of a cold boot & running prime 95 later then upload them.

Whatever is causing these crashes under prolonged heavy load in certain games needs to go.

P.S - Never used prime95, are there any particular options I should enable/disable like in aida64 or is it pretty straightforward?
 
Small FFT option. Run for 15 minutes to establish thermal compliance. Anything under 80°C is technically thermally compliant. If it's not overclocked, you'd like to see it a bit lower than that especially with a good cooler like you have.
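
As a rough sketch of that check in Python: the 80°C limit is the figure quoted above, while the function name and the sample readings are hypothetical and only there for illustration.

```python
THERMAL_LIMIT_C = 80.0  # threshold quoted in the reply above


def is_thermally_compliant(samples_c, limit_c=THERMAL_LIMIT_C):
    """True when every recorded core temperature stays under the limit."""
    if not samples_c:
        raise ValueError("no temperature samples recorded")
    return max(samples_c) < limit_c


# Hypothetical peak core temperatures from two 15 minute Small FFT runs.
cool_run = [62.0, 71.5, 78.0]
hot_run = [75.0, 84.0, 90.0]

print(is_thermally_compliant(cool_run))
print(is_thermally_compliant(hot_run))
```

Only the hottest sample matters here, so a single spike over the limit fails the run.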

For stability testing, if you do overclock, which I think you do, then you want to run Realbench for 8 hours. Choose the stress test option, choose 8 hours, choose half your RAM, run. Don't think of using your PC during that time, you won't be able to. If it passes, it's stable. If it does not, then it isn't. Simple as that; if it isn't, you need to reconfigure by lowering something or increasing voltage.
 
Will run realbench @ some point.

For now I re-applied my bios settings after the update including the oc the builder set (4ghz oc to 4.40ghz & ram set to 2900mhz, gfx set to 110%)

First set of pictures is from a cold boot, 3-5 mins into pc starting (os is on an m.2 ssd)
https://ibb.co/dHM6dp
https://ibb.co/hOQzyp
https://ibb.co/b9esJp
https://ibb.co/kUqzyp
https://ibb.co/jYNjW9

Second set was taken at the 15 min or so mark of running prime95 with the Small FFT option
https://ibb.co/jggdr9
https://ibb.co/jJU9yp
https://ibb.co/byU9yp
https://imgur.com/tEanuVi
https://ibb.co/dxc7jU

Third set was from a 15 minute run of hwinfo with the majority of it from just after prime95 was stopped.
https://ibb.co/jRqMB9
https://ibb.co/k26HjU
https://ibb.co/nuzEW9
https://ibb.co/cAxZW9
https://ibb.co/gHx1B9

I have a whole bunch of pics of what the bios is set to like fanspeed options etc... if that is of any use.
 
10 minutes in on scp secret lab (had msi afterburner running to check gpu temp, cpu temps & usage % for each core) & it crashed again, last I looked at the screen stats, the % of usage on each core was no higher than mid 60% at worst.

Temps were between 45-60c.

So something else has to be causing this issue seeing as prime95 ran for 15 mins with 100% core usage, 80-90 temps & no crash.

When the system restarted (it took a min or two) it stated that the oc was unstable & the oc settings had been reset, so I set the oc to off. For some reason it is running at 4.20 atm instead of 4.00, & the ram went back to 2133 I believe. I am still using xmp profile 2 but downscaled to 2800 instead of 2900.

Will load up scp secret lab again & see if there is a crash.
 
Crashed 5 mins into scp this time, set dram frequency or whatever it is to auto which has set it at 2133mhz

CPU seems to be back at 4.00ghz now.

Ok will load again & it will probably crash again, no idea why.
 
Ok another update, this time it crashed after 2 hours running the stock cpu speed & 2133mhz ram without xmp.

Got this in event viewer

https://imgur.com/a/wzzuquZ

Always on scp secret lab. Before this, back in June or so before I put on new thermal paste, it would happen on farcry 4 capped at 120 fps, then 100, then finally 60; it would happen regardless. I ended up uninstalling fc4 so I can't test it, but I bet the results would be the same seeing as it does it on scp.
 
Your CPU is WAY too hot. 80°C is the temperature at which you say "Well crap, something is wrong"

Anything over that, means you are getting into "well, this isn't good" territory. At 90°C, that turns to "bad things are going to start happening soon".

So, something is wrong. Have you had the cooler water block OFF yet, and changed the paste, at all, since owning it?

Have you checked to see that the backplate to CPU cooler mounting bracket fasteners are tight? That the water block to CPU cooler mounting bracket fasteners are tight? I'd check all of that, and if they are tight (Not like, cranked down tight, but very snug) then I'd pull the water block off, clean the thermal pad or thermal paste from both the CPU lid heat spreader and the bottom of the water block, and apply fresh paste. If you don't have any, I'd get some. I can make recommendations if you need to know what kinds are good/preferred.

If the thermal paste HAS been re-applied at some point in the not too distant past and the fasteners are for sure fully seated and snug, then perhaps the pump isn't running or the radiator fans are not configured correctly.

Water pump should be connected to CPU_OPT header and that should be set to full time 100% operation in the bios. Radiator fans should have a splitter and be connected to the CPU_FAN header configured to either a custom profile you like or the performance profile. UNLESS, this cooler has a different arrangement for connections like some of the Corsair coolers do where the pump just connects to either a water pump header or the CPU fan header, at 100%, and then the fans connect to the pump. Would be very good to verify how things are set up and make sure they are right.

No way you should have those temps. I have a 6700k@4.6Ghz on a single finstack Noctua NH-U14S air cooler and my max core temps NEVER exceed 70°C under any circumstances, and usually max out around 68°C unless I run Prime95 and Realbench at the same time, then it goes to 70 but never higher.

My RAM is also configured with custom timings, very tight, down from 15-16-16-35 to 13-14-14-28, at 3000mhz, which tends to also increase thermals, and I still don't have problems. Something is absolutely wrong with your cooling configuration or your cooler. Could even be a pump problem, or air bubble. IDK, but I do know something is wrong unless you simply have way too high of a voltage and/or Load line calibration setting in the bios.
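
To put a number on how much tighter those timings are, first-word CAS latency can be worked out in nanoseconds. A small Python sketch, assuming "3000mhz" here means the DDR4-3000 data rate (3000 MT/s, so a 1500 MHz memory clock); the function name is just for illustration.

```python
def cas_latency_ns(cas_cycles, data_rate_mts):
    """CAS latency in nanoseconds for DDR memory.

    DDR transfers twice per clock, so the clock in MHz is data_rate / 2
    and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


stock = cas_latency_ns(15, 3000)  # from the 15-16-16-35 timings
tight = cas_latency_ns(13, 3000)  # from the 13-14-14-28 timings
print(f"CL15: {stock:.2f} ns, CL13: {tight:.2f} ns")
```

This only covers the first timing value; the other three work the same way in cycles but describe different operations.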
 
Changed the thermal paste in July or August, it seemed tight enough, used - https://www.amazon.co.uk/Thermal-Grizzly-Kryonaut-thermal-grease/dp/B011F7W3LU

Here's the fan settings in bios w/ some voltage levels

https://ibb.co/d0SiuU
https://ibb.co/bBtdTp
https://ibb.co/inUwEU
https://ibb.co/mRvQ8p
https://ibb.co/dimXop

I believe this is from when the cpu was @ 4.40 ghz.

Here are the pwm pins, I believe this is the correct matches:

https://ibb.co/mHRCM9
https://ibb.co/j728uU
https://ibb.co/mqRq8p
https://ibb.co/ncVuZU
https://ibb.co/cj83Tp
https://ibb.co/gc9XM9

I have one of these unused in a box - https://www.amazon.co.uk/Liquid-Freezer-120-Performance-Compatible/dp/B013WAY9UQ/ref=sr_1_9?ie=UTF8&qid=1539494192&sr=8-9&keywords=arctic%2Bfreezer&th=1

Was going to put it in but the case seemed a bit cramped at the time without moving the hdd's around. Should be doable but a pain in the ass.
 
Restarted again on scp on a solo server, was greeted with this screen:

https://imgur.com/VPJEp38

I hit f1 & changed nothing, tried alt+l but both would loop into said screen; I then looked at cpu & that was set to stock, then I checked memory and it was at 2900mhz so I dropped it to 2133 then the pc booted up.

I doubt scp would cause it to crash again straight away but after a bit it will for sure.

I installed whocrashed but it says there are no memory dumps & I unticked automatic restart under startup & recovery but it restarts anyway.
 
Before that I had the same issue with farcry4 back in around June or July, this was after getting about 30% in, the weather was extremely hot around that time which made me think to change the thermal paste. I believe it crashed again anyway (can't recall) so I uninstalled it & assumed the game was buggy.

Same thing now happens on scp though.

I did a test running the cpu at stock & the ram at 2133mhz, also used DDU to do a fresh install of gfx drivers & monitor driver.

Anyway I noticed the 12v running a bit low while the game was running - https://i.imgur.com/yKoj0ym.jpg

11.328 at its lowest after 5 or so minutes
11.424 for a lot of the time.

Now this is at stock so I'd imagine with the 4.40 & 2900 speeds it would drop even lower?

I heard that voltage straying more than 5% either side of nominal is problematic (for the 12v rail that would be below 11.4v or above 12.6v). So I'm wondering if the voltage dipped even further at the time of the crash, & whether it takes longer to crash at stock because the voltage drops more slowly?
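
A quick Python sketch of that 5% check against the two readings above. It assumes the software readings are accurate and that 12.0v is the nominal value; the function names are just for illustration.

```python
NOMINAL_V = 12.0
TOLERANCE = 0.05  # the 5% margin mentioned above


def rail_limits(nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Return the (low, high) voltage window around the nominal value."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def within_margin(reading, nominal=NOMINAL_V, tolerance=TOLERANCE):
    """True when the reading sits inside the tolerance window."""
    low, high = rail_limits(nominal, tolerance)
    return low <= reading <= high


low, high = rail_limits()
for reading in (11.424, 11.328):  # the two readings posted above
    status = "within" if within_margin(reading) else "outside"
    print(f"{reading:.3f} V is {status} the {low:.2f}-{high:.2f} V window")
```

By this rule 11.424 only just sits inside the window, while the 11.328 low falls outside it.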

I know ppl running scp game with way lower spec rigs than I with zero problems involving crashes & the like.
 
Yeah, you need a new power supply. That one is done. 11.4v is throw away voltage. Anything between 11.4v and 11.7v is probably having problems. Anything above 11.7v but below 12.0v is showing signs of maybe beginning to get weak. Technically, it should never drop below 11.7v, ever. 11.4v is definitely a discard reading.
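
Those bands can be written down as a small Python sketch. The cut-off values come from the reply above; how readings exactly on a boundary are treated, and the label wording, are assumptions made for illustration.

```python
def classify_12v(reading):
    """Map a 12v rail reading onto the rule-of-thumb bands in the reply above.

    Boundary handling (<= vs <) is an assumption, not part of the reply.
    """
    if reading <= 11.4:
        return "discard"
    if reading < 11.7:
        return "probably having problems"
    if reading < 12.0:
        return "possibly starting to weaken"
    return "no concern by this rule"


for reading in (11.328, 11.424, 11.9, 12.1):
    print(f"{reading:.3f} V: {classify_12v(reading)}")
```

Only 11.328 and 11.424 are readings from this thread; the other two values are there to show the remaining bands.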

 
Solution