Crashing - Not BSOD - More Frequent Under Heavy Load

MartyrB

Mar 29, 2017
Okay so I've literally tried everything I can think of at this point, from drivers and updates to flashing the mobo.. I'm literally pulling my hair out over this because I can usually fix any issue.

The issue usually occurs under heavy load, but it can also happen under very little load, and it's quite random. Sometimes I can game and stream at full quality for 8 hours without a crash; other times it will crash 2 minutes into a stream and then again 5 minutes later. The screens flash black for a few seconds and then return; most programs seem to reload but others need to be completely restarted. I've never had a BSOD and Event Viewer doesn't seem to show anything about it. My GPU does get a bit hot, but I know that's not the issue: this is the second GPU that's been through this system, and the first one (which died of old age) had the same problem. I've tried literally everything I can think of over the past year. I've just been dealing with it until my new rig is built, but I want to give this one to a friend now, so I would love to fix the issue. I've lost count of how many times I've reinstalled Windows or updated drivers, and I'm almost certain that's not the cause.

Here is my system and temps (I have a lot running atm to show temps at a decent load): https://imgur.com/a/OrJUw
Furmark @ 1080p: https://imgur.com/a/FQrVI

If required I can probably force a crash and monitor w/e you suggest? I have run things like Prime95 for a day without it crashing, so I'm really not sure.

Please help me, think of my friend without a PC <3
 
It can be several things. The fact that it happens more often under heavy load suggests checking the PSU voltages first (part 2 of the test is the important part for you):
https://www.wikihow.com/Check-a-Power-Supply
to see if they're still within tolerances.
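
As a rough illustration of what "within tolerances" means, here is a minimal sketch that compares multimeter readings against the ATX limits. It assumes the commonly cited ATX figures of +/-5% on the +3.3V, +5V and +12V rails and +/-10% on the -12V rail; the function names and the sample readings are hypothetical and not taken from this thread.

```python
# Nominal ATX rail voltages and allowed deviation (fraction of nominal).
# Assumption: +/-5% on the positive rails, +/-10% on -12 V (ATX design guide).
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_limits(rail):
    """Return (low, high) acceptable voltages for the named rail."""
    nominal, tol = ATX_RAILS[rail]
    # sorted() keeps low <= high for the negative rail as well.
    low, high = sorted((nominal * (1 - tol), nominal * (1 + tol)))
    return low, high


def check_reading(rail, measured):
    """True if the measured voltage is inside the allowed band."""
    low, high = rail_limits(rail)
    return low <= measured <= high


if __name__ == "__main__":
    # Made-up readings, ideally taken while the system is under load.
    readings = {"+3.3V": 3.31, "+5V": 5.08, "+12V": 11.32}
    for rail, volts in readings.items():
        low, high = rail_limits(rail)
        status = "ok" if check_reading(rail, volts) else "OUT OF SPEC"
        print(f"{rail}: {volts:.2f} V (allowed {low:.2f} to {high:.2f}) {status}")
```

A reading that only drifts out of the band while gaming or running a stress test is more telling than an idle reading, so it is worth measuring in both states.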
Then check your GPU. You say you ran FurMark, but I'd be surprised if it didn't crash after a while if it's either the GPU or the PSU.
You've run Prime95, so try testing memory next. You can try switching the sticks around, using different slots (a1/b1 vs a2/b2 and so on) and even trying with a single stick to see if you still have the issue. You can run memtest, but it's not conclusive, and a minimum of 8 passes is recommended if you do run it.
As for Event Viewer, look around the timestamp of the crash +/- a few minutes and look for events that say critical (have a red circle with a ! through it).
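
One way to narrow the log down to that window is to query it from the command line instead of scrolling. The sketch below builds an XPath filter for the System log and passes it to the Windows `wevtutil` tool. The helper names, the 5-minute window and the example timestamp are assumptions for illustration; the levels 1, 2 and 3 correspond to Critical, Error and Warning.

```python
import subprocess
from datetime import datetime, timedelta, timezone


def build_event_query(crash_time_utc, minutes=5, levels=(1, 2, 3)):
    """Return an XPath filter for events within +/- `minutes` of the crash.

    Event log timestamps are stored in UTC, so pass a UTC datetime.
    """
    start = crash_time_utc - timedelta(minutes=minutes)
    end = crash_time_utc + timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    level_expr = " or ".join(f"Level={n}" for n in levels)
    return (
        f"*[System[({level_expr}) and TimeCreated["
        f"@SystemTime>='{start.strftime(fmt)}' and "
        f"@SystemTime<='{end.strftime(fmt)}']]]"
    )


def query_system_log(crash_time_utc, minutes=5):
    """Run wevtutil (Windows only) and return the matching events as text."""
    cmd = [
        "wevtutil", "qe", "System",
        "/q:" + build_event_query(crash_time_utc, minutes),
        "/f:text", "/rd:true",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    # Placeholder crash time; replace with the moment the screens went black.
    crash = datetime(2017, 10, 23, 21, 4, 7, tzinfo=timezone.utc)
    print(build_event_query(crash))
```

Including warnings matters here, because a display driver timeout is logged as a warning and would be missed by a filter that only shows critical events.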
 
The PSU is well above what is required for the system (750W) as I planned to use it for a future system, but have decided to just completely rebuild now.

I don't believe it's the GPU as I've had the exact same problem with my old GPU which was a few years older than my current.

I have tried every memory combination, including a friend's memory, so we can rule that one out.

I don't have any critical events, I've checked this numerous times.. All I get are warnings which say "Display driver amdkmdap stopped responding and has successfully recovered."

No matter how many tests I run, I have never crashed during them. This leads me to believe a particular type of program is causing it, as I obviously close everything when running tests. Either that or a mobo problem? I'm not sure how I can run tests for a mobo..
 
Wattage of the PSU is not the deciding factor. It's the quality. Once the voltages start varying far beyond tolerances, you get system instability, unexpected crashes, especially when the system gets a bit loaded. Max wattage won't really factor in.
There isn't really a test for the mobo. You can either reset the mobo settings to default or update the BIOS, and that's about all you can do yourself.
If you think it's software related, you can create a live usb version of ubuntu that you load on boot from a usb and it doesn't install anything. It's gone once you reboot and take out usb stick. In a space of a single boot it can allow you to stress test your system, under load and see if the same thing happens. If it does, with a different OS and different drivers, it's more likely it's hardware.
 
I've reset mobo settings as well as updated the BIOS and everything. More recently I even tried turning off power saving settings on my CPU, as I read that they can cause crashes in some systems, but of course that didn't work either.

This is my PSU: http://www.antec.com/product.php?id=706266&pid=51
Wouldn't turning off all my power saving settings cause more frequent crashes due to more power consumption? Not to mention my last GPU chugged like 3x the power..

I've actually run Linux on this system for a few weeks with no crashes, come to think of it.. is that a sign that it's software related? Would software conflict with my hardware at all even if it's up to date and works fine for others?

I'll try forcing a crash now with event viewer up and post any results.
 
I'm getting an error with that link.
Power saving settings can actually cause issues. Sometimes they'll try to limit the CPU, and if you're trying to push it with high load, that can create problems. Your power settings in Windows should be on Balanced. Sleep enabled but not hibernate. No extra power savings in Windows or on the CPU. If you've got an eco mode on your PSU, try not using it and see if it helps.

Linux is...a lighter system I find. But if you've done all the same things, as in pushed the system just as hard from linux, then yes it could be software/driver related. It could even be certain settings which are on default in one OS and on a different default in the other.

Did you always have this issue? Did it only start happening recently? How easy is it for you to back up your important files and do a fresh Windows reinstall?
 
I've reinstalled windows about 5 times this year lol. When I said I turned off power saving settings for the CPU I meant via bios. I don't recall if it started as soon as my system was made a couple years back but I don't remember not having crashes if that helps lol?

The only hardware that hasn't changed in the system since it was first made is the CPU/MOBO, everything else has changed over the course of a few years..

I'm actually gaming/streaming right now with all my apps open and can't get it to crash.. it's so random, sometimes I can go a few days without one, other days it happens every few hours, I've even crashed twice in a row before.. like I'll have just a game open and everything else closed in fear of crashing but still do and now everything open with no crashes zzz..

Current temps/usage: https://imgur.com/a/9xYIH

PSU: Antec HCG-750M
 
Alright I got it to crash lol.

This time the screens went black and didn't come back on, so I waited a minute before restarting. This happens rarely, probably due to the driver trying to recover but there was too much running for it to do so? I have no clue but it usually happens this way when I have a lot of shit open.

Again not a thing in Event Viewer, just it telling me "The system has rebooted without cleanly shutting down first." obviously because I had to force restart it.

The whole time my music did not stop playing or skip at all if that helps? I have the game open, streaming @ 6,000 bitrate 30 fps, watching the stream, youtube music @ 1080p playing, multiple other browser tabs open, discord open, blizzard app open, steam open, speccy open, not really sure what else but yeah..
 
Yeah those temps are ok.
If you're reinstalling so often, try this. Right click on the C drive (I'm assuming your OS is on C), go into Properties > Tools and then do a disk check. Let it repair anything it finds, if it finds issues.
Also, how much space do you have on your drive? What's the total volume? Does the manufacturer of the drive have a utility for a disk health check? Run that as well.
Just to cover all the bases. Not saying this is the issue but since it's hard to pinpoint what it is, and you're waiting for a crash/event ID, might as well?

Edit: Wait, look in the event viewer around the timestamp of the crash, there should be at least one event that says critical with a red circle and ! through it.
 
Both are SSDs, 120GB & 240GB. Even at like 25% usage on each I've had crashes. I also recently did a 'sfc /scannow' with no errors, and I scanned both drives through SpeedFan, also with no errors. I think the drives are okay lol.

Will event viewer post logs of the crash even after I've restarted? Because everything here is after the restart.

The only "critical" (which has an x through it btw) was due to me hitting that restart button to force restart. (I believe anyway lol)

"The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."

- System

- Provider

[ Name] Microsoft-Windows-Kernel-Power
[ Guid] {331C3B3A-2005-44C2-AC5E-77220C37D6B4}

EventID 41

Version 5

Level 1

Task 63

Opcode 0

Keywords 0x8000400000000002

- TimeCreated

[ SystemTime] 2017-10-23T21:04:07.912311300Z

EventRecordID 11101

Correlation

- Execution

[ ProcessID] 4
[ ThreadID] 8

Channel System

Computer DESKTOP-7P35ILG

- Security

[ UserID] S-1-5-18


- EventData

BugcheckCode 0
BugcheckParameter1 0x0
BugcheckParameter2 0x0
BugcheckParameter3 0x0
BugcheckParameter4 0x0
SleepInProgress 0
PowerButtonTimestamp 0
BootAppStatus 0
Checkpoint 0
ConnectedStandbyInProgress false
SystemSleepTransitionsToOn 2
CsEntryScenarioInstanceId 0
BugcheckInfoFromEFI true

This was the first log after restart "The operating system started at system time 2017-10-23T21:04:06.490243100Z." - Not a thing prior to that for a few hours.

There are 2 red "error"s (with the ! in them)

1st; "The previous system shutdown at 7:37:28 AM on 24/10/2017 was unexpected." - that time is wrong as it happened at 8:03AM ???

2nd; "The CldFlt service failed to start due to the following error: The request is not supported."

 
1st disk check;

Chkdsk was executed in scan mode on a volume snapshot.

Checking file system on C:

Stage 1: Examining basic file system structure ...

220672 file records processed. File verification completed.

8143 large file records processed.
0 bad file records processed.
Stage 2: Examining file name linkage ...

286044 index entries processed. Index verification completed.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.

32687 data files processed. CHKDSK is verifying Usn Journal...

38274240 USN bytes processed. Usn Journal verification completed.

Windows has scanned the file system and found no problems.
No further action is required.

116706303 KB total disk space.
92676780 KB in 136110 files.
97384 KB in 32688 indexes.
330871 KB in use by the system.
65536 KB occupied by the log file.
23601268 KB available on disk.

4096 bytes in each allocation unit.
29176575 total allocation units on disk.
5900317 allocation units available on disk.

----------------------------------------------------------------------


Stage 1: Examining basic file system structure ...

Stage 2: Examining file name linkage ...

Stage 3: Examining security descriptors ...

2nd disk check;

Chkdsk was executed in scan mode on a volume snapshot.

Checking file system on D:

Stage 1: Examining basic file system structure ...

106752 file records processed. File verification completed.

2 large file records processed.
0 bad file records processed.
Stage 2: Examining file name linkage ...

122962 index entries processed. Index verification completed.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.

8105 data files processed.
Windows has scanned the file system and found no problems.
No further action is required.

233958172 KB total disk space.
198569864 KB in 76858 files.
24224 KB in 8107 indexes.
179860 KB in use by the system.
65536 KB occupied by the log file.
35184224 KB available on disk.

4096 bytes in each allocation unit.
58489543 total allocation units on disk.
8796056 allocation units available on disk.

----------------------------------------------------------------------


Stage 1: Examining basic file system structure ...

Stage 2: Examining file name linkage ...

Stage 3: Examining security descriptors ...
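
For the question about free space, the chkdsk totals above already contain the answer. A small sketch of the arithmetic follows, using only the "total disk space" and "available on disk" figures from the two reports; the function name is made up for illustration.

```python
def percent_free(total_kb, available_kb):
    """Share of the volume that is still free, as a percentage."""
    return 100.0 * available_kb / total_kb


# (total KB, available KB) copied from the two chkdsk reports above.
volumes = {
    "C:": (116706303, 23601268),
    "D:": (233958172, 35184224),
}

if __name__ == "__main__":
    for name, (total_kb, available_kb) in volumes.items():
        print(f"{name} {percent_free(total_kb, available_kb):.1f}% free")
```

This works out to roughly 20% free on C: and 15% free on D:, so both drives are considerably fuller now than the 25% usage mentioned earlier.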
 
Yeee... Kernel-Power 41 (task 63) usually points to a hardware failure or a failure of the driver for said hardware. Here's the checklist that you usually need to go through:
https://social.technet.microsoft.com/Forums/windows/en-us/308cbcb3-46ce-4f74-85f9-d87ce4cef0d6/kernel-power-event-id-41-task-category-63-spontaneous-improper-shutdowns-and-reboots?forum=w7itproperf
but considering you've already checked overheating and memory, that leaves you with OC (turn off any and all overclocks and check stability) and the PSU. Assuming you've already thought to reset the BIOS and turn off all overclocks, we're back to the PSU lol. I've checked your model and it's pretty good quality according to http://www.tomshardware.com/forum/id-2547993/psu-tier-list.html so I'd be surprised if it crapped out this quickly, as you say you've had issues since very early on.
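
The EventData fields in the posted log can be read a little more precisely. The sketch below follows the three scenarios described in Microsoft's Event ID 41 troubleshooting guidance: a non-zero BugcheckCode means a stop error was recorded, a non-zero PowerButtonTimestamp means the power button was held, and both at zero means a hang or power loss with no further detail. The function name and return labels are hypothetical.

```python
def classify_event_41(event_data):
    """Classify a Kernel-Power 41 event from its EventData fields.

    Values may be ints or strings such as "0" or "0x0".
    """
    bugcheck = int(str(event_data.get("BugcheckCode", 0)), 0)
    power_button = int(str(event_data.get("PowerButtonTimestamp", 0)), 0)
    if bugcheck != 0:
        return "stop_error"          # a bugcheck was recorded before the reboot
    if power_button != 0:
        return "power_button_held"   # the power button was pressed and held
    return "hang_or_power_loss"      # no evidence recorded: hang or lost power


if __name__ == "__main__":
    # Field values as they appear in the event posted in this thread.
    posted = {"BugcheckCode": "0", "PowerButtonTimestamp": "0"}
    print(classify_event_41(posted))
```

With both fields at zero, the event from this thread falls into the last group, which is consistent with a hard hang followed by a manual reset and does not by itself single out one component.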

How about...hm. How's the wiring in your place? Do you plug the monitor and PC into the same wall outlet? Do you have a power bar with a surge protector on it? But that would be odd. If it was random, maybe; if it's only happening under load, that would not be due to surges. Then again, if it's an old place and the wiring is crappy...I don't know. Running out of ideas really...
 
Yeah it's a pretty hard problem to solve hence why it's stressing me out.. I use a power board and always have (only desktop and 2 monitors connected, 1 is low power), if I was in my old house I'd say the wiring wasn't the best as it was a pretty old place but here where I am now it's fairly new. I don't see that as a problem though due to other desktops with far more usage like my housemates for example also running off a single power board. I also built his rig and am quite upset at how well it runs lol.

I've never OC'd this, I never felt the need to.. and yeah I built the machine as I have some experience and worked in computer maintenance/repair hence why the PSU is decent, that's the one thing I made sure of but I kept the budget in mind also.

My drivers are obviously up to date.. I install them directly from the manufacturer and then use Driver Booster to grab any missing ones. My mobo is up to date; the only issue I can think of is that I used the incorrect BIOS version for a while.. could that cause permanent damage?

I will try a lot of what's in that Microsoft link, it seems a few people are having my issue.. or at least close to it

"As you can see from my previous post, I have replaced essentially everything and I still have the problem. I have a Gigabyte mobo. In the BIOS there is a Power Management setting called Power Loading. It states in the Gigabyte manual that if the computer PSU is at a low load, a self protection will activate causing it to shutdown or fail. If this occurs. please set to ENABLED." - I will check for this now lol.
 
I looked through but nothing jumped out at me as odd. Was that first pic what you get when you click on Power Management tab up top?
What do you mean incorrect bios?
Also, you mentioned your old place had crappy wiring. Did you have your PSU in that old place, or did you build the system after you moved? But you also said you used a power board. Presumably with surge protection? So this would be less of an issue.
 
Yeah the first picture is power management.

I believe I was running it in AM3 mode rather than AM3+ which required an update. "To enable AM3+ AMD FX-Series CPU support, please update your motherboard with the most current BIOS found in your motherboard’s download section."

The system was made prior to the house with "bad wiring", although it wasn't bad it was just an old place.

And yeah always surge protection.
 
I don't think anything was damaged because you had an older version of bios. At most, it would have introduced some instability that could have accounted for some of the crashes, but that should have gone away with bios update.

As for bad wiring, I had a rig once in a place that had horrid wiring, and totally ruined my psu as a result even though I moved some time after. Surge protection should have mitigated the worst of it though. I'd still get a voltmeter and check it, cause at this point there isn't much else you can do.
 
Solution