System freeze after 5/10 minutes of graphics intensive games/programs. NOT heat related

parkurtommo

Jul 26, 2013
For a few months now I've been having an issue with freezing/crashing during games and programs that are graphically intensive. Very reliably, my computer will completely freeze up after maybe 5-10 minutes, nothing works, only solution is to reset with the power button. There are no crash dump files. Sometimes, it will be a BSOD, but the BSOD also freezes, at 0%.

I've been monitoring my temps when it freezes and everything is absolutely fine there: GPU hovering around 60°C at 100% use, CPU around 50°C.

The problem was not always there, I built this computer last December and it was working fine while playing The Witcher 3 for quite a while. Can't recall when the problem started exactly, because I've been too busy to play games this year.

Build:
Windows 10 Home 64-bit
Intel Core i5 6500 @ 3.20GHz
16GB RAM
Gigabyte Z170-Gaming K3 (U3E1)
NVIDIA GTX 970

Again, built in December last year, although I reused the gpu from my last build.

I've tried things like updating the NVIDIA drivers and making the page file larger. Any ideas?

Thanks.
 
Solution
That's not great. At least you know it's likely your GPU (or your PSU starving the GPU, which could have the same effect), so you can focus on that and not the CPU. Since it's not the temps and you're not overclocking, I'm inclined to think there's something wrong with the card itself. Is an RMA possible? Contact NVIDIA (or the card's maker, if it's a third-party card), explain your issues and your test results, and ask whether anything short of an RMA can address the problem. If not, RMA it if you can.


As I said in my post, my temperatures never get anywhere close to overheating.
 


I checked around the time of my last freeze and there is nothing besides the basic critical error "The previous system shutdown at (time) on (date) was unexpected."

Also like I said there is no crash dump info, so my impression is that the computer is freezing from some hardware issue.
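With no dump file and only the generic "unexpected shutdown" entry, one way to at least pin down the exact moment of each freeze is a small heartbeat logger left running in the background. This is only a sketch and not something from this thread: the file name and the one-second interval are made-up defaults, and drive write caching could still lose the very last line.

```python
import os
import time
from datetime import datetime


def write_heartbeats(path, interval_s=1.0, max_beats=None):
    """Append one timestamp per interval and force each one out to the drive.

    max_beats=None means run until the process (or the machine) stops.
    Returns the number of timestamps written.
    """
    beats = 0
    with open(path, "a", encoding="utf-8") as log:
        while max_beats is None or beats < max_beats:
            log.write(datetime.now().isoformat(timespec="seconds") + "\n")
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to write it to the drive
            beats += 1
            time.sleep(interval_s)
    return beats


def last_heartbeat(path):
    """Return the final timestamp line in the log, or None if it is empty."""
    with open(path, encoding="utf-8") as log:
        lines = [line.strip() for line in log if line.strip()]
    return lines[-1] if lines else None
```

Start write_heartbeats("heartbeat.log") before launching the game. After the forced reset, last_heartbeat("heartbeat.log") gives the last second the system was still alive, which can be lined up against the temperature log from the monitoring tool.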
 


haha, my bad. Maybe the GPU is getting overloaded?
Try underclocking the GPU first, pretty low.
If that fixes it, raise the clock a bit at a time until the freezing starts again.
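The underclock-then-raise procedure above can be sketched as a simple loop. Everything here is an assumption for illustration: is_stable stands for a manual test session at a given clock offset (for example, 30 minutes of a game without a freeze), and the -400 MHz starting point and 50 MHz step are arbitrary.

```python
def find_highest_stable_offset(is_stable, start_mhz=-400, stop_mhz=0, step_mhz=50):
    """Raise the clock offset from a deep underclock until the first failure.

    is_stable is a caller-supplied function (hypothetical here): it takes an
    offset in MHz and returns True if the system survived a test session.
    Returns the last offset that passed, or None if even the lowest one failed.
    """
    last_good = None
    offset = start_mhz
    while offset <= stop_mhz:
        if not is_stable(offset):
            break  # first failing offset: stop raising the clock
        last_good = offset
        offset += step_mhz
    return last_good
```

A result of None means the system froze even at the deepest underclock, which points away from clock speed as the trigger.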
 

Tried underclocking it by 300 MHz and it froze all the same.
 
I'd think RAM, and you haven't mentioned whether you were overclocking it, but if that were the case it would have been happening right from the start. The fact it's taken a bit of time suggests a failing component, but if the temperatures are OK that's hard to imagine too (though it happens; parts do have a limited lifetime, even with care). With no errors of any kind it's hard to say what's going on. The only thing left to do is to swap in different parts (GPU, RAM, CPU) from another computer, one at a time, until it stops happening and you see which component it was.
 


Well, I just opened up my case, cleaned it out a bit with compressed air, made sure everything was properly connected, and swapped the RAM modules' places. Tested Shadow of Mordor (which was consistently causing my PC to freeze after about 7 min) and so far so good. I will update if it freezes again at some point. But if I broke the pattern by reseating the RAM modules then maybe that was the issue after all; I guess maybe I didn't have them properly seated. We shall see. Thanks for the help!
 
The freezing returned! 🙁 I managed to play about 1 hour of The Witcher 3 before it froze again, then when I tried again it froze after about 3 minutes. This kind of inconsistency really makes me think of overheating, but there's no way, unless somehow the PSU is overheating?

Anyways, I don't have any spare parts to swap with, so I think I'll just refrain from playing games until I can afford to get some new parts!
 

Corsair Vengeance LPX 16GB (2x8GB) DDR4 PC4-17000C13 2133MHz Dual Channel Kit - Black (CMK16GX4M2A21
Yeah, don't know the timings.
It's at the original frequency, 2133 MHz. I'm using the dual channel slots furthest from the CPU.

I ran memtest today and it passed without errors.
 
I don't know about voltage, though if your temperatures are fine I wouldn't think it was an issue, unless you've overclocked the card or manually changed voltages and that has made it unstable. Your card should run fine at defaults, since games (and your video settings) adjust automatically to the best settings the card can give without compromising performance, so you can expect the card to be utilized pretty thoroughly. If you're unsure, you could try lowering the settings in your games and seeing if it makes a difference.
When stress testing your memory, give it multiple passes. One pass won't tell you much. If you want to stress test the CPU you can try Prime95. You can use FurMark to stress your GPU, but I'd think lowering settings in your games will pretty much give you the answer as to whether it's GPU stress, since the problem will disappear if it is. Turn off any overclocks if you currently have them set.
Since you're testing everything, I'd get a testing utility from whoever manufactured your drive/ssd and test it for bad sectors too.
If everything fails and tests show nothing wrong, it could also be an issue with your Windows install. Try logging into a different account on the same install. If there are still issues, you could try tracking them down one by one, though unless it's something I can point to and try to solve, I usually wipe and reinstall Windows, as it often doesn't take longer than troubleshooting individual problems, and I keep all important files on separate drives/partitions so only system files and some programs get reinstalled.
Finally, and this really should be the first thing to look at, not final one :), check the game itself. Is there some kind of bug wherever you're getting stuck? Are there issues reported with your particular card and that game? With the drivers? I know you've played it a while but drivers get updated, and games have glitches. Do you freeze on other demanding games?

 

Thanks for the advice! I'll try all of those things and come back. Yes, it happens in pretty much anything that is using a lot of resources, including stuff like photoshop or 3d software.

I considered a fresh install of windows but I have way too many important programs that I would need to reinstall and reconfigure for my work, if I do that I would probably lose a full day to it. But I'll give it a try later on as a last resort!
 


So I ran the Prime95 stress test (mixed), it froze after about 10 minutes, in the same way that it does when playing games. I suppose cpu is to blame, then?
 


Tried the test with higher power consumption and less RAM (since RAM is clearly not the problem). It froze on the second test (5 minutes after starting) at 100% CPU usage and 80°C. The highest it reached was 89°C, which does seem dangerously high, but it never gets this high when playing games; max is maybe 75°C. I am using the cooler that came with the CPU, so of course temps will not be great.

Also, I tried switching to a different windows user and tried Witcher 3 there on low settings, it still froze after about 20 minutes.
 
Humm it's interesting it froze sooner with less ram, but still yeah, those are not danger level temperatures. The fact it's freezing doesn't look great for it.
Is your cpu under warranty?
Btw you're running small FFTs and not the blend test, right?
 


The first 2 tests I ran were blended, then large FFTs (which crashed faster), now I tried small FFTs and it crashed after 10 minutes all the same.

Yes, it should be within warranty, although I disposed of the packaging. I'm chatting with customer support right now!
 
So Intel support is of no help because according to their diagnostic tools and stress tests, the CPU is fine... No errors, no freezing.

Kinda lost here. I still have a sneaking suspicion that this is my PSU.
 
Be sure to update your motherboard sound drivers from your motherboard vendor's website.
It is pretty common for the motherboard sound driver to conflict with the GPU sound driver or the CPU's sound support for the internal GPU. Most of these conflicts were fixed after Windows 10 shipped, and the fixes have to be installed from the motherboard vendor's website. They would not be provided via Windows Update.

Basically, the bad driver responds to sound events that it should not. That overflows the buffer in the GPU's sound support for DisplayPort or HDMI, and the graphics driver stops responding. It still responds to Windows DirectX, though, so Windows does not detect that the driver is hung and does not attempt to reset it.
 
Doesn't seem to be the issue, I updated the sound driver and the freezes are still happening.

 
If an actual crash memory dump does not complete, I would check that the drive has enough free space, then make sure c:\pagefile.sys is large, about the size of the actively used memory. I might also move the c: drive to the primary SATA port (the one that is directly supported by the CPU, which is often the slower SATA controller), or I would update the drivers for the CPU chipset and any external USB 3.0 chips. If you have an SSD you might look for firmware updates, but I generally only see this problem with old firmware releases from before 2013.
It would also be a good idea to download and run crystaldiskinfo.exe to read the SMART data from the various drives. (It also shows the firmware number.)
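The free-space and pagefile checks above are simple enough to sketch in a few lines. This is a rough illustration and not an official tool: the rule that both free space and pagefile should be at least the size of the memory in use is my reading of the advice above, and the pagefile size and memory-in-use numbers have to be read from Windows yourself (for example from Task Manager and the virtual memory settings).

```python
import shutil

GIB = 1024 ** 3


def check_dump_prerequisites(free_bytes, pagefile_bytes, memory_in_use_bytes):
    """Return a list of human-readable problems; an empty list means OK.

    Assumed rule of thumb: both the pagefile and the free drive space should
    be at least as large as the memory actively in use.
    """
    problems = []
    if pagefile_bytes < memory_in_use_bytes:
        problems.append("pagefile is smaller than the memory in use")
    if free_bytes < memory_in_use_bytes:
        problems.append("drive has less free space than the memory in use")
    return problems


def free_bytes_on(path):
    """Free space, in bytes, on the drive that holds the given path."""
    return shutil.disk_usage(path).free
```

For example, check_dump_prerequisites(free_bytes_on("C:\\"), 16 * GIB, 8 * GIB) returns an empty list when a 16 GiB pagefile and the current free space can both hold 8 GiB of in-use memory.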
----
There are a lot of causes of a system "freeze"; sound driver conflicts messing up the GPU are just the most common.
With newer motherboards the USB subsystems can be running through the PCIe bus and disrupt the GPU, and SSD firmware can do the same. The best thing to do would be to start cmd.exe or PowerShell and turn on the Windows debugging flags via:
verifier.exe /standard /all
Then change your memory dump type from minidump to a kernel memory dump.
Then google how to force a memory dump via keyboard, make the registry setting changes, and reboot the system.
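For reference, the dump-type and keyboard-crash settings above come down to a handful of registry values. The key paths and value names below are not from this thread; they are what I understand Microsoft's documentation to describe, so check them against the official pages before applying anything. The sketch only builds the list of changes; apply_changes is Windows-only and needs an administrator prompt.

```python
# Registry locations under HKEY_LOCAL_MACHINE (assumed from Microsoft's docs).
CRASH_CONTROL = r"SYSTEM\CurrentControlSet\Control\CrashControl"
KEYBOARD_KEYS = [
    r"SYSTEM\CurrentControlSet\Services\kbdhid\Parameters",    # USB keyboards
    r"SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",  # PS/2 keyboards
]
KERNEL_DUMP = 2  # CrashDumpEnabled: 2 = kernel memory dump, 3 = small (mini) dump


def planned_changes():
    """Return (key path, value name, DWORD value) tuples for the steps above."""
    changes = [(CRASH_CONTROL, "CrashDumpEnabled", KERNEL_DUMP)]
    for key in KEYBOARD_KEYS:
        changes.append((key, "CrashOnCtrlScroll", 1))
    return changes


def apply_changes(changes):
    """Write the values to the registry. Windows only, run as administrator."""
    import winreg  # imported here so the sketch still loads on other systems

    for key_path, name, value in changes:
        with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, key_path, 0,
                                winreg.KEY_SET_VALUE) as key:
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
```

After a reboot, the documented key combination is to hold the right Ctrl key and press Scroll Lock twice. Print planned_changes() first and compare it with the documentation before calling apply_changes.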

The next time the system locks up, force a memory dump, put it on a cloud server like Microsoft OneDrive, share it for public access, and post a link. Someone with a debugger can see where the system is hanging or look for why the system is not responding correctly.

Note: be sure you know how to get into Windows safe mode, in case the system bugchecks during boot. This is so you can run the command verifier.exe /reset to turn off the special debugging.

The minidump can also be useful, since the driver table can be looked at in the debugger and overclocking drivers and very old drivers are easy to identify. The kernel memory dump will include info on all of the processes running on all of the CPUs and will show proper info on the USB subsystem.

I have also seen old networking drivers cause graphics to lock up if NVIDIA ShadowPlay or even a virus scanner was running. The drivers would return an error and not process certain packets, and the virus scanner would not let other processes get access to the packet until it was scanned. (Generally the network driver was over 4 years old.)

 

Thanks for the detailed reply. I tried doing the manual memory dump, it did trigger a crash but there are no dump files, I tried switching the memory dump file location and still nothing. Also, the manual memory dump doesn't trigger when the system is frozen, it's completely unresponsive.

Is the lack of memory dump perhaps a pagefile issue?

Also since yesterday I'm getting very poor performance in general, cpu is constantly between 50-100% use despite only running basic programs like chrome and spotify. A bit of input lag in photoshop which is making my job a pain. Diagnostics still show nothing incl. stress tests. I've noticed that the process using my cpu consistently at 10%-20% is something I haven't seen before: WmiPrvSE.exe

I tried restarting in safe mode and the cpu performance/utilization is normal. Malware? :/