Computer Black Screen Crashing When GPU Under Load

Status
Not open for further replies.

r3g3n24

Commendable
Aug 30, 2017
7
0
1,510
Hello everyone,

I am having a problem that seems to be related to my ASUS GTX 970 STRIX OC.
Since 21 August, whenever I stress my GPU or just play games that are more or less GPU-intensive (Rainbow Six, League of Legends), my PC crashes after a little while, leaving me with a black screen (no audio loop). The PWR LED on the computer stays on, and the only way to start it up again is to hold down the power button or switch the power supply off. This does not happen while the PC is idle or under minimal load (e.g. with Chrome and Outlook open and in use, or with a non-graphics-intensive game such as Hearthstone running alongside them).

I'd like to mention that I've had this GPU since November 2014 and it has only started acting up recently, so I don't think it's a problem with the GPU itself, improper seating, or any other issue that would obviously have occurred at some point since I bought it three years ago.

I've been able to reproduce this crash with Heaven, Furmark and Aida64. I also have multiple Aida64 logs up until the moment of the crash.

Also interesting to note is the fact that the GPU load itself does not seem to matter. A Furmark stress test with 1280x720 and no Anti-Aliasing causes the crash to happen in about the same time as a Furmark stress test with 1920x1050 and MSAA 8x. (even though, obviously, during said tests I get an average of 115 fps and 30 fps respectively)

I don't think the crashes have anything to do with voltages or temperatures, because I've monitored the behavior of the PC before the black screen and both the voltages and the temperatures were well within the acceptable ranges and constant.


So far I've tried reinstalling the video drivers, installing old video drivers (the ones from before the issue started appearing), uninstalling the audio drivers, enabling the onboard Intel GPU, dusting the computer, reconnecting all wires, reseating the RAM and testing the sticks one by one (each and every one produced crashes), applying new thermal paste on my CPU, changing the PSU slots, updating the motherboard BIOS and GPU BIOS, running 2 passes of memtest, underclocking my GPU by 10/20/50 MHz, and lowering my GPU to 62% power (the lowest my ASUS GPUTweak would let me go).

I also tried replicating the crash with my old GTX 650. It did not crash: I stopped the Furmark stress test after 5 minutes, because with the GTX 970 the system would usually crash after 2-3 minutes.

My Specs (Via Speccy): http://speccy.piriform.com/results/yda1W1fCIge5UviBNS2RIlF

I don't want to RMA because it would take a long while before it came back, and I'm supposed to leave for university next month.



Thanks in advance for your help.



Regen
 

EpIckFa1LJoN

Admirable
Stress test the CPU without the GPU installed. AIDA64 should be fine. League of Legends is the exact opposite of a GPU-intensive game; in fact, my GPU barely gets used in it. It's extremely interesting that it causes a crash, since the GPU should be more or less idle in that game.

Other than the CPU, there's nothing else it could be besides the GPU.

And just because it has worked fine for almost 3 years means nothing. Chips degrade over time and need to be replaced. That's just how it goes. 3 years is considered the short end of the lifespan of a GPU but they can and do go bad as soon as that, especially if they are heavily OC'd like your Strix is.
 

r3g3n24

I almost never OC'd it before, forgot to write that down. Should I use the AIDA64 benchmark or the Geeks3D CPU burner? Do you want me to upload the logs and link them here or just tell you my diagnosis after looking at them?
 

EpIckFa1LJoN

Just see if it crashes at all. If it doesn't, the CPU should be fine and it's almost definitely the GPU. AIDA64 is fine for stress testing the CPU, but if you want to fast-track it and torture it, run Prime95 with Small FFTs for about 10 minutes. It's going to push the CPU HARD. If it passes about 10 minutes of that, it should be fine.
 

r3g3n24

11 minutes of Prime95 done, no crash. Of course, the CPU was at a constant 99-100 °C after the first minute, with throttling that went up to 50%, but no crash.

One thing I forgot to mention: if the computer is left off for a while (say 1h+), the crash occurs later than it would have if the PC had been in use just before. It's as if giving it time to 'relax' lets it perform more/better before the crash. Since it's not the temps/voltages, I'm not sure why that would happen.
 

blockhead78

Distinguished
I know you've tried re-installing nvidia drivers, but did you do a DDU wipe?

If not, it's worth a try.

If that didn't work, or you've already tried it and given the troubleshooting you've already done, then the most common things in this scenario are faulty GPU or PSU.

Although, if the GTX 650 requires a PSU connection and is not solely powered through PCIe, then it would most likely suggest it's the 970 that has developed a fault.

If you can get access to another PC that is up to spec to try the 970 in, that's the surefire way to find out.
 

r3g3n24

I did multiple DDU wipes (To get all drivers off, and then install the 970 ones, then to get the 970 drivers off and install the 650 ones in order to test it, and then back to the 970 drivers)

Other than the Aida64 voltage meters, I'm not sure how to test my PSU. The only interesting thing I noticed was that under a GPU stress test, the +12V rail dropped to around 11.952-11.880 V, but I'm not entirely sure how important that is, or if it's meant to do that.
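
For reference, the readings quoted above can be checked against the ±5% tolerance that the ATX specification allows on the +12 V rail (11.40 V to 12.60 V). The sketch below is only illustrative: the function name `rail_status` is made up for this example, and software sensor readings such as AIDA64's are approximate and can miss short transient dips, so an in-spec result does not by itself clear the PSU.

```python
# Illustrative check of +12 V readings against the ATX +/-5 % tolerance.
# rail_status is a hypothetical helper name, not part of any tool.
NOMINAL_V = 12.0   # nominal voltage of the +12 V rail
TOLERANCE = 0.05   # ATX allows +/-5 % on the +12 V rail


def rail_status(reading_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """Return the allowed band, the deviation in percent, and an in-spec flag."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    deviation_pct = (reading_v - nominal_v) / nominal_v * 100
    return {
        "low": low,
        "high": high,
        "deviation_pct": deviation_pct,
        "in_spec": low <= reading_v <= high,
    }


# The two readings reported under GPU load in this thread.
for reading in (11.952, 11.880):
    status = rail_status(reading)
    print(
        f"{reading:.3f} V: {status['deviation_pct']:+.2f}% from nominal, "
        f"in spec: {status['in_spec']}"
    )
```

Both readings sit well inside the tolerance band, at roughly 0.4% and 1.0% below nominal.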

I've tried reaching out to a friend and he might be able to help me install the 970 in his PC, but I'm not entirely sure when that's going to happen.
 

EpIckFa1LJoN

That's extremely interesting... I've never heard of a problem where leaving it to idle alleviates the symptoms.....

Try playing some of the games with integrated graphics, the FPS is going to suck but if it doesn't crash, it's definitely the GPU (or possibly the PSU but that is less likely). And since you ruled out drivers.... the next step would be to give it a thorough cleaning with a can of air.

If cleaning the GPU doesn't work... it's not a good situation and it most likely will need to be replaced.
If none of the above works I would take it to a computer repair store so they can drop it in one of their machines and test it. They will be able to tell you if it can be fixed or if you need to RMA or just get a new one altogether.


But yeah... give it a good cleaning.. it may be overheating and leaving it idle lets it cool off a bit, as the booting process does run the chip just a bit and will lead to some overheating if the cooling system is not working properly. If you never clean it it will get clogged with dust and effectively choke it out.
 

r3g3n24

I cleaned it a couple of days ago and the temperature readings stayed the same. Overheating was my first guess so the first thing I did was to amp the fans to 100%. The crash persisted, happened in the same timeframe of 2-3 minutes after starting the benchmark tests. Interesting thing to note was when I ran AIDA64 just GPU, the crash happened normally, in the aforementioned timeframe. When I ran AIDA64 GPU + CPU + FPU, it crashed in about 20-30 seconds. Maybe a one time thing. Not sure.

Also not sure if I could blame it on temperature for a whole other reason. Seeing as how I can run low-end games and software for any amount of time without a crash, I'd think that points to the fact that it's not a constant build up of heat that eventually suffocates it.
 

EpIckFa1LJoN

What PSU do you have? If it isn't a really high-end model, it could be degrading by now. If the 12V rail is going bad, that could cause the system to shut down because it isn't supplying enough power to the MB/CPU. Since the problem only occurs with your higher-end GPU and not with your basic GPU, that is a possibility, and if you are running a program that draws heavily on both, it would also shut down faster.


You could also try setting the power management mode in the control panel to the recommended setting (balanced I think) in both Windows control panel and NCP.

It's worth a shot. That actually helped me with some stability issues I had recently.

I may know more once I know the brand and model of PSU you have, but for now try the power management settings. If that doesn't work, I would take in the PSU and GPU to get tested.

That 970 draws about 300W at full load, whereas no GPU won't draw anything (obviously) and the 650 is limited to 75W. It's entirely possible it's the PSU.
 

r3g3n24

I have a Corsair CX-750M, which was purchased with the GPU. So after all, that 11.880V might be the cause of it? How would that link in with the increased running time after idling? Also, if that's the case, is there any temporary fix? I've also tried lowering the power settings (including voltage) in Asus GPU Tweak 2, and the crash still happened even at 60% power.
 

EpIckFa1LJoN

The CX series is known to not be very reliable. And if it's between the cheap @ss PSU you have and the high-end GPU, I'm putting my money on the cheap PSU being bad. The other theory is the GPU being bad. So take your pick, or go get both tested.

I personally will never put another Corsair PSU in my system, and I have more of their stuff in my rig than anything... I have 5 ML140 Pro fans, a 750D Case, Sabre RGB mouse, K95 Keyboard, MM800 Polaris mouse pad, VOID headset, Dominator Platinum RAM, and an H115i AIO CPU Cooler. I love their stuff but their PSUs are absolute trash lately, not to mention overpriced.

I had an AX750 from them back in 2013. It worked fine but was the loudest thing in my rig and ran hot. I replaced it with my Seasonic Prime Titanium and the whole rig dropped a few degrees Celsius and was quiet as a mouse. I can't imagine the CX series being any better.
 

blocksmith

Prominent
Dec 20, 2017
4
0
510
I'm having the exact same issue and have done a complete fresh Windows install on a formatted SSD to rule out any drivers causing the issue. The problem was still there. I went into my event log and it said something about nvlddmkm and there being an error with it, and then an Intel kernel power error from me hitting the power switch on my PSU to restart the system. The screen just goes black as soon as the GPU is put under load, with no response whatsoever. My PSU is an EVGA 120-G1-0750-RX 80 PLUS GOLD 750W Fully Modular, and I have the ability to test my GPU with another power supply and system entirely. I plan on doing this tonight, but if there was ever a fix for this I'd love to know.
 

r3g3n24



nope, I still have my issue
 