BSODs, Shutdowns, and Driver Failures after attempting a GPU overclock. Please Help!

MikeIzEpic

Reputable
Nov 15, 2015
48
0
4,540
A bit lengthy of a post, but I'm in desperate need of help.

So yesterday I decided to give overclocking a shot on my EVGA GTX 980 SC ACX 2.0 GPU, after seeing Paul's Hardware do it. I started with a modest overclock and ran Firestrike. No issues. Increased the clock. No issues. Increased again up to 80 MHz, crash. Decreased to 70 MHz and it worked like a charm. I ran it a few times just to make sure it worked, and I had no issues.

I then booted up AC Unity, excited to see the 2 extra FPS I'd gain. It ran stable for about 10 minutes, and then the drivers crashed. I lowered the clock to 60 MHz and ran AC Unity again. It ran stable for about 10 minutes, then BSOD (Error kernel_security_check_failure).

Frustrated, I ditched the overclock altogether. The game again ran for about 10 minutes, then BSOD (I forgot the error, but I know it wasn't kernel_security_check_failure). I then reinstalled the drivers and tried again. About 10 minutes in, the drivers failed.

This was last night, and after the last driver crash I gave up for the night. Come today, I contacted EVGA about the issue, and was told there could be 3 potential issues:

1. Power Issues
2. Corrupted Drivers / Software Issues
3. Issues with the GPU itself

I then went on a spree of troubleshooting, trying to get this sorted out. Going in order of potential issues:

1. I'm not 100% sure how to check voltages. The guy from EVGA said to look for the 12 V reading in the BIOS, which I managed to find, but there was no fluctuation whatsoever in the 5 minutes I stared at it. No issue here maybe? I still consider this a possible cause.

2. I uninstalled the GPU entirely from my Device Manager, downloaded DDU and ran it in Safe Mode to ensure everything was wiped, booted back into Windows, and downloaded Nvidia Driver 362.00 (the newer ones are causing issues for multiple people, as I'm sure you all know). The problem still persisted after doing all of this, so it can't be the drivers.

3. I don't have another system to test the card in, so I can't be positive if it is the card. This would be the easiest solution, but is not possible for me to do.
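
For reference on the 12 V check in step 1: the ATX spec allows ±5% on the +12 V rail, so a reading between 11.40 V and 12.60 V is nominally in range. Below is a rough Python sketch of that comparison. The function name is made up for illustration, and a steady idle reading in the BIOS says little about how the rail behaves under gaming load, so passing this check would not rule out the PSU.

```python
# Sketch only: compare a +12 V reading against the ATX +/-5% tolerance.
# rail_in_spec is a hypothetical helper name, not part of any tool.

LOW_12V = 11.40   # 12 V minus 5%
HIGH_12V = 12.60  # 12 V plus 5%


def rail_in_spec(reading_volts):
    """Return True if the reading is inside the ATX +12 V tolerance band."""
    return LOW_12V <= reading_volts <= HIGH_12V


# Example: a value read off the BIOS hardware monitor page.
print(rail_in_spec(12.1))
```

A reading that sags toward the lower limit only while a game is running would be more telling than anything seen at idle, which is why software monitoring under load is the more useful test.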


After doing all I could, I ran GPUTest (the stress test with the giant donut-looking thing) at 2560x1080 (my resolution), in fullscreen with AA on MSAA 8x. This ran for close to 5 hours with absolutely 0 problems. I also attempted to use OC Scanner from EVGA, but launching the stress test just made my screen go black with the overlay showing GPU usage and temperature. Not sure what's up with that.

Feeling confident that clean installing the drivers had fixed it, I launched AC Unity, and within 10 minutes my screen went black, it displayed the "No Signal" message, and something new happened: the LEDs on both my KB/M went out. (Both are plugged into my MOBO's (Gigabyte 970A-DS3P) onboard USB ports, if that means anything to the situation.) This led me to think maybe it's the MOBO or PSU (Antec 750M HCG 80+ Bronze) causing the issue. But then I realized the system itself was still running. The lights in the system were still on, the fans were spinning, everything was working just fine. Just the monitor was black, and the KB/M LEDs were out. After about 20 seconds, it went back to the login screen as if the computer had restarted, except it never lost power.

That got me thinking, the only game I had tested was AC Unity, maybe that was the issue? lol no. AC Syndicate ran for 10 minutes, then I got a BSOD (Error Memory_Management)

I don't know for sure what the issue could be at this point. If I had to guess, I'd say either the GPU, MOBO, or PSU, which narrows it down a bit but doesn't make things easy. Another thought was that maybe the PC isn't getting enough power from the wall outlet, but I'm not an electrician, so I don't know. Is there any way I could test the parts without another system on hand? Any kind of software monitoring or stress tests or whatever would be great. I just don't know at this point. Every game I've tried crashes or BSODs within 10 minutes, but a full-on stress test had perfect results in the 5 hours it was running. Any help would be appreciated.
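
On the software-monitoring question, one option is to log GPU readings to a file while the game runs, so the last samples before a crash are still there afterwards. The sketch below is a minimal Python example that polls `nvidia-smi` (installed with the Nvidia driver) once per second. The file name `gpu_log.csv`, the one-second interval, and the function names are assumptions for illustration, and some GeForce cards report `[Not Supported]` for fields such as `power.draw`, so values are kept as plain text.

```python
import csv
import os
import subprocess
import time

# Query fields passed to nvidia-smi's --query-gpu option.
FIELDS = ["timestamp", "temperature.gpu", "utilization.gpu",
          "clocks.gr", "clocks.mem", "power.draw"]


def parse_sample(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def read_sample():
    """Run nvidia-smi once and return the readings for the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append one row per interval until interrupted (Ctrl+C) or a crash."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:          # new file: write the header row first
            writer.writeheader()
        while True:
            writer.writerow(read_sample())
            f.flush()
            os.fsync(f.fileno())   # push each row to disk before a BSOD can lose it
            time.sleep(interval_s)
```

Calling `log_forever()` from a console before launching the game, then checking the last few rows after the crash, would show whether temperature, clocks, or load looked unusual right before it happened. It would not by itself identify which part is faulty.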

List of System Specs:

FX-8350 | No OC
CM Hyper 212 EVO
Gigabyte 970A-DS3P
16GB DDR3
EVGA GTX 980 SC ACX 2.0
3 TB WD Blue + 120 GB Sandisk SSD Plus
Antec 750M HCG 80+ Bronze PSU

Please help!
 

MikeIzEpic



About 20 minutes ago I actually figured out it was the GPU. I swapped in an old 960, ran everything I was trying to run with the 980 and the 960 had no issues at all. Going to RMA the card tomorrow.
 
When changing hardware, be sure to go into the BIOS and reset it to defaults. This forces the system to rescan and reassign hardware resources, and the updated resource database is then passed to Windows. If you do not reset the BIOS, Windows may get an old database and be blocked from using key hardware resources when it detects the new hardware.