[SOLVED] Nvidia Drivers are Crashing my Computer. Desperate for assistance

Status
Not open for further replies.

Bholt91

Distinguished
Dec 26, 2012
8
2
18,515
First, let me apologize: I don't post here often, so please let me know if this is the wrong section and I need to move the thread.

The System:
I built a computer about a decade ago and have been incrementally upgrading it since. The most recent upgrade was back in March, when I replaced an old i5 4690K with one of the new Ryzen CPUs. That of course also meant a new motherboard and RAM, and I threw in an NVMe SSD for fun. Since March the computer has run just fine, with none of the issues described below, until about 3 weeks ago.

PCPartPicker Part List

CPU: AMD Ryzen 7 3700X 3.6 GHz 8-Core Processor ($289.99 @ Amazon)
CPU Cooler: Noctua NH-D15 82.5 CFM CPU Cooler ($89.95 @ Amazon)
Motherboard: Asus TUF GAMING X570-PLUS (WI-FI) ATX AM4 Motherboard ($189.99 @ B&H)
Memory: G.Skill Ripjaws V 16 GB (2 x 8 GB) DDR4-3600 CL16 Memory ($76.98 @ Amazon)
Storage: Samsung 860 Evo 500 GB 2.5" Solid State Drive ($69.98 @ Amazon)
Storage: Samsung 860 Evo 1 TB 2.5" Solid State Drive ($139.93 @ Amazon)
Storage: Crucial P1 1 TB M.2-2280 NVME Solid State Drive ($104.99 @ Adorama)
Video Card: MSI GeForce GTX 1070 8 GB Video Card ($639.00 @ Amazon)
Case: Fractal Design Define R5 ATX Mid Tower Case ($123.99 @ Amazon)
Power Supply: SeaSonic 750 W 80+ Gold Certified Fully Modular ATX Power Supply
Total: $1724.80
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2020-09-01 13:21 EDT-0400


The Issue and Symptoms:
So, whenever I have Nvidia drivers installed, my computer will crash and reboot intermittently and frequently. There have been some days where it plays fine all day long only for it to then start crashing. Before the crash and reboot the gpu drivers will crash and recover, so my screen will go black, and then I'll see some strange artifacting. If they recover then it will be caught in a loop with video drivers crashing about every 5 to 10 seconds before my pc just suddenly powers off and reboots. I do not get anything significant in Event Viewer other than Event ID 41 which only says the system suddenly lost power. Every now and then I've gotten a blue screen that says Internal video scheduler error.

Once it reboots for the first time, it will struggle hard after that, frequently crashing video drivers and rebooting immediately after logging into Windows. After a few crash and reboot loops, it will even have trouble POSTing and just sit on a black screen until I force power off with the power button.

Nothing in particular seems to trigger the crashing, though it seems to happen most when playing a game, watching YouTube videos, or just browsing the web. Stress testing and putting the hardware under load does not seem to trigger the issue.

Uninstalling video drivers makes all symptoms vanish.
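
The pattern described above (quiet days, then loops of reboots) is easier to see if the Event ID 41 timestamps are grouped into episodes. Below is a minimal Python sketch of that grouping. The timestamps are made up for illustration and are not taken from this system, and the 10-minute threshold and the function name `group_into_episodes` are arbitrary assumptions.

```python
from datetime import datetime, timedelta

def group_into_episodes(timestamps, max_gap=timedelta(minutes=10)):
    """Group crash times into episodes; a new episode starts after a quiet gap."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    episodes = []
    for t in times:
        if episodes and t - episodes[-1][-1] <= max_gap:
            episodes[-1].append(t)   # still inside the current crash loop
        else:
            episodes.append([t])     # quiet period ended, start a new episode
    return episodes

# Hypothetical Event ID 41 timestamps, invented for this example
sample = [
    "2020-08-30T19:02:10",
    "2020-08-30T19:05:41",
    "2020-08-30T19:09:03",
    "2020-08-31T08:15:27",
]

for episode in group_into_episodes(sample):
    print(episode[0].date(), len(episode), "reboot(s)")
```

With real data, the timestamps would have to be copied or exported from Event Viewer first; that step is not shown here.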

Troubleshooting I've Done:
For now, let me say that I am almost certain this must be a software-related issue. I have exhausted myself for the past 3 weeks troubleshooting hardware, and I've found it all to be good. But I am happy to take other suggestions and willing to try anything to get this to work. What makes me further believe this issue stems from Nvidia software is that I swapped GPUs with my brother. He is currently using my GTX 1070 and I am working with his AMD RX 570. The 1070 works fine in his system and the 570 works fine in mine. No more crashing, no more rebooting (at least not yet; I've been using it for about 24 hours now).

Now for the things I've done:
  • Uninstalled video drivers with DDU
    • Rolled back up to 10 versions of video drivers, some of them choosing a clean install with the driver only / no geforce experience
  • Clean install of Windows (several times, on multiple drives)
    • Rolled back to Windows 10 Version 1909 as I read online that this was more stable
  • Replaced PSU with a known working one, both work fine in other systems and the issue persists here.
  • Replaced GPU with a GTX 1050. Both the 1070 and 1050 work in other systems, but the issue persists here, and I can recreate the same crashing with the 1050, as it also uses Nvidia's drivers.
    • As mentioned above, I also swapped my 1070 for my brother's RX 570. I'm currently using it and the issue seems to be resolved. However, the 1070 is fine; it works well in other systems. This leads me to further believe the issue is the Nvidia software not agreeing with my system in some way.
  • Replaced RAM with known working sticks. Also ran memtest for several passes as well as trying 1 stick in each slot. All of my memory modules work fine in another system.
  • Replaced the motherboard completely, as my last-ditch effort was to blame a faulty motherboard. This did not resolve the issue.
  • Replaced motherboard/CPU/RAM with my old i5 and DDR3 RAM from before my upgrade in March. This resolved the issue: the PSU and GPU work happily with my old components, just as they did for the past few years.
  • Symptoms do not occur under load. Prime95, furmark, occt do not cause the issues and cpu and gpu are not overheating.
  • Disabled Vsync and Gsync, and alternated between different power management options.
Brief Conclusions:
I've tried quite a few things in the past few weeks, so there may be more I need to add to the list above, but this is what comes to mind for now. The only thing I have not truly been able to troubleshoot is the 3700X, as I do not have another AM4 CPU. However, it seems to be working flawlessly right now with the RX 570 installed, so it does not seem to be a faulty CPU, and all other PC components work just fine in another system. The only thing I feel I can be certain of is that the issue is coming from the Nvidia drivers. It's also just mind-boggling to me that this only happens while the drivers are installed, and after all the hardware troubleshooting I've done, everything screams at me that it's a purely software-related issue.
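
The elimination reasoning above can be sketched as a small set calculation in Python. The run list is an approximate reading of the swaps described in this post: which card, PSU, and RAM kit sat in the machine during each individual swap is an assumption, and the labels ("PSU A", "X570 board B", and so on) are shorthand invented for the example.

```python
# Each run: (set of parts in the machine, did it crash?)
# Assumption: only one part was changed per run; labels are illustrative.
runs = [
    ({"3700X", "X570 board A", "DDR4 kit A", "GTX 1070", "PSU A", "Nvidia driver"}, True),
    ({"3700X", "X570 board A", "DDR4 kit A", "GTX 1070", "PSU B", "Nvidia driver"}, True),
    ({"3700X", "X570 board A", "DDR4 kit A", "GTX 1050", "PSU A", "Nvidia driver"}, True),
    ({"3700X", "X570 board A", "DDR4 kit B", "GTX 1070", "PSU A", "Nvidia driver"}, True),
    ({"3700X", "X570 board B", "DDR4 kit A", "GTX 1070", "PSU A", "Nvidia driver"}, True),
    ({"3700X", "X570 board A", "DDR4 kit A", "RX 570", "PSU A", "AMD driver"}, False),
    ({"i5-4690K", "old board", "DDR3", "GTX 1070", "PSU A", "Nvidia driver"}, False),
]

def common_to_failures(runs):
    """Parts present in every crashing configuration."""
    failing = [parts for parts, crashed in runs if crashed]
    return set.intersection(*failing) if failing else set()

def seen_working(runs):
    """Parts that appeared in at least one stable configuration."""
    passing = [parts for parts, crashed in runs if not crashed]
    return set().union(*passing)

common = common_to_failures(runs)
print("In every failing run:", sorted(common))
print("Of those, never seen in a stable run:", sorted(common - seen_working(runs)))
```

Under these assumptions, the only parts shared by every failing run are the 3700X and the Nvidia driver, and each of them has also appeared in a stable run. So the swaps alone do not single out one part; they point at the combination, and the 3700X is the one piece of hardware in that combination that was never swapped out of a failing configuration.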

I am at my wit's end and seriously debating going back to Intel.

Thanks for any insight and let me know if I can provide more information.
 
Solution

Sounds like you've been having a bit of a nightmare with the system....

A few things spring to mind - whilst the 1070 is working well in other systems (and with your older hardware), it does sound like it's only been tested when running with slower CPUs. This likely holds the GPU back a little, which would take pressure off of it, whilst with a 3700X it will be running flat out. If you have had it for a few years, the thermal paste on the GPU might have started to dry out a bit, so applying new paste and giving the cooler a good clean may help. I would also look at GPU temps whilst gaming, as it might be the flat-out 100% load that is pushing it a bit hard.

Another thing occurs to me: with a 3700X + X570 you are running the 1070 in a PCIe 4.0 capable slot. Have you tried changing the PCIe bus setting in the BIOS back to PCIe 3.0 to match the card?

Other things that can cause issues with Ryzen are memory settings / timings - it might be worth restoring the BIOS to factory defaults (which will put DDR4 speed back to base spec, 2400 MHz) and seeing if that makes the system more stable. It shouldn't make a difference but might be worth a try....
 

Bholt91

Thank you so much for the quick reply. My GPU hits a max of about 80°C, which is a little warm but seems to be normal for the 1000 and 2000 series; correct me if I'm wrong. I will crack it open and give it a clean either way, but this issue also happens while idling on the desktop, and repeatedly happens while the GPU is under no load. Also, the 1050 is rather new compared to the 1070, but I'm able to recreate the exact same issues with that card too. It's only been seemingly flawless with this AMD card so far.

Your explanation sounds very promising, but one thought that occurs to me is why hasn't this been an issue during the 5 or 6 months I've used it since the initial upgrade?

I did not think of changing the pcie settings in bios and that does sound promising, I will give that a go once I have a moment.

I did not mention in my post but resetting bios and disabling all overclocks is something I have tried already (for both motherboards)

Thanks again for the suggestions, I will give an update as soon as I have time to work on the computer again.
 
I was just trying to work out what about that system could be different to the other machine - I run a GTX 1070 on a Ryzen machine with no issue, but mine is a first gen R1600 on a B350 mobo, which only has PCIe 3.0 support and probably doesn't push the card so much.

One other thought actually - as you are on X570, have you installed the official X570 chipset drivers from AMD? I know with my B350 mobo I had problems with the default chipset drivers installed by Windows 10...
 

Bholt91

I have installed the AMD chipset drivers; they were on the support page for my motherboard. I also downloaded them again and reinstalled to be doubly sure.

So just now I set my pcie slot to Gen 3 in bios. Used DDU to remove AMD drivers, dropped the 1050 back in (my brother is still happily chugging along with the 1070). I reinstalled the latest Nvidia drivers, rebooted, and was met with gpu driver crashing almost immediately after signing in, before even opening any application. My issues still persist =[.
 

Bholt91

I may have a working solution... but it's not ideal, a band-aid at best. I would still like to discover what's going on here.

I put my GTX cards in the second PCIe slot, and everything works now, though running at x4 instead of full speed. I'm starting to see the pain in the early adoption of the X570 boards. Maybe there are further BIOS settings I could change to resolve this, but I'm ignorant of that kind of extensive configuration and not sure where to begin.
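
For anyone wanting to confirm what link a card has actually negotiated in a given slot, here is a rough Python sketch that reads it from `nvidia-smi`. The query fields (`pcie.link.gen.current`, `pcie.link.width.current`, and their `.max` counterparts) are standard `nvidia-smi --query-gpu` properties; the helper names and the sample line are assumptions made up for this example, and the parser assumes every field comes back as a plain integer.

```python
import subprocess

QUERY = ("pcie.link.gen.current,pcie.link.gen.max,"
         "pcie.link.width.current,pcie.link.width.max")

def parse_link_line(line):
    """Parse one CSV line such as '3, 3, 4, 16' into a dict of ints."""
    gen_cur, gen_max, width_cur, width_max = (int(v.strip()) for v in line.split(","))
    return {"gen": gen_cur, "gen_max": gen_max,
            "width": width_cur, "width_max": width_max}

def is_narrowed(link):
    """True if the negotiated lane count is below what the card supports."""
    return link["width"] < link["width_max"]

def read_links():
    """Query every Nvidia GPU; requires nvidia-smi on PATH (not called here)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_link_line(line) for line in out.splitlines() if line.strip()]

# Hypothetical output line for a card sitting in an x4 slot
sample = parse_link_line("3, 3, 4, 16")
print(f"Gen {sample['gen']} x{sample['width']} "
      f"(card supports up to x{sample['width_max']}), narrowed: {is_narrowed(sample)}")
```

On a machine with the Nvidia driver installed, `read_links()` would return one entry per GPU. The reported current generation can drop while the card is idle, so it is probably best read under some load.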

Small recap:
So I know all 3 of the graphics cards I've used are good, though I do not know much about the RX 570. The two Nvidia cards fail in slot 1, but only when video drivers are installed; I'm assuming this is because the cards need drivers to make full use of the x16 slot. The AMD card has no issues in the first slot, at least not in the 24 hours of stress testing I did with it, whereas the Nvidia cards start lagging and crashing almost right away. If the RX 570 works fine in this slot, wouldn't that tell me the PCIe slot has not gone bad?
I've used the 1070 for a good 5 months before the issue started to occur, and given the other hardware testing I did, I felt like it came down to either the motherboard or cpu being bad, and out of the two I figured the motherboard would be the most likely thing.

Still hoping for a happy software solution. I want to believe that my 3700X is fine if it's got no issues under load and with another GPU in the first slot.

Again, appreciate any help or insight I can get on this.

Edit and Update: So this temporary fix isn't a fix. Just this morning I've crashed and rebooted twice, though it's definitely far less frequent than when the cards were in the first PCIe slot.

Another Update: Marking this one as resolved. As sad as it is, my cpu is faulty, almost positively.

This morning the crashing was happening consistently, more so than usual, and after a bit of troubleshooting the computer failed to POST altogether. It would struggle with POSTing and suddenly reboot even during POST, and it gave several false positives: the error codes kept pointing towards memory, but the memory is still good.

Now the motherboard is consistently pointing at the CPU as the fault, and it does this in both X570 boards.

A sad story indeed.
 

Bholt91

Yeah it does sound like the CPU - as it's fairly new can you get a replacement via RMA?

Yes, I am working with AMD support now to send it in. I'm hoping it will be little trouble.

I've never had a cpu that would up and die on me when it appeared to be fine the day I got it. Since upgrading to the motherboard and ryzen I had not opened or configured my computer in any way other than the default xmp overclock for my memory, I saw no reason to try and overclock the cpu itself.

I hope you don't mind if I ask you one last inane question. Is there anything I may have done to cause this? Again, no overclocks, temps have always been good, and the case hadn't been opened at all until the system instability started. My power flickered a couple of times a few weeks prior, but my system is on a surge protector. I don't see any way of avoiding unpredictable power outages. I want to prevent it, but I guess it's just bad luck.
 
There is nothing that you have described that jumps out at me as a cause. I have had faulty new components over the years so it can happen. Hopefully AMD will sort you out a replacement quickly.
 