[SOLVED] Disappearing AMD GPU driver and random crashes even after removing newly added (possibly faulty) RAM and reinstalling Windows

Pirulax

Reputable
Aug 7, 2017
9
0
4,510
Hello!
As the title says, the GPU driver keeps disappearing after each reboot. The driver installs successfully on a freshly installed Windows and works as expected, but after a reboot it "disappears": I can still find the control panel in Windows Search, but when I launch it I'm greeted with the same error as seen here.

The story goes as follows:
I got some used RAM from a reputable seller I've bought a few things from before. I installed the RAM and, as expected, it didn't work @ XMP 3200, so I tried lowering it to 2933 (the other XMP profile, basically), and it still failed to even boot. I kept lowering the frequency until around 2800/2666, when it finally became stable.. or so I thought. Over the following month (up to this point) I had numerous random BSODs, all related to memory (e.g. MEMORY_MANAGEMENT, IRQL_NOT_LESS_OR_EQUAL, and similar). In the meantime I've tried the following:
  • Raise SoC voltage to 1.1V
  • Set ProcODT to 60 Ohm
  • Set DRAM frequency to 2400 MHz and voltage manually to 1.4 V (which is a little higher than the XMP voltage)
  • Swapped the memory sticks around after I realized I had mixed the two kits in the same channel (I thought the slots followed the pattern ABAB; in reality they're AABB) - It helped marginally - I was now able to boot @ 2933 MHz (XMP), but the system was still unstable. Yesterday it crashed and I couldn't even get Windows to boot, so I had to reinstall it
  • Upgraded BIOS to the latest version - I did this at the same time as the above - Windows booted normally
  • Loaded optimized defaults (which I believe is the same as taking out the CMOS battery)
  • Ran sfc /scannow
  • Took out the "new" sticks and left the old known-good ones in - Still having issues, just as often as before..
  • Reinstalled Windows 4 times - Each time I had the same issue

Things to note:
  • I have one HDD that has 0% health according to HD Sentinel - Although it seemingly works without any issues
  • The previous installation of Windows was an older version (I believe version 2004) and didn't have driver issues
  • The GPU doesn't show any signs of dying - No random artifacts, nothing
  • As mentioned, the GPU driver installs successfully, but after a restart I get this.
  • PC was working perfectly for the past year before installing the new memory - I had 2 BSODs at most, memory was running @ 3200 MHz (XMP) no issues
  • New memory kit is the same type, although different memory ICs (Hynix vs "Kingston" - Even though Kingston itself doesn't manufacture DRAM)

Specs:
  • Chieftec 750W - Working fine since I bought it used a few years back
  • Ryzen 5 1600AF
  • Tomahawk MAX (Updated to latest BIOS)
  • 2*8GB HyperX Predator (The "new" kit was the same, 2x8GB paired, so in total I had 32GB)
  • Sapphire RX 570

Drives:
  • Kingston A2000 500GB (System SSD)
  • WD Green 240GB (For VMs and stuff)
  • Seagate 4TB
  • Toshiba 500GB - The one that has 0% health left - But again, before installing RAM it didn't cause issues..
  • Toshiba 1TB

Thank you very much for the help in advance!
 

Colif

Win 11 Master
Moderator
You could try running this on the HDD that showed as 0%. Western Digital bought Toshiba drives, so it might see it - https://support.wdc.com/downloads.aspx?p=3&lang=en&i

It seems unlikely you got new bad RAM, but it's always possible.
Have you run memtest?
Try running memtest86 on each of your RAM sticks, one stick at a time, for up to 4 passes. The only error count you want is 0; anything higher could be the cause of the BSODs. Remove/replace RAM sticks with errors.
Memtest runs from a bootable USB, so you don't need Windows to run it.
 

Pirulax

I ran memtest for an hour or two when I installed the memory, and it showed 0 errors.

But something just occurred to me - I had mounted an air cooler with a kind of home-made mount.. It might have applied too much mounting pressure. Is it possible that this damaged the CPU/motherboard? Or that all these issues are caused by incorrect mounting pressure? (I now have the stock cooler installed)
Also, when I installed the memory I had to turn the cooler around (that is, the clip is now on the opposite side).. I'll try reseating the CPU, maybe that'll help.
 

Satan-IR

Splendid
Ambassador
Not sure about PSU quality, and there are quite a few drives connected, one of which is iffy. I would unplug the iffy HDD (SATA power and data) and see how the system behaves.

Are you sure that the 0% health drive is not causing the issue? Read its SMART data with CrystalDiskInfo or HD Tune as well and see if they show something. Take a screenshot, upload it to imgur.com, and post the links.

RAM slots might have been damaged between all the swaps? Are they clean inside? No dust or debris and such in the slots?

Also, mixing RAM that didn't come as a "kit" in one package is a gamble.

Do you have any minidumps dropped when BSODs happen?
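
If Windows still boots, a quick way to check is to list the dump files. Below is a minimal Python sketch, assuming the dumps are in the default small-memory-dump folder C:\Windows\Minidump; the function name `list_minidumps` is just an illustrative helper, not part of any tool mentioned in this thread. Keep in mind that a clean reinstall wipes this folder.

```python
from datetime import datetime
from pathlib import Path

# Assumption: Windows writes small memory dumps to %SystemRoot%\Minidump by default.
DEFAULT_MINIDUMP_DIR = Path(r"C:\Windows\Minidump")


def list_minidumps(dump_dir=DEFAULT_MINIDUMP_DIR):
    """Return (file name, size in bytes, modified time) per .dmp file, newest first."""
    dump_dir = Path(dump_dir)
    if not dump_dir.is_dir():
        # No folder usually means no dumps were written (or they were wiped).
        return []
    dumps = []
    for path in dump_dir.glob("*.dmp"):
        stat = path.stat()
        dumps.append((path.name, stat.st_size, datetime.fromtimestamp(stat.st_mtime)))
    # Sort by modification time so the most recent crash comes first.
    dumps.sort(key=lambda entry: entry[2], reverse=True)
    return dumps


if __name__ == "__main__":
    for name, size, modified in list_minidumps():
        print(f"{modified:%Y-%m-%d %H:%M}  {size:>10} bytes  {name}")
```

If the list is empty, either no dump was written or the folder was cleared; reading the folder may also need an elevated prompt.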

Yes, too much or uneven pressure can bend omse of the CPU socket pins, and after that contact might become intermittent. Sometimes a RAM slot becomes tricky and even the installed RAM is not recognized properly although it is working. Bent pins can cause issues and crashes.

When you reseat the CPU, carefully take it out of the socket and check whether you can see any bent pins. You can also take a few high-resolution pictures of the pins from different angles, upload them to imgur.com, and post the links here so we can see how the pins are.
 

Pirulax

Thank you for the reply.

Satan-IR said: Not sure about PSU quality, and there are quite a few drives connected, one of which is iffy. I would unplug the iffy HDD (SATA power and data) and see how the system behaves.
The issues started happening when I installed the second kit of memory, so I think the PSU and GPU can be safely ruled out.

I took the CPU out, saw no bent pins (as expected, since it hadn't been taken out since it was installed), re-seated it, and re-installed the stock cooler.

Satan-IR said: RAM slots might have been damaged between all the swaps? Are they clean inside? No dust or debris and such in the slots?
I knew mixing memory is a gamble, but I was expecting it to work (I did it in the past, and it worked without issues). The worst-case scenario I expected was having to lower the frequency.. Well... How wrong I was...
There's no debris/dust inside the slots, they're clean.

Satan-IR said: Do you have any minidumps dropped when BSODs happen?
Most likely yes, but I re-installed Windows 10 times today, so they're all lost.

Satan-IR said: Yes too much or uneven pressure can bend omse of the CPU socket pins and after that contact might become intermittent
"omse", I assume you meant "most"? So, don't get me wrong - I installed the CPU, dropped it in place, and closed the latch, all without issues. Then I installed an air cooler using the old-school AM3 click mount. As you might imagine, it's quite a hacky way, but it worked for a year (until I installed the RAM, and that's when all these issues began).
So I didn't bend any of the pins or anything like that (especially considering that the PC was working fine for a year with that setup).

I've now managed to fix the driver issue, here's how:
I didn't mention that yesterday my Windows got corrupted. I assumed it was because of the RAM, but it seems like it was actually my NVMe SSD. Here's what I did:
  • Run AIDA stress test - Failed -> Raised SoC voltage to 1.1V -> Passed for 25 minutes (I stopped it)
  • Take out NVMe SSD
  • Start installing Windows on SATA SSD
  • BSOD - Memory Management - Here I took out 1 stick of RAM (now I have a single 8gb module in... it's painful)
  • Finish Installing Windows
  • Restart a few times to see if GPU driver stays - It did..
  • Update GPU driver
  • Profit

A few questions still remain:
  • Was it my NVMe SSD all along? The BSODs started happening when I installed the new RAM.. Is it possible that the faulty RAM somehow also damaged my NVMe SSD? I doubt it.
  • Why am I getting BSODs even with known-good sticks of memory? I raised SoC voltage to 1.1 V and ProcODT to 60 Ohm, but I don't think either value degraded/damaged the CPU.. Now the RAM won't work properly unless SoC voltage is raised above the BIOS default (which is 0.88 V).. Maybe I have to raise RAM voltage to 1.35 V as well? (Even though I'd expect it to run 2400 MHz @ 1.25 V without issues)
  • Maybe my MB/CPU got damaged by incorrect mounting pressure? I had to re-install the cooler when I installed the RAM (because it blocks the RAM slots) - Doubt it TBH... I didn't force it too much

What I'll do:
  • Put in all sticks of memory and run memtest for 8 hours (while I'm in school.. finally the time spent in school is useful)
  • Run HD Sentinel on the NVMe SSD - For whatever reason, if I install Windows on it, the GPU driver just "disappears".. I also get BSOD 0x21 after running DDU.. Maybe the CPU's PCIe controller got damaged somehow? That would explain why the GPU driver kept failing... but it still wouldn't explain why removing the SSD helped.. it's probably the SSD


Sorry for the long text, but I had to write it all down; perhaps someone had/has the same issue, and it will help. (Honestly, I don't wish this issue on anyone.. it's so stressful to fix.. I reinstalled Windows at least 10 times over the past day)

Tl;dr; It seems that my NVMe SSD was causing the driver issue... Now I've installed Windows on the SATA SSD, and it works.... The old known-good RAM caused BSODs, so now I'm kinda confident it's the CPU or somehow the MB that's giving up on me... I honestly doubt the RAM kit that was working broke over the last month since I installed the new RAM.

Sorry for any typos, I'm tired.
 

Edit:
I tried installing back the second 8gb "known-good" module, and the GPU driver refused to load, yet again... Then I removed the module, and it loads now. Interesting.
 

Satan-IR

Pirulax said: Tl;dr; It seems that my NVMe SSD was causing the driver issue... Now I installed Windows on the SATA SSD, and it works.... Old known-good RAM caused BSODs, so now I'm kinda confident it's the CPU or somehow the MB that's giving up on me... I honestly doubt the RAM kits

Edit:
I tried installing back the second 8gb "known-good" module, and the GPU driver refused to load, yet again... Then I removed the module, and it loads now. Interesting.

As said above, and as I said before, mixing RAM is a gamble: it can cause all manner of weird errors and glitches that are difficult to trace and troubleshoot.

Also, if the "known-good" stick causes the issue and removing it resolves it, the RAM might have gone bad somewhere between all the swaps. You can try cleaning the pins on the RAM modules/sticks with an ordinary soft, non-abrasive pencil eraser.

Definitely do what Colif said. You said you "ran memtest for an hour or two". Do you mean you didn't let it finish testing on the default setting? If so, set it up as Colif said above and let it finish. It might take a few hours, so you can run it overnight.
 
Solution

Pirulax

Hello!

So it turns out, as you speculated, that it was the "known-good" memory that somehow went bad, but I doubt it was because of the swapping; it had probably gone bad even before I started swapping it.
I've never seen such a weird issue.. The GPU driver doesn't load because of faulty RAM...
But, just to remind you, the "known-good" memory was actually working just fine before I accidentally mixed it in the same channel with the other kit, so I'm not sure what happened.
Either way, thank you all. I'll mark your reply as the answer, as you got it right (even though I figured it out before your reply), and thank you all very much for taking the time to try and help me!