Question: 13-month-old system becoming quite unstable but I can't figure out why?

Kirbyarm

Distinguished
Nov 9, 2013
298
9
18,785
Around November 2022, I assembled a new Windows PC with its primary intent being for gaming and general consumer use. Up until very recently, I can't think of any significant issues it had leading up to this. On December 29th 2023, I had a rather nasty BSOD that seemed to occur when I was playing a video game. It feels like this is about when system stability started to rapidly decline.

122923-8140-01.dmp 12/29/2023 2:18:56 PM ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY 0x000000fc ffffd085`260db010 8a000010`470009e3 ffff8e80`631668c0 00000000`00000003 nvlddmkm.sys nvlddmkm.sys+9af903 x64 ntoskrnl.exe+3fbca0 C:\Windows\Minidump\122923-8140-01.dmp 32 15 19041 7,301,892 12/29/2023 2:19:41 PM

Upon restarting after this BSOD it struggled for a good 30-45 minutes with reboots and attempted self-auto repairs and disk checks etc. before Windows 10 finally booted back to the desktop. Not long after that, BSOD. Since then, on that installation of Windows, it BSOD'd one other time that I can see/recall according to BlueScreenView.

122923-6234-01.dmp 12/29/2023 2:19:55 PM SYSTEM_SERVICE_EXCEPTION 0x0000003b 00000000`c0000005 fffff801`5d6142b8 fffff88a`9b496ce0 00000000`00000000 ntoskrnl.exe ntoskrnl.exe+3fbca0 NT Kernel & System Microsoft® Windows® Operating System Microsoft Corporation 10.0.19041.3803 (WinBuild.160101.0800) x64 ntoskrnl.exe+3fbca0 C:\Windows\Minidump\122923-6234-01.dmp 32 15 19041 3,467,684 12/29/2023 2:27:13 PM

123023-6531-01.dmp 12/30/2023 9:37:03 AM DRIVER_IRQL_NOT_LESS_OR_EQUAL 0x000000d1 0000000f`e8fe7050 00000000`000000ff 00000000`00000000 fffff805`c21e1912 hiber_storport.sys hiber_storport.sys+1912 x64 ntoskrnl.exe+3fbca0 C:\Windows\Minidump\123023-6531-01.dmp 32 15 19041 124,416 12/30/2023 9:38:54 AM

But more or less, things didn't seem completely broken and were functioning okay as far as I could tell, so I didn't think too much of it. The only thing I really noticed was that whenever I woke up in the days following it, my Discord had mysteriously closed/crashed (I'm usually hanging out in a game stream call with some friends in the background around the clock, whether I'm present or not). As of January 1st/2nd ish, I noticed Discord sometimes going into a crash loop and then refusing to open; even if I removed the program and re-installed it fresh, it still wouldn't open. Rebooting the entire PC usually fixed this, but only very temporarily. Then I started noticing tabs on 3 different browsers randomly crashing. I have a large number of tabs open almost all the time, but never had major issues with how I do things across several systems over the last decade. Then Steam started to randomly crash a few times a day (no window freeze, no error pop-up, it just straight up disappears instantly to desktop, like almost everything else that's crashing). Now half my Steam games won't launch, and the other half crash if Steam has any issues (like its frequent crashing). I tried to watch some local video files (anime) to de-stress from all this nonsense that's going on with my PC, and VLC crashes within seconds to maybe a minute or two, tops.

So at this point the PC appears to be critically unstable for handling a large percentage of the intended purposes of the machine, and I'm absolutely clueless as to what the culprit can be after trying some things to remedy the issue or at least diagnose the problem.

Here are the PC's specs:
-Windows 10 Pro Build 19045
-Core i9-13900K
-MPG Z790 EDGE WIFI (MS-7D91)
-G.SKILL Ripjaws S5 Series (Intel XMP 3.0) DDR5 RAM 64GB (2x32GB) 5600MT/s CL36-36-36-89 1.25V
-Nvidia RTX 3080 Ti
-Samsung 970 EVO Plus 2TB NVMe M.2 Internal SSD
-Seasonic PRIME TX-1000, 1000W 80+ Titanium, Full Modular
-Noctua NH-D14

I do not touch any kind of overclocking, unless you count using XMP Profile in BIOS, but that is all.

The things I tried in an approximate order (assume I rebooted PC between each of these at least once):
-Updated GPU driver
-Updated other drivers (chipset etc)
-Performed a Windows Update (I like to do these manually for convenience)

This is the first time the update process was jank for me and had some issues: download failures, and on the first update reboot it said it failed and showed "Undoing Changes". I tried again, and after two or three more reboots and attempts, that nice-looking up-to-date green checkmark finally popped up, with more effort than usual.

-Checked CrystalDiskInfo - it reports 100% healthy on the SSD
-Ran the Intel Processor Diagnostic Tool 64bit application, all passed
-Ran a pass of MemTest86 from a USB drive, 0 errors in the 1 pass
-Shrunk the main volume of my SSD and created a 100 GB partition
-Installed Windows 10 onto it fresh to see if issues persisted

This was by far the most jank installation of Windows I've ever encountered. Something is very wrong. The first time I selected the volume and clicked to start the installation, it got through the checkboxes and then BSOD'd. Tried again, and it worked; I eventually got to the desktop. The first thing I did was try to access the Desktop folder of the other partition to bring the pre-emptively downloaded driver files over to this new desktop. It prompted for elevated permission to give 'permanent access' to the folder, but halfway through performing this task: BSOD. It rebooted okay, I tried again, and it worked; I now had access to those files.

Most of the drivers seemed to install okay, but the main chipset driver, despite my accepting a UAC prompt, appeared to do nothing. At least I couldn't see a window pop up or any kind of installer after multiple attempts. I rebooted after a few of these driver installations as prompted. Then the final driver to try, Nvidia's: "7-Zip CRC Failed". I opened Microsoft Edge to download Firefox from their website directly. Ran the installer: "File corrupt". Re-downloaded. "File corrupt". Downloaded Discord in the same fashion, and it also failed to install for some reason or another. I tried downloading the Nvidia driver directly from the website, and Edge, a browser I never use outside of the first hour of a fresh Windows install (if I can help it), crashed on some of these Nvidia pages. I tried rebooting the PC, and still couldn't install any of these things after the reboot.

Downloaded Steam, ran the installer, and it seemed to actually work! I suspect Steam phantom-installed some system necessities in the background, because I of course tried to install Firefox and Discord again, and they actually worked. Same goes for the Nvidia driver! But.. uh-oh: already, Steam has crashed and restarted itself 3 times, and Discord 5 times.

The issue is persisting on this fresh Windows installation, which I assume rules out the operating system being the problem, which had been my suspicion up until this happened. So immediately after realizing this, I rebooted back to the other partition, as I have no need of a new installation of Windows (or so it seems). Things are depressingly unstable and I can't really do much anymore.
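
One way to probe errors like the "7-Zip CRC Failed" and "File corrupt" messages above is to hash the same file several times and compare the results. The following is only a minimal Python sketch of that idea, assuming Python is available on the machine; the function names and the 16 MiB random test file are made up for illustration. A mismatch would suggest data is being corrupted somewhere between the drive, the RAM and the CPU, while a consistent result does not prove the hardware is healthy.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_hashes(path, passes=5):
    """Hash the same file `passes` times and return the set of distinct digests."""
    return {sha256_of(path) for _ in range(passes)}


if __name__ == "__main__":
    # Hypothetical test file: 16 MiB of random bytes in a temporary location.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(16 * 1024 * 1024))
        test_path = tmp.name
    try:
        digests = repeated_hashes(test_path)
        if len(digests) == 1:
            print("consistent: every pass produced the same digest")
        else:
            print(f"MISMATCH: {len(digests)} distinct digests for one file")
    finally:
        os.remove(test_path)
```

Pointing `repeated_hashes` at a large downloaded installer instead of the temporary file would exercise the same path the failing installs took. Repeated reads are usually served from the operating system's file cache in RAM, so this mostly exercises memory and CPU rather than the SSD.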

Something has clearly happened, and I don't know what. I am at the mercy of my friends and experts viewing this post to help me further investigate and diagnose the culprit of these issues. Another $3,000-$4,000 CAD computer only a year after getting this one is really not in the budget. Please, if anyone can help with some ideas, I would appreciate you and them immensely. Thank you for your time.
 
Make sure you're on the latest BIOS for the motherboard.

If you or a friend have a SATA SSD, I would pull the M.2 and try installing Windows on that, just to rule out either a bad drive or socket.

Also head over to Microsoft and download the latest Windows 10 install to make sure you didn't have a bad install disc/USB.
 
I do not touch any kind of overclocking, unless you count using XMP Profile in BIOS, but that is all.
That is very much overclocking, and since the issues look RAM-related you should reset the RAM to defaults. If you are unlucky, the CPU's memory controller has degraded and you will have to go even below defaults.

(Also, just because you didn't touch anything doesn't mean that the mobo doesn't overclock like crazy; you should look into that as well.)
 

Kirbyarm

Make sure you're on the latest BIOS for the motherboard.

If you or a friend have a SATA SSD, I would pull the M.2 and try installing Windows on that, just to rule out either a bad drive or socket.

Also head over to Microsoft and download the latest Windows 10 install to make sure you didn't have a bad install disc/USB.
It took a while for me to gather the courage to flash the BIOS. I've always been paralyzed with fear of bricking the entire system by doing this.. so I still had the very oldest BIOS version on there. I've been relaying this thread's activity to friends, and their encouragement helped. It now has the latest BIOS, though after the BIOS update itself, it rebooted around 4 or 5 times to solid black screens before it finally booted into the BIOS. Not sure what that was about, but that was messed up.

I wasn't wise enough to check if XMP was enabled before the flash, but my friend and I are 95% sure it was. I distinctly remember doing that specific adjustment when I put the system together. After the flash it was Disabled though, so I've left it disabled and didn't touch anything aside from boot order to the windows media USB so that I could pretend to install windows but use it to delete the 100 GB partition that was preventing booting into the main partition.

For some reason since the flash, it kept forcing me into the 100 GB partition windows install, so that's what I did to resolve that. But before it would let me onto the desktop it claimed my "PIN was no longer available" to some security thing.. so I had to do a bunch of computer hopping between this rig and the server rig at the other end of the house with backup emails and verifying identity/authorization code etc.

Anyway, it finally let me back on. Some windows oddly forgot their startup positions, but apart from that everything has been in order.

As for the SSD, I've had it as a high priority now to get a second M.2 for easy backups/referencing previous installations should I need to install on a new drive for whatever reason, and there was a juicy 33% discount on newegg for a Samsung 2 TB 980 Pro M.2, so I ordered that. Should be here in a week or so. Was going to get this whether I had PC problems or not, so I'm glad I checked and there was a nice sale.

Whenever I install windows again, I will make sure to remake the media USB with the latest download from Microsoft, thanks for the reminder.

As of writing this message I have opened everything I 'normally' have open and am waiting in dreadful anticipation for any tab or program to crash. Also, the games that weren't launching before are launching again... It's been a solid 10 minutes and nothing yet. Lord, let it be this simple. PLEASE
 

Kirbyarm

That is very much overclocking, and since the issues look RAM-related you should reset the RAM to defaults. If you are unlucky, the CPU's memory controller has degraded and you will have to go even below defaults.

(Also, just because you didn't touch anything doesn't mean that the mobo doesn't overclock like crazy; you should look into that as well.)
Yes, that's correct. Apologies, I meant there was nothing I touched myself or altered in the BIOS apart from that. My poor choice of words.

As mentioned above, I left XMP Disabled after the BIOS update and everything at default. Let's see how it goes.

How would you propose I look into that? Is there a standard procedure to force the system not to overclock or boost anything?
 
How would you propose I look into that? Is there a standard procedure to force the system not to overclock or boost anything?
It's very typical for "auto" in the BIOS settings to mean the mobo maker's choice, and that usually is everything at full max. So if you go into the BIOS and all the CPU settings are set to auto, then you should take a closer look.
 
As for the SSD, I've had it as a high priority now to get a second M.2 for easy backups/referencing previous installations should I need to install on a new drive for whatever reason, and there was a juicy 33% discount on newegg for a Samsung 2 TB 980 Pro M.2, so I ordered that. Should be here in a week or so. Was going to get this whether I had PC problems or not, so I'm glad I checked and there was a nice sale.

Whenever I install windows again, I will make sure to remake the media USB with the latest download from Microsoft, thanks for the reminder.
If and when the new drive arrives and you try to install Windows on it, make sure you remove the other M.2. If Windows sees another boot loader installed on the second drive, it will use that and not install one on the new drive. If for some reason you ever format or remove the second drive, your computer will not boot, as you would have removed the boot loader from the system.

It's a good rule of thumb to only have the drive you want to install Windows on in the system, and remove or unplug all other drives.
 

Kirbyarm

Just an update. I haven't observed any browser tab crashes or program crashes as of yet. Still going strong. Really hoping it remains stable.

It's a bit saddening that I spent such good money on these components and the RAM randomly after a year needs to be defaulted. Or at least that seems to be the case. If it remains stable for a week, my friend suggested I then enable XMP and see if just the BIOS update was enough on its own. I'll do that in a few days, and if it's unstable.. I guess I'll just have to bite the bullet of less performance for a time. I'll post back if anything else crops up. Many thanks to you both!
It's very typical for "auto" in the BIOS settings to mean the mobo maker's choice, and that usually is everything at full max. So if you go into the BIOS and all the CPU settings are set to auto, then you should take a closer look.
Makes sense, thanks for explaining.
If and when the new drive arrives and you try to install Windows on it, make sure you remove the other M.2. If Windows sees another boot loader installed on the second drive, it will use that and not install one on the new drive. If for some reason you ever format or remove the second drive, your computer will not boot, as you would have removed the boot loader from the system.

It's a good rule of thumb to only have the drive you want to install Windows on in the system, and remove or unplug all other drives.
Roger that, thanks for the heads up!
 

Kirbyarm

Right! So, the problem has returned. It seemed almost perfectly stable and fine for the past few days. But, despite the new BIOS and disabling of XMP profile, programs, browser tabs and games are crashing regularly and stability is starting to decline with each passing hour once again.

The new M.2 has arrived today. Not sure if I want to start using it straight away until I can figure out what part(s) is causing these issues and get a replacement going.

Just ordered a 64 GB kit of RAM (SKU F5-6400J3239G32GX2-RS5K). It's listed on MSI's official webpage for my specific board as compatible RAM. Despite it looking eerily similar to the kit I already have installed, I'd like to try it and see if the issue persists.. but it probably won't be here for a solid week, maybe two.

Does anyone have any other advice or things to try to further diagnose the culprit with more certainty? I apologize that I'm not really set up too well for this, as I don't exactly have another LGA 1700 socket CPU lying around to test, nor any spare DDR5 memory (memory is on the way though, now).

Make sure you're on the latest BIOS for the motherboard.

If you or a friend have a SATA SSD, I would pull the M.2 and try installing Windows on that, just to rule out either a bad drive or socket.

Also head over to Microsoft and download the latest Windows 10 install to make sure you didn't have a bad install disc/USB.

I could try installing Windows on the new M.2 and pull this one... do you think that's still a good idea despite not having a SATA drive handy? By drive, do you mean the M.2 itself could be an issue, and by socket, the connector on the motherboard that the M.2 is inserted into? Should I try it in a different M.2 socket, then?