Question: PC has developed stability issues with no clear, consistent cause or behaviour

Page 2 - Seeking answers? Join the Tom's Hardware community: where nearly two million members share solutions and discuss the latest tech.
Jan 14, 2025
This is an ongoing problem that pops up quite inconsistently, taking days to resurface. It could therefore take a while before I can verify whether anything that's been attempted works. I'm keeping a regular log of events. What I've got so far is this:

My girlfriend and I have a pair of almost identical PCs which we've been using for about 1.5 years. Hers has a slightly better GPU, and an M.2 SSD instead of my SATA one. I will post the exact specs later once I can have a look at them. She might also have a better power supply, but I'm not sure. Both are running Windows 10. Hers has recently developed a variety of problems that I can't seem to find an explanation for. The sequence of events so far is this (dates are not exact, but near enough as far as I can remember):

- Dec 19th: I updated the BIOS to prevent CPU damage, as per the recent trouble with Intel CPUs. I performed the update on both PCs in exactly the same way, applying default settings on both. There were no immediate problems, and my system runs fine.

- Jan 10th: Her PC bluescreened during regular browsing. The percentage shown on the BSOD stayed at 0%, and after a while the system rebooted on its own, into BIOS. The BIOS could not find the boot drive. I messed around with it for a bit (reboots, exploring menus, more reboots, but no settings changed), and had finally settled on arranging a system recovery/repair when the problem fixed itself before I did anything. We could just boot into Windows as normal, and I did not do a repair.

There is no dump file for the BSOD. Logging was already enabled, and there is a file for the 4th of January, but nothing for the 10th. I'm positive that it did not happen on the 4th. I'm assuming this is because the dump never got past 0% and there was no drive to write it to at the time. Some older logs were also present.

- Jan 11th: Just in case, I moved the SSD to a different M.2 slot. According to the motherboard manual, this should be fine. The PC booted with no issues. I took the opportunity to run a health check on the drive using Samsung Magician, and it all came back healthy. Maybe the SSD is fine, but the system struggles to reliably access it? I don't know.

- Jan 14th: At 10:00, the PC booted, but was unable to properly load Windows. It got past the login screen, but then entered an endless cycle of Explorer crashing and reloading. This was accompanied by the screen flashing black, but with a visible cursor. She tried again at 11:00, but that time only got as far as the login screen. I have videos of both events if you want them. Another attempt to boot was made at around 14:00.

At 18:00 or so, I tried booting the system myself, and was able to boot normally, as if nothing was wrong. After that I did the following things:
- I ran sfc /scannow. This seemed to fix some things, though I can't make much sense of the log. It seemed to just repair some duplicates.
- I ran dism /online /cleanup-image /restorehealth. It found nothing.
- I changed the memory dump setting to a small memory dump, and told the PC not to restart automatically on a system failure. This is in case it BSODs again.
- I did a startup repair. Judging by the log file, this failed because it found nothing to repair.
- I explored the Event Viewer, which revealed some things: there are occasional series of memory access errors, mostly (if not all) from the time of the Explorer crash loop. There were also four WHEA-Logger errors, event ID 3. One of those dates to May last year, the other three to within a few weeks after the BIOS update. My system does not have these events.
- I ran the Windows Memory Diagnostic tool: 2 passes, no errors.
- I updated the graphics drivers, and removed Armoury Crate.
- Given that our systems are largely identical, I swapped our RAM sticks to see if that would transfer the problem to my PC. Both systems booted as normal.

- Jan 15 & 16: followed various bits of advice here.

- Jan 17th: replaced the CMOS battery and reseated the GPU. PC booted into a setup screen followed by BIOS once, and then normally.
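
For anyone who wants to repeat the Event Viewer check without scrolling through it by hand, here is a minimal Python sketch. It assumes the System log has been exported to a CSV file with columns named ProviderName, Id and TimeCreated (for example with PowerShell's Get-WinEvent piped to Export-Csv). Those column names, the function names and the sample rows are assumptions for illustration, not data from her PC.

```python
import csv
import io
from collections import Counter


def whea_events(csv_text, provider_substring="WHEA-Logger"):
    """Return the rows whose provider name contains provider_substring.

    Assumes a header row with a 'ProviderName' column (an assumption about
    how the log was exported, not a fixed Windows format).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    needle = provider_substring.lower()
    return [row for row in reader
            if needle in (row.get("ProviderName") or "").lower()]


def count_by_id(rows):
    """Count how often each event ID appears in the given rows."""
    return Counter(row["Id"] for row in rows)


# Made-up sample export, only here so the sketch runs on any machine.
SAMPLE = """ProviderName,Id,TimeCreated
Microsoft-Windows-WHEA-Logger,3,2000-01-01T10:15:00
Service Control Manager,7000,2000-01-02T10:01:00
Microsoft-Windows-WHEA-Logger,3,2000-01-03T19:40:00
"""

if __name__ == "__main__":
    rows = whea_events(SAMPLE)
    for row in rows:
        print(row["TimeCreated"], row["ProviderName"], row["Id"])
    print(count_by_id(rows))
```

Running the same filter on an export from each PC would make it easy to compare when the WHEA-Logger events appear relative to the BIOS update.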
 
You might just set the XMP memory profile and run the MemTest86 test. You can get memory errors from overclocked memory or from underclocked (under-volted) memory; basically, the timing windows in the electronics get violated. It often depends on which memory slots the modules are in, and the delays can amount to several nanoseconds. For example, a RAM module might work perfectly in the slot closest to the CPU but get errors in a slot further away from the CPU (the distance slightly increases the capacitance of the circuit and slightly delays the timing signals).

That is one of the reasons I sometimes use a 2T command delay in the memory timings: it adds an extra clock cycle for the memory electronics to stabilize. Generally, most memory will not tell you its command delay requirement. Cheaper memory often uses 2T, but motherboard vendors usually default to a 1T command delay. They then update the BIOS months after the motherboard is released, and sometimes they update the memory qualified vendor list. Motherboard vendors generally do not declare setting changes between BIOS versions.
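
To put rough numbers on the command delay, here is a small Python sketch of the arithmetic. It relies only on the fact that DDR memory transfers data twice per clock cycle, so the memory clock in MHz is half the rated MT/s, and that 1T/2T means a command is held for one or two clock cycles. The transfer rates used below are hypothetical examples, since the exact RAM in these PCs has not been posted.

```python
def clock_period_ns(transfer_rate_mts):
    """Length of one memory clock cycle in nanoseconds.

    DDR transfers twice per clock, so clock (MHz) = transfer rate (MT/s) / 2.
    """
    clock_mhz = transfer_rate_mts / 2
    return 1000.0 / clock_mhz


def command_hold_ns(transfer_rate_mts, command_rate_t):
    """Time a command is held on the bus for a given command rate (1T or 2T)."""
    return command_rate_t * clock_period_ns(transfer_rate_mts)


if __name__ == "__main__":
    # Hypothetical kit speeds, chosen only to illustrate the calculation.
    for rate in (3200, 5600):
        one_t = command_hold_ns(rate, 1)
        two_t = command_hold_ns(rate, 2)
        print(f"{rate} MT/s: 1T = {one_t:.3f} ns, 2T = {two_t:.3f} ns")
```

At 3200 MT/s one clock cycle is 0.625 ns, so going from 1T to 2T gives the command signals roughly that much extra time to settle.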
 
Okay... I think I got the gist of the reasoning behind it, but I have no idea how to actually accomplish it. I have never overclocked anything. If I'm understanding this correctly, the BIOS update may have changed something in the default settings that negatively affected the stability of the RAM? We didn't have any issues before that, and only got them several weeks after.
 
These images may have gotten lost on the previous page, since I edited them in later. I've now removed them from that post. I noticed this in the Event Viewer yesterday:

[Event Viewer screenshots]

Don't know if it matters. I've seen the error show up a few times now.

Today I replaced the CMOS battery and reseated the GPU, because some idiot put the battery behind it. Two birds, one stone, I guess. The PC booted into a setup screen followed by BIOS once, and then normally.

The PC did seem fairly clean inside the case. I'll clean it properly this weekend, but it seems alright. One thing I noticed is the state of the wiring on the motherboard plug connecting to the front panel; it looks a bit gnarly. I'll take pictures of it during my inspection tomorrow.
 
FYI:

https://learn.microsoft.com/en-us/t...ade-and-drivers/system-log-event-id-7000-7026

From the link:

"This problem may occur if a device isn't connected to the computer but the driver service of the device is enabled."

Look in Device Manager. Ensure View > "Show hidden devices" is checked.

You can also use "driverquery" via the Command Prompt as another way to look.

Or use PowerShell's Get-WmiObject cmdlet with the Win32_PnPSignedDriver class. It produces a long, detailed listing.
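
If the driverquery listing is too long to read by eye, a short Python sketch like the one below can narrow it down. It assumes the verbose CSV output of driverquery (`driverquery /v /fo csv`) includes columns named "Module Name", "Start Mode" and "State"; treat those column names as an assumption to check against your own output. The function names and sample rows are made up for illustration.

```python
import csv
import io
import subprocess


def parse_driverquery_csv(csv_text):
    """Parse driverquery CSV output into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def stopped_autostart_drivers(rows):
    """Drivers set to start at boot/system/auto that are not running.

    These are candidates for the 'driver service enabled but device absent'
    situation; a match is a hint to look closer, not proof of a fault.
    """
    wanted = {"boot", "system", "auto"}
    return [r for r in rows
            if (r.get("Start Mode") or "").strip().lower() in wanted
            and (r.get("State") or "").strip().lower() != "running"]


def run_driverquery():
    """Run driverquery in verbose CSV mode (Windows only) and return its text."""
    result = subprocess.run(["driverquery", "/v", "/fo", "csv"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Made-up sample so the sketch can be tried on any machine.
SAMPLE = '''"Module Name","Display Name","Start Mode","State"
"ExampleBus","Example Bus Driver","Boot","Running"
"ExampleDev","Example Device Driver","System","Stopped"
"ExampleOpt","Example Optional Driver","Manual","Stopped"
'''

if __name__ == "__main__":
    for row in stopped_autostart_drivers(parse_driverquery_csv(SAMPLE)):
        print(row["Module Name"], row["Start Mode"], row["State"])
```

On the affected PC, replace SAMPLE with the text returned by run_driverquery() and compare the module names against the ones named in the Event Viewer errors.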
 
Right, so worth fixing, but not critical.

I have a different question, maybe for johnbl: on the 14th, we rebooted the system several times. It behaved pretty much the same way every time, refusing to start. Why did it suddenly fix itself at around 18:00? Could we use that to temporarily fix its crashes if it happens again? (I realise this could be due to faulty wiring, in which case... I guess it's just down to what it feels like at the time.)

And if it does happen again: what logs would you like to have, or want me to look at?
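
In the meantime, this is a rough Python sketch of how I could check whether a dump file actually got written after the next crash. It assumes small memory dumps land in the default Minidump folder under the Windows directory. The function names are my own, and the demo runs on a throwaway folder rather than the real one.

```python
import os
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

# Default location for small memory dumps; adjust if the setting was changed.
DEFAULT_MINIDUMP_DIR = Path(os.environ.get("SystemRoot", r"C:\Windows")) / "Minidump"


def list_dump_files(dump_dir):
    """Return (file name, last-modified time) for every .dmp file, newest first."""
    files = sorted(Path(dump_dir).glob("*.dmp"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, datetime.fromtimestamp(p.stat().st_mtime)) for p in files]


def dumps_since(dump_dir, since):
    """Return only the dump files modified at or after the given datetime."""
    return [(name, when) for name, when in list_dump_files(dump_dir)
            if when >= since]


if __name__ == "__main__":
    # Demo on a temporary folder so the sketch runs anywhere. On the real PC,
    # pass DEFAULT_MINIDUMP_DIR instead (reading it may need admin rights).
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "example.dmp").write_bytes(b"")
        print(list_dump_files(tmp))
        print(dumps_since(tmp, datetime.now() - timedelta(hours=24)))
```

If a crash leaves no new file in the list, that would match what happened on the 10th, when the dump never got past 0%.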
 