[SOLVED] Windows Crashes Every Day Or So For Unknown Reason

Xulfin

Prominent
Sep 13, 2020
8
0
510
Before I get into the problem, my specs are as follows:

Motherboard: Asus Strix X570-F
CPU: AMD Ryzen 5 3600 (usually auto boosting to 4.2 GHz), cooled by a Corsair AIO
GPU: Asus Strix RTX 2080 (non-OC model, and no overclock has been applied to the card)
Memory/RAM: Corsair Vengeance 16 GB (2x8 GB) kit running the DOCP profile at 3600 MHz
Storage: Samsung 970 Evo Plus 1 TB (in the top M.2 slot on the mobo)
PSU: EVGA 750 W SuperNOVA G5

For the past 2 months or so, my computer has had stability issues in certain situations. The things I have found to crash my system the most are:
  • downloading a game,
  • a few minutes after launching a game,
  • running other simple applications such as Spotify while running a game,
  • and the rare crash with no identifiable trigger.

When I built this system about 3.5 months ago (my first build, btw), it would crash about once every day or two, always with a BSOD showing a memory management error code. Resetting Windows seemed to fix that, but it only changed the crashes from blue screens to crashes with no blue screen.

The crash goes as follows: my entire system freezes, including the screen, inputs, and outputs. The audio loops the last bit of sound over and over, and then the system restarts on its own after about 6-8 seconds.

After a crash, the system doesn't seem to crash again until it has been off for a long period of time, such as when I go to bed. After that, it's like the system somehow resets and will most likely crash again in exactly the same way. I get no blue screen, no error code upon restart or ever, and no real crash log in the Event Viewer.

I think I have narrowed the problem down to a possibly bad SSD or RAM sticks. About 1.5 months ago I updated the drivers for my mobo, GPU, and SSD, along with pretty much every other application, and I keep most of my drivers up to date about every week. I also tried resetting Windows again about a month ago, but that did nothing to solve the problem. I updated my mobo's BIOS about 2.5 months ago, but I don't know if that reverted when I reset Windows. If you think you can help, or know someone who might be able to help, please reach out. I am knowledgeable about computer parts and functions, but not so much about troubleshooting or reading and analyzing Windows logs, crash dumps, or other deeper software things.

[Mod edit - wall of text made easier]
 

Colif

Win 11 Master
Moderator
I added some spacing to make it less scary to read. 12 lines with no breaks is a hard ask for most people. I closed the thread once and then decided I had to fix it for you :)

After a crash, the system doesn't seem to crash until it remains off for a long period of time, such as when I go bed. After that, its like the system somehow resets and will most likely crash again with the exact same crash.

I wonder if turning fast startup off will help. You don't need it on with an NVMe drive, since it isn't making any difference there, and if a driver is at fault, turning it off may fix it - https://www.tenforums.com/tutorials/4189-turn-off-fast-startup-windows-10-a.html

I get no blue screen, no error code upon restart or ever, and no real crash log in the event viewer.
This means it's not Windows. It's likely hardware.

To test your hunch about RAM or storage -
Storage - Have you installed Samsung Magician and run its diagnostics on the NVMe drive? - https://www.samsung.com/semiconductor/minisite/ssd/download/tools/
RAM - Try running memtest86 on each of your RAM sticks, one stick at a time, for up to 4 passes. The only error count you want is 0; anything higher could be the cause of the BSOD. Remove/replace RAM sticks with errors. Memtest runs from a bootable USB, so you don't need Windows to run it.

What are you using to see this speed?
AMD Ryzen 5 3600 (usually auto boosting to 4.2ghz.)
Windows probably has no idea how fast your CPU is. It sees my 3600XT as always running at 4.2 GHz, but Ryzen Master sees the real speeds, and it's often only at 1 GHz.
These can't both be right:
[Image: cZ7c1t6.jpg]


Memory/RAM: Corsair Vengeance 16gb (2x8gb) kit running DOCP profile at 3600mhz
What speed is the Fabric clock running at? It should be half the memory's rated speed, so I guess 1800 MHz?
 

Xulfin

I disabled fast startup in both my BIOS and Windows (we'll see if that works in the next few days). I ran a short and a long test on my SSD through Magician and there were no problems. I will try running memtest overnight; how do I run it to check one stick at a time? What is fabric clock? Ryzen Master says that my CPU is running a constant 4.0-4.2 GHz. I don't have any OC in my BIOS or through software, and my GPU is running at 1890 MHz with no OC or tweaking.
 

Colif

how do I run it to check one stick at a time?
Take out one stick and test the one that is left. Once it has finished, swap in the other stick and test it in the same slot.
What is fabric clock?
On Ryzen systems, the latency and available bandwidth for all connected components like the DRAM controller, PCIe bus, etc. is based off the Infinity Fabric speed (FCLK). All of this occurs in an interconnect AMD calls “Infinity Fabric.” The Infinity Fabric clock speed (FCLK) is configurable and directly relates to the memory clock (MCLK). For Ryzen 3000 CPUs, most will run a 1:1 ratio between FCLK and MCLK, which can be considered “synchronous” operation, up to 1,800MHz. There may be instances where the FCLK may be set to operate in an “asynchronous” mode (not a 1:1 ratio), which may introduce increased latency, thus negating any performance benefit. However, for extremely high MCLK values (> 3,600MHz), asynchronous operation may actually increase overall performance, at the cost of stability.
https://www.corsair.com/corsairmedia/sys_master/productcontent/Ryzen3000_MemoryOverclockingGuide.pdf
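For the 3600 MHz kit in this thread, the arithmetic can be sketched roughly as follows. This is only an illustration: the function name `fclk_check` is made up, and the 1,800 MHz ceiling for 1:1 operation is simply the figure from the guide quoted above, not something measured on this system.

```python
def fclk_check(ddr_rate: int, sync_limit_mhz: int = 1800) -> dict:
    """Estimate MCLK and whether a 1:1 FCLK is plausible for a DDR4 kit.

    ddr_rate is the advertised speed (e.g. 3600 for a "3600 MHz" kit).
    sync_limit_mhz is an assumption taken from the quoted Corsair guide.
    """
    # DDR memory transfers data twice per clock, so the real memory
    # clock (MCLK) is half the advertised rate.
    mclk_mhz = ddr_rate / 2
    # In synchronous (1:1) mode, FCLK simply equals MCLK.
    synchronous = mclk_mhz <= sync_limit_mhz
    return {
        "mclk_mhz": mclk_mhz,
        "fclk_1to1_mhz": mclk_mhz if synchronous else None,
        "synchronous": synchronous,
    }


if __name__ == "__main__":
    print(fclk_check(3600))
```

So with DOCP at 3600, the value to look for in BIOS or Ryzen Master would be an FCLK of about 1800 MHz. If it shows something else, the board has probably dropped out of 1:1 mode.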

Ryzen Master says that my cpu is running a constant 4.0-4.2ghz, I don't have any oc in my BIOS or through software,
What power plan are you running?
Right click Start
Choose Power Options
On the far right, pick Additional power settings
Are you on AMD Balanced or AMD High Performance?

Always being at 100% CPU speed makes me think you're on High Performance.
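If clicking through the menus is a pain, the active plan can also be read from the command line with `powercfg /getactivescheme`. Below is a small sketch that parses its output. The helper names are hypothetical, the sample line in the comment is illustrative, and the regex assumes the English output format, so it may need adjusting on a localized install.

```python
import re
import subprocess

# Matches a GUID followed by the plan name in parentheses, e.g.
# "Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)"
_SCHEME_RE = re.compile(
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\s+\((.+)\)"
)


def parse_active_scheme(powercfg_output: str):
    """Return (guid, plan_name) from powercfg output, or None if not found."""
    match = _SCHEME_RE.search(powercfg_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


def active_power_plan():
    """Run powercfg (Windows only) and return the parsed active plan."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return parse_active_scheme(result.stdout)
```

Calling `active_power_plan()` on the affected PC should return the plan's GUID and its name, which is enough to tell a balanced plan from a high performance one.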

Do you have the latest BIOS? On newer boards, BIOS updates can actually help, especially on AMD boards.
 
This type of problem is difficult to isolate. I would go into BIOS and disable any device I was not using. Then boot into Windows and disable all sound sources that are not being used. Disable the drivers for any hardware options you are not actively using, such as the USB support coming out of a graphics card.

Disable the TCP/IP checksum offload for your Intel network driver.
See https://downloadcenter.intel.com/do...ad-Capability-with-Intel-1-10-GbE-Controllers

It will save time and fix issues related to bad checksum errors that you can only detect with a network packet sniffer.

After you make these changes, reboot and see if you can reproduce the problem. The theory is that the problem is caused by a confused/hung PCIe bus. These are hard to isolate since so much hardware now uses PCIe. One bad/stupid driver can hang the bus, such as a USB LED driver that flashes the lights on your case. If you really do not need a function, disable it in BIOS until you figure out the cause of the hang.

Debugging the problem properly takes a bunch of effort. It would require a keyboard-forced full kernel memory dump of the running system, taken after the system starts to have issues but before it crashes on its own, plus looking at internal error logs and various debugging passes to isolate the problem. It is best to disable hardware in BIOS and see if you can reproduce the problem. For example, you could move all of your USB devices to USB 2 ports, then disable USB 3 in BIOS and see if you still get the problem. If you do, then you know the USB 3 interface was not the cause. Kind of a tedious method, but it can work.

From past experience, I would start with HD audio devices, then USB devices, especially any USB-connected cooling device if you have one. I looked at one case a few years ago that was caused by an AIO CPU cooler that was putting out millions of messages in a few minutes and messing up the PCIe bus to the point it would hang. The fix was a BIOS upgrade for the USB and moving the device to a USB 2 port with no other devices on it.
 

Xulfin

So I ran memtest on both sticks individually, both at base clock and on DOCP, with no errors whatsoever. Also, when I launch Cold War, the game will crash within a minute after launch (by crash I mean the normal crash that my PC experiences) with no BSOD or error code. But after it forces the restart, my PC will be fine until I turn my system off, which resets the ticking time bomb for the next time I launch the game or my PC just crashes in general. I have also run every test on my SSD again through Magician, with no errors either. At this point I have no clue what it could be. My drivers are up to date, my GPU performs great in game, and I haven't noticed any overheating or throttling. I'm at a loss for what could be causing this problem besides conflicting software or something in or with my motherboard. Should I try fresh installing all of my software to possibly find a corrupt file?
 
If you want, I can take a look at a kernel memory dump of the system from before it hangs. You would have to set the registry to allow forcing a memory dump via the keyboard. Change the memory dump type to kernel, run your system, then force the dump before the hang. Then PM me and I can take a look. Even if the system hangs, you might still be able to force a memory dump, depending on where the hang happens. For example, Windows might not know that the GPU is hung, and it will still write the memory dump.
 
Solution

Colif

Just trying to help...
You would have set the registry to force a memory dump via keyboard.
How do they do that? I mean, I am assuming they don't know how to do it, as I know I don't.

Change the memory dump to kernel
@OP, go here - https://www.tenforums.com/tutorials/5560-configure-windows-10-create-minidump-bsod.html - and see "Have Windows Create a Kernel Memory Dump on BSOD". I don't know if it recreates the memory.dmp or just adds to it?
 
Just trying to help...

How do they do that? I mean, I am assuming they don't know how to do it, as I know I don't.

@OP, go here - https://www.tenforums.com/tutorials/5560-configure-windows-10-create-minidump-bsod.html - and see "Have Windows Create a Kernel Memory Dump on BSOD". I don't know if it recreates the memory.dmp or just adds to it?
Generate a memory dump by forcing a system crash (trendmicro.com)
should show the registry setting for forcing a memory dump.
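
For reference, here is a rough sketch of what those registry settings look like, written as a small script that only builds the text of a .reg file and does not touch the registry itself. The key paths and values (`CrashOnCtrlScroll` under the keyboard driver's `Parameters` key, and `CrashDumpEnabled` = 2 for a kernel dump) are as I understand them from Microsoft's documentation on forcing a crash from the keyboard; the function name is made up, so check everything against the linked article before importing anything.

```python
# Keyboard driver service names; pick the one that matches the keyboard.
KEYBOARD_SERVICES = {
    "usb": "kbdhid",    # USB keyboards
    "ps2": "i8042prt",  # PS/2 keyboards
}


def _key(*parts: str) -> str:
    """Format a registry key path as a .reg section header."""
    return "[" + "\\".join(parts) + "]"


def build_forced_dump_reg(keyboard: str = "usb") -> str:
    """Build .reg text enabling a keyboard-forced kernel memory dump."""
    service = KEYBOARD_SERVICES[keyboard]
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        # Allow the keyboard shortcut to trigger a crash.
        _key("HKEY_LOCAL_MACHINE", "SYSTEM", "CurrentControlSet",
             "Services", service, "Parameters"),
        '"CrashOnCtrlScroll"=dword:00000001',
        "",
        # 2 = kernel memory dump (assumption based on Microsoft's docs).
        _key("HKEY_LOCAL_MACHINE", "SYSTEM", "CurrentControlSet",
             "Control", "CrashControl"),
        '"CrashDumpEnabled"=dword:00000002',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_forced_dump_reg("usb"))
```

After the settings are in place and the PC has been rebooted, the documented way to force the dump is to hold the right Ctrl key and press Scroll Lock twice.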
 

Xulfin

Sorry for how long it has taken me to get back to you guys. I had some personal stuff come up and I just haven't had the time or energy to troubleshoot this, but I will get to it today or this weekend.