Question: Windows is crashing multiple times a day

Jul 29, 2023
Hi, my Windows 11 PC has been crashing for over a year now and I am at a complete loss on how to fix it. To date I have replaced the RAM, PSU and GPU, updated the BIOS and chipset drivers, uninstalled and reinstalled the GPU drivers using DDU, run SFC and check disk scans, and run Windows Memory Diagnostic. Every fix I tried seemed futile, so a couple of days ago I also did a clean install of Windows after wiping my drives. It feels like nothing works. I get BSODs ranging from IRQL_NOT_LESS_OR_EQUAL to security violations. Sometimes the screen just goes black. Sometimes my system freezes. Sometimes it randomly reboots, but by far the most common is a BSOD.

For whatever reason my PC rarely crashes when gaming and typically crashes when I'm either watching videos (YouTube, Netflix and Prime) or when I'm at work or asleep. I've posted to the Microsoft forums to no avail. Any help is appreciated. Currently my next step would be to buy a new SSD and reinstall Windows, since it would be the cheapest part to replace. I am not sure if this is related, but recently my Chrome tabs also started crashing with error code STATUS_STACK_BUFFER_OVERRUN. I'm not too sure what that indicates.

Pre-windows wipe minidumps + event viewer logs: https://drive.google.com/drive/folders/17MlwqBCESCzy1LDWGBKJFa5xxKadTvTD?usp=sharing
Post-windows wipe minidumps + event viewer logs: https://drive.google.com/drive/folders/12x9N8TdxL6aodO-jLyE1yKoJ-5Djqxp3?usp=sharing

Specs:
CPU: Ryzen 3600
GPU: EVGA RTX 3060ti
RAM: Corsair Vengeance 16 GB, rated for 3466 MHz, but XMP is currently disabled
PSU: EVGA 1000GQ
MOBO: X470 AORUS GAMING 7 WIFI
 
Did not download the .dmp files. No previews available and I deemed downloading to be too risky.

= = = =

Do the problems stop or decrease if a wired connection is used? Enable wired and disable wireless for testing purposes.

= = = =

Open Reliability History/Monitor and take a look at the error codes, warnings, and informational events being shown. Also look for patterns.
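One way to look for patterns is to note each crash's time and stop code by hand and tally them. The following is only a minimal sketch: the timestamps in `CRASHES` are made up for illustration, the stop codes are the ones named in this thread, and `tally` is a hypothetical helper name, not part of any Windows tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records typed in by hand from Reliability Monitor.
# The timestamps are invented for illustration only.
CRASHES = [
    ("2023-07-27 02:14", "IRQL_NOT_LESS_OR_EQUAL"),
    ("2023-07-27 11:40", "DPC_WATCHDOG_VIOLATION"),
    ("2023-07-28 03:05", "IRQL_NOT_LESS_OR_EQUAL"),
    ("2023-07-28 13:22", "DPC_WATCHDOG_VIOLATION"),
    ("2023-07-29 02:51", "IRQL_NOT_LESS_OR_EQUAL"),
]


def tally(records):
    """Count crashes per stop code and per hour of the day."""
    by_code = Counter()
    by_hour = Counter()
    for stamp, code in records:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        by_code[code] += 1
        by_hour[when.hour] += 1
    return by_code, by_hour


by_code, by_hour = tally(CRASHES)

# Most frequent stop codes first.
for code, count in by_code.most_common():
    print(f"{count:3d}  {code}")

# Simple text histogram of crashes by hour of day.
for hour in sorted(by_hour):
    print(f"{hour:02d}:00  {'#' * by_hour[hour]}")
```

If most crashes cluster in hours when the PC is idle, that points toward idle or low-power behaviour rather than load.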

Look in Task Manager, Resource Monitor, and Process Explorer (Microsoft, free). Could be that there is some buggy or corrupt process that is carried along and causing the problems. Work to identify everything that is running on the system.

https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer
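As a rough complement to those tools, the set of running image names can be captured with the built-in `tasklist` command and compared between a stable day and a crashy one. This is a sketch under assumptions: it relies on `tasklist /FO CSV /NH` printing one quoted CSV row per process with the image name in the first column, and `image_names` and `snapshot` are hypothetical helper names.

```python
import csv
import io
import subprocess


def image_names(tasklist_csv):
    """Return the set of lowercased image names from tasklist CSV text."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    # The first column is assumed to be the image name, e.g. "chrome.exe".
    return {row[0].lower() for row in rows if row}


def snapshot():
    """Run tasklist (Windows only) and return the current image names."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return image_names(result.stdout)


def new_since(before, after):
    """Image names present in `after` but not in `before`, sorted."""
    return sorted(after - before)
```

Calling `snapshot()` at two different times and passing the results to `new_since` shows which processes appeared in between. It does not replace Process Explorer, which also shows who signed each binary and what launched it.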

Manually download, reinstall, and reconfigure all drivers. Verify that the drivers are being downloaded directly via the applicable manufacturer's website. No third party tools or installers.

Sketch out a diagram of your PC that shows all peripherals. Include the cables and connections being used for each peripheral, be it monitor(s), speakers, cameras, etc.: audio, visual, USB, electrical, network. Any surge protectors, power strips, UPSs, etc.? Include electrical outlets.

Any loops involved where A connects to B connects to C connects to D connects to A again?

That the problem has been continuing for over a year despite all that has been tried indicates to me that there is some fundamental issue with the PC.

Configure as basic and simple a Windows 11 installation as possible. Minimal games, viewers, apps, utilities.

Objective being to establish a stable system. Then add in additional software one at a time allowing time between additions. Monitor system performance. Keep an eye on the logs.

If the problems re-appear the culprit is likely the last thing that you did.
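To see exactly when the system goes down while nobody is watching, a small heartbeat script can run in the background and leave a timestamp trail. This is a minimal sketch, not a tested monitoring tool: the file name `heartbeat.log`, the 60 second interval and all function names are assumptions chosen for illustration.

```python
import os
import time
from datetime import datetime, timedelta
from pathlib import Path

LOG = Path("heartbeat.log")  # hypothetical log file name
INTERVAL = 60                # assumed seconds between heartbeats


def beat(log=LOG):
    """Append the current time to the log and push it to disk."""
    with log.open("a") as fh:
        fh.write(datetime.now().isoformat(timespec="seconds") + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # reduce the chance a crash loses the last line


def read_stamps(log=LOG):
    """Parse the log back into a list of datetime objects."""
    lines = log.read_text().splitlines()
    return [datetime.fromisoformat(line.strip()) for line in lines if line.strip()]


def find_gaps(stamps, interval=INTERVAL, slack=3):
    """Return (last_seen, next_seen) pairs where heartbeats went missing."""
    limit = timedelta(seconds=interval * slack)
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev > limit:
            gaps.append((prev, cur))
    return gaps


def run_forever():
    """Write a heartbeat every INTERVAL seconds until the process is stopped."""
    while True:
        beat()
        time.sleep(INTERVAL)
```

Start `run_forever()` after boot, and after the next crash pass `read_stamps()` to `find_gaps` to get the last time the machine was alive. Note that sleep or hibernation also produces a gap, so compare the result against the crash entries in the logs.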
 
Things to try: Update the BIOS. Return the BIOS to defaults. Is the memory in the 2nd and 4th slots? It should be. Try a single stick of RAM in the 2nd slot from the CPU. If nothing changes, slacken off the CPU cooler tension screws a little and then inspect the CPU for bent pins.
 
Hi,

Is there another way I can upload dmp files to ensure security?

I exclusively use ethernet with my computer; however, the wifi is always running as it's integrated into the board. Should I try to turn off the wifi capabilities?

Crashes usually change codes. I have not been able to identify any patterns.

I'm in and out of the house today but I will check out Process Explorer when I get back.

The only drivers I manually installed were chipset, gpu and audio. Everything else was installed by windows. Should I try reinstalling windows drivers? I have also used driver verifier. Driver verifier does not crash my system (I had it running for about 12 hours yesterday) but when I turned driver verifier off my system crashed 3 times overnight.

The only things I have connected to my PC are a monitor (using DVI; I tried HDMI as I heard of people crashing when using DVI, but no dice), a mouse, a keyboard, an ethernet cable and a power cable (I've tried swapping power cables too). I also have a wireless transmitter for my headphones, but I tried removing it for a day and the PC was still crashing. I have changed between a surge protector and wall plugs. No difference.

Not 100% sure what you mean by loops, but when I built my PC I built it like any other (the others never had crashing issues). Externally everything is a one-way connection to a peripheral. I don't use any of the extra USB ports that come on my monitor/keyboard.

At this point the only parts I have not replaced are the mobo, CPU and SSD. Some crashes say they relate to RAM issues while others say they relate to GPU driver issues. One time it said the GPU lost power. I have a hunch it could be a faulty mobo, but I don't have any evidence to support the claim. Is there a way to test the mobo?

So far on my fresh Windows install I have only downloaded Chrome, Steam, CSGO, Discord, Python, VS Code, iCUE, GeForce Experience, Nvidia drivers (via GeForce Experience), the Realtek audio driver (from the mobo website) and chipset drivers (via the AMD website).

Every night I go to bed I pray for a stable system. I recently started running code 24/7 on my PC and it's a huge headache when it turns off for 8+ hours while I'm at work. To clear the air, my PC was also crashing before I started running code on it, so I doubt it's related. I also wrote everything myself, so no viruses in the code.

I really appreciate the write up. Thank you!
 
BIOS is already up to date. I'll try restoring defaults. Should I do it through the BIOS or should I clear CMOS? A while ago I upgraded to 32 GB of RAM and accidentally put the sticks in slots 1 and 3 (lol). When I went to swap the RAM back to the old kit I realized the error and seated them in 2 and 4. I'll try one stick. Could the CPU bend pins while inside the socket? I haven't moved the CPU in the last 3ish years.
 
I did download the dumps and both the before wipe and after wipe smell very strongly of RAM problems. I can go into detail if you want but I think your best option now is to download Memtest86 (free), use the imageUSB.exe tool extracted from the download to make a bootable USB drive and then boot that USB drive. Memtest86 will start running as soon as it boots.

If no errors are found after the four iterations of the 13 different tests that the free version does, please restart Memtest86 and do another four iterations.

Let us know how that goes.
 
RAM is usually the culprit for most BSODs. I suggested possible bent pins if nothing else works, since this has gone on for a while as you say, but the CPU was installed 3 years ago, which is longer than that, so it's probably not the pins. Investigate your RAM.
 
Ok I will do memtest now. I've tried two sets of ram with this pc. Is it possible that the mobo is failing to read/write ram properly? I feel like it's more likely than both kits being faulty. I will do the memtest regardless but I am curious about the details you mentioned. Thanks!

edit: Thinking about it, I did have more variety of crashes before I replaced my kit of RAM (like the black screens and random restarts I mentioned in my initial post). Now it seems to be primarily BSODs. I suppose it's possible I had multiple issues previously that I have since fixed, and it could just be this old kit that has issues. Regardless, downloading memtest now. Thank you!
 
I just came across this thread: https://www.reddit.com/r/Amd/comments/j70utz/psa_if_you_have_a_early_launch_ryzen_5_3600/
My most common crashes are DPC_WATCHDOG_VIOLATION and IRQL_NOT_LESS_OR_EQUAL
So, I might try to buy a new CPU.
I will admit I made quite the rookie mistake. When I booted up memtest it told me I was on a single channel. Turns out when replacing my ram with my old kit I misread the manual and ended up putting both sticks on a single channel. Fixed that now. However, I think my old ram was set in properly and I mistook it to be on a single channel. I will see how the system performs in dual channel. If I get more crashes I will run memtest. If memtest passes then I will upgrade to a 5600 and see how it performs. I am now reading about the finicky memory controllers within the 3600 (especially early revisions) and the issues seem very relatable to my system. I appreciate all the support and I will update the thread with the results of my troubleshooting. Thank you everyone for such detailed support!
 
OK, let us know how things go.
Here is a little update on the issue. I would say that it is a bit too early to call the system "stable", but I have not crashed in the last two days and overall the system feels more stable. Less laggy, Chrome tabs stopped crashing and, strangely enough, my Python scripts stopped crashing (I didn't even think this was related). Again, it's only been 2 days so we will see.

Here is the supposed fix. After looking into the problems other 3600 users reported, I decided to downgrade my BIOS to AGESA 1.0.0.4. The system crashed within a few hours. Looking again, other users reported stability with AGESA 1.0.0.3 AB, so I downgraded to that and so far it's been working like a charm. Oddly, the first thing I noticed was that my screen stopped auto sleeping. I bring this up because someone mentioned that the newer revisions of AGESA changed how the CPU idles, with hard-coded values for idle voltages. Assuming the early bins of the 3600 had finicky memory controllers (as other users suggested), my guess is that these idle voltages undervolt the CPU too far and cause memory controller instability, hence all the memory-related dumps. Some users reported success on newer AGESA versions after changing their voltage settings. I didn't want to mess with idle voltage settings, so I think I'll stay on AGESA 1.0.0.3 AB. If any crashes start appearing again I will be back, but I am hopeful that this solved my issue.

Thank you everyone who took the time out of their day to troubleshoot my PC!