Question: Could use some help with inconsistent BSODs

Jun 4, 2022
9
0
10
Hello. Trying to get some help with a few crashes I've been having lately. I have a few ideas I'll get into in a bit, but otherwise I cannot find a way to trigger these crashes 100% consistently. I can run my machine for days on end, stressing it daily with no issues, until one day it finally decides to crash. This has become extremely stressful for me because, even though my PC might be running "fine" most of the time, I'm constantly in a state of paranoia.

Here are my minidumps if you'd like to look through them: https://files.catbox.moe/jsmffw.7z

All my crashes are memory related (Memory_Management and Faulty_Hardware_Corrupted_Page) so of course the first thing I did was to run memtest86. God knows how many memtest runs I've done at this point but they all pass 100% every single time. I've also used SeaTools to test my SSD and it too has passed. I have updated all the relevant drivers I can think of and am currently on the latest BIOS update for my motherboard. I'm at a complete loss.

Here are some theories and information that are hopefully relevant:
My machine is a few years old. I never had a problem in the past until I decided to reinstall Windows several months back, and then the crashes started creeping up. Could a faulty Windows install be behind this misbehavior? I've run the sfc and DISM commands everyone recommends countless times already. Most of the time they come out clean but a few times they did find corruption that was "successfully fixed". But every time I look at the log files after a successful repair I cannot find anything referencing any repair of any faulty files. I believe most of those are "false positives" and they're only repairing registry mismatches after various Windows updates and the like. That's just what I gathered from them; I'll freely admit that I'm not an expert at reading those logs.

Two of these crashes happened when I was in the middle of playing a game. The games I was playing seem to be very RAM intensive and have reports of memory leaks (my system has 16GB of RAM and both of the games that caused a crash also list 16GB as their recommended amount). What I noticed is that these games use a lot of virtual memory and they've even crashed a few times because of extremely high virtual memory loads (program crashes; these did not end in a computer crash). Could memory leaks potentially lead to these bugchecks? Maybe if a game starts devouring too much of my pagefile, or if my pagefile isn't big enough and it starts overwriting things, it could cause system instability?

The minidumps all point towards "Memory Compression" leading to the fault. From what I understand that's a Windows 10 feature where Windows might compress things inside the RAM if resources become scarce, so I'm once again left wondering if these BSODs might be caused by a memory leak or resource exhaustion. Something inside the compressed RAM might be getting overwritten, or Windows is otherwise not able to decompress it properly, leading to the crash. That's my theory, but again I genuinely have no idea what I'm talking about. I'm just trying to rationalize how all of this functions from what little reading I've done.

Recently I deleted and recreated my pagefile and manually set it to a higher value than normal. It's only been a few days though so I'm still testing this out to see if this helped.

Thoughts? I'm not sure if I should swap out my current set of RAM sticks, add more RAM, or do something else. Even though memtest isn't picking up any errors I assume it could still be some very obscure fault inside the ram which would explain why these crashes are triggered so rarely. I always assumed that resource exhaustion would simply cause an application to crash and not crash the entire system or lead to memory being overwritten/corrupted so I'm not sure if adding more RAM is the right idea here. If you have any suggestions or more tests you'd like me to run then feel free to tell me, I'm willing to try out anything at this point.
 

Lutfij

Titan
Moderator
Welcome to the forums, newcomer!

It would've been a good idea to list the specs to your build like so:
CPU:
Motherboard:
Ram:
SSD/HDD:
GPU:
PSU:
Chassis:
OS:
Monitor:

Include the BIOS version for your motherboard as well as the age of the PSU (apart from its make and model). As for your OS, what version (not edition) of the OS are you on? One of your dmp files shows that Steam.exe was the trigger. The other two show memory issues. Which slots on the motherboard are the RAM sticks populating?
 
I'm really sorry, I knew I was missing something. Here they are:

CPU: r7 3700x
Motherboard: B450 Tomahawk Max
Ram: TeamGroup DDR4 3200 CL16 (16GB (8x2))
SSD/HDD: Crucial MX500 1TB (OS drive), Seagate HDD 2TB, Western Digital External USB HDD 2TB
GPU: EVGA 2070 Super XC Ultra
PSU: EVGA 750 GQ
Chassis: Cooler Master H500
OS: Windows 10 Pro x64
Monitor: Viewsonic XG270QG

BIOS version is 3.D0. It's the second latest version, I believe.
PSU is about 2 years old now, all parts for this build were purchased around the same time and I haven't made any major adjustments to them.
OS version is 10.0.19044 build 19044
RAM is on DIMMA2 and DIMMB2, the second and fourth slot as recommended by the motherboard manual

One other thing I forgot to mention: the latest crash (the one that happened on 5/29) was a bit strange. It happened when my computer was in the process of shutting down. I didn't even notice when it happened actually, it wasn't until a few days later that I found out there was a crash while I was scrolling through event viewer. Not sure if this information helps but I thought I should mention it. All the other crashes happened while the system was in active use.
 
This was a single bit corruption that occurred after Windows attempted to decompress some data that was stored in memory. These are hard to find: you will want to update the BIOS to get the best default memory timings, confirm your memory timings in the BIOS, then boot and run memtest86 to see if you can isolate a problem to a RAM stick.
You will also want to update the chipset driver from the motherboard vendor's website.

system was running 1 day 10 hours before the bugcheck.

I will check the other two dumps to see if they are different.
The second bugcheck hit memory corruption when it copied something to RAM and tried to compress it.

third bugcheck was a single bit corruption attempting to compress memory again.
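
For readers unfamiliar with the term, a "single bit corruption" means the data in memory differs from what was expected by exactly one bit. The sketch below only illustrates that idea on made-up data; the `differing_bits` helper and the sample buffer are hypothetical and have nothing to do with how the debugger actually reaches its conclusion.

```python
def differing_bits(expected: bytes, observed: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, bit_index) for every bit that differs."""
    if len(expected) != len(observed):
        raise ValueError("buffers must be the same length")
    diffs = []
    for offset, (a, b) in enumerate(zip(expected, observed)):
        delta = a ^ b                      # XOR leaves a 1 wherever the bits differ
        for bit in range(8):
            if delta & (1 << bit):
                diffs.append((offset, bit))
    return diffs


page = bytes(range(256)) * 16              # stand-in for a 4 KiB page (made-up data)
corrupted = bytearray(page)
corrupted[1234] ^= 0x08                    # flip bit 3 of a single byte

result = differing_bits(page, bytes(corrupted))
print(result)                              # [(1234, 3)]
```

A one-bit difference like this is typical of marginal hardware (RAM, timings, signal integrity) rather than of software overwriting a whole region, which is probably why the advice here focuses on memory timings.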

-------------
most current bugcheck had error code:
2: kd> !error 0xc00002c4
Error code: (NTSTATUS) 0xc00002c4 (3221226180) - The system file %1 has become corrupt and has been replaced.
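
As a side note on reading that value: an NTSTATUS is a 32-bit number whose top two bits are the severity, followed by a customer bit, a facility field, and a 16-bit code. The sketch below just splits 0xc00002c4 into those fields; the `decode_ntstatus` helper is a hypothetical name, not a debugger command.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    severities = {0: "success", 1: "informational", 2: "warning", 3: "error"}
    return {
        "severity": severities[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),        # bit 29, set for non-Microsoft codes
        "facility": (value >> 16) & 0xFFF,            # bits 27-16
        "code": value & 0xFFFF,                       # bits 15-0
    }


status = 0xC00002C4
fields = decode_ntstatus(status)
print(status)   # 3221226180, the decimal form shown by !error
print(fields)
```

The severity comes out as "error" with facility 0 and code 0x2C4, which matches the debugger treating it as a genuine failure rather than a warning.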

-system got an error decompressing a driver from the memory store
-could not read BIOS info

not sure what this driver does:
C:\Program Files (x86)\EVGA\Kernel\driver-x64.sys Mon Jul 20 04:45:05 2020

---------------
7: kd> !sysinfo machineid
Machine ID Information [From Smbios 2.8, DMIVersion 0, Size=2550]
BiosMajorRelease = 5
BiosMinorRelease = 17
BiosVendor = American Megatrends International, LLC.
BiosVersion = 3.C3
BiosReleaseDate = 09/27/2021
SystemManufacturer = Micro-Star International Co., Ltd
SystemProductName = MS-7C02
SystemFamily = To be filled by O.E.M.
SystemVersion = 1.0
SystemSKU = To be filled by O.E.M.
BaseBoardManufacturer = Micro-Star International Co., Ltd
BaseBoardProduct = B450 TOMAHAWK MAX (MS-7C02)
BaseBoardVersion = 1.0

Processor Version AMD Ryzen 7 3700X 8-Core Processor
Processor Voltage 8bh - 1.1V
External Clock 100MHz
Max Speed 4400MHz
Current Speed 3600MHz
 
I am currently on the second latest BIOS. I have updated my BIOS a few times in response to these crashes actually, although MSI has since taken down a lot of those BIOS versions I've tried so maybe they weren't too stable... There's a new one out now so I'll update to that and do another round of memtest testing. I've done them quite a few times at this point and they always pass but going for another round won't hurt.

I'm currently using the latest chipset drivers that AMD released (4.03.03.431). Do you recommend I use the one provided by the motherboard vendor instead? I think that one is a year out of date.

The EVGA driver is from Precision X1. It's overclocking software but I only use it for its GPU fan control features.

So I assume faulty RAM is still the prime culprit here? Just feels strange that this issue is only suddenly creeping up after all these years and somehow continues to evade the various tests I've tried. Every time I run memtest I'm begging for it to detect even a single error so I can just pull the trigger on buying a new RAM kit haha
 
I ran memtest overnight. Did 3 complete cycles of 4 passes; would've done more but I kept having to wake up and restart the tests, which was annoying. Everything came out clean, no errors in any of the tests.

Yeah, I've had RAM suddenly go bad a few times in the past, on different machines of course. Easy to fix since those computers would simply not boot with the bad RAM stick installed. This is the first time I've struggled with completely random crashes like these though..

I'll start running some Prime95 tests and update you all with the results. Thank you.
 
Ok so I did a quick Prime95 run for about 30 minutes. Blend test since that's what uses a lot of ram. I manually stopped it because my cpu was hitting 85C which was a bit worrisome but other than that the results were perfectly clean. It did some 768K and 4K tests which all passed, no warnings, errors, or crashes for any of the cores or the ram.

Should I try running prime95 longer and just ignore the heat? Today is a hot day where I live so I might have to boot it up at night so I can get a good stress test going without slamming into any thermal issues first.
 
-With memory corruption you first suspect memory and run memtest86

-the second suspect is the bios and poor default memory timings.
It is common for the BIOS not to set the command rate for your memory chips correctly. I could not read the type of memory chips you had installed in your machine, but you might set the command rate to 2N if you cannot find a fix for this problem. The command rate specifies how many clock ticks must occur before the memory address lines are to be considered stable. I did not think it was this problem because it generally shows up as an access violation in the memory dump.

- the third suspect is a bug in storage drivers where your memory is ok, but when your system changes sleep states the memory gets paged to disk and during that process the image is corrupted. Later, when the system wakes up, the disk image is put back into memory but it has been modified. Later still, Windows uses that modified image and something goes wrong. Now that Windows is compressing memory, the compression and decompression looks for errors and bugchecks when an error is detected.
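
The third scenario boils down to "a copy was modified while it sat in storage, and a later integrity check noticed". Below is a minimal sketch of that general pattern using a CRC-32 checksum. It is not how Windows implements its memory store; `write_out`, `read_back`, and the sample data are hypothetical names made up for illustration.

```python
import zlib


def write_out(data: bytes) -> tuple[bytearray, int]:
    """Simulate writing a page out: keep a copy plus a CRC-32 of its contents."""
    return bytearray(data), zlib.crc32(data)


def read_back(stored: bytearray, expected_crc: int) -> bytes:
    """Simulate reading the page back; refuse to use it if the checksum changed."""
    if zlib.crc32(bytes(stored)) != expected_crc:
        raise ValueError("stored page no longer matches its checksum")
    return bytes(stored)


page = b"driver image bytes " * 200        # made-up page contents
stored, crc = write_out(page)
assert read_back(stored, crc) == page      # an untouched copy passes the check

stored[100] ^= 0x01                        # one bit changes while the copy sits in storage
try:
    read_back(stored, crc)
    detected = False
except ValueError:
    detected = True
print("corruption detected:", detected)
```

The point of the sketch is that the check only tells you the data changed, not where it changed (RAM, the storage driver, or the disk), which is why each suspect has to be ruled out separately.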

(just fyi)
Most of the time (on intel cpu systems) I see people have old versions of intel rapid storage technology and they just need to update the driver. Intel has a bug list/fix list for their driver and they fixed a bunch of corruption issues related to sleeping. (I am not sure what storage driver you had installed, I would have to look again)
Microsoft storage drivers are better since Microsoft updates them and pushes out the fixes to all machines. Intel drivers you have to update yourself.

I have a hard time finding bug lists/fix lists for AMD CPUs and chipsets; generally the ones I find are years out of date. I have not been able to find any bug list for the CPUs made in China, other than reports that some suspect CPUs meant for China only were sold on the grey market to people in the USA. I found special BIOS patches for these CPUs, but they were only being put out in China.
 
Command rate for my ram is 1N according to hwinfo.

Jeeze now that you mentioned it I'm getting hit by some vague memories of my ram command rate being set to 2N in the past, back when my system was rock solid stable. I've gone through quite a few bios updates in the past few months so maybe MSI changed something?

I rarely ever mess with RAM timings. I was always too intimidated by it so I just use the XMP profile and hope it works, and it usually does. Is there a way for me to check what the recommended command rate for my RAM is, or is that something the motherboard is supposed to just handle? RAM is TLZRD416G3200HC16CDC01 but I can't find any info on the command rate for it.

As for drivers, I'm using the default microsoft storage drivers. Neither Crucial nor my motherboard manufacturer provide any additional drivers outside of some RAID controllers. Any way for me to check if the storage drivers are acting up?
 
This article shows the RAM timings, and it shows a 2T command rate:
T-Force Vulcan Z 2x 8GB DDR4-3200 C16 Kit Review: More Value For Asus Motherboards - Tom's Hardware | Tom's Hardware (tomshardware.com)
Some motherboards also specify that you have to increase the command rate when you fully populate the RAM slots. It was a * that you had to find in a footnote in the printed docs.
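
To put 1T versus 2T into numbers, here is a rough sketch of the arithmetic, assuming the kit really runs at its rated DDR4-3200 speed (the actual speed depends on whether XMP is applied). The `command_rate_ns` helper is a made-up name for illustration.

```python
def command_rate_ns(transfer_rate_mts: float, command_rate_cycles: int) -> float:
    """Time the command/address lines get to settle, in nanoseconds."""
    # DDR transfers data twice per clock, so the memory clock is half the MT/s rating.
    clock_mhz = transfer_rate_mts / 2
    cycle_ns = 1000 / clock_mhz
    return cycle_ns * command_rate_cycles


for cycles in (1, 2):
    print(f"{cycles}T at DDR4-3200: {command_rate_ns(3200, cycles):.3f} ns")
```

At DDR4-3200 one clock is 0.625 ns, so moving from 1T to 2T doubles the settling time to 1.25 ns. That costs a little performance but gives the memory controller more margin.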
 
Ok, I'll try setting it to a 2T command rate. Hopefully this either fixes things or at least helps narrow things down. Sadly I can't find a way to trigger this BSOD artificially so all I can do is wait.

I'll go through the prime95 testing later today as well and I guess I'll just have to keep monitoring for any instability, fingers crossed.

Thank you. If I encounter any more errors or BSODs I'll come back with any new logs.
 
I have seen sleep/wake bugs that would normally take a long time to reproduce, but I found one that would crash after 7 sleep/wake cycles. It only took maybe 10 minutes if you focused on it.
 
Hello again. I wasn't sure if I should keep using this thread or if I should make a new post in the Windows 11 section but I guess I'll try here first.

So I decided to do a clean install of Windows 11 on a brand new NVMe SSD. I thought it would help with my previous issues (and maybe it did, I have not experienced any memory related BSODs as of yet, fingers crossed) but just my luck I'm running into new problems :(


Here are my minidumps:
https://files.catbox.moe/tmxezx.7z

Two different errors but they both seem to be pointing towards the Nvidia drivers. I reinstalled them twice already, the first time it happened and again today. Both times it seemed to have triggered when I was interacting with the Windows 11 UI. The first BSOD happened when I was messing around with the Game Bar overlay, and the second happened when I was playing a game: I pressed the mute button and when the volume bar popped up my game froze. I could still interact with Windows for a few minutes but it was very slow and buggy and eventually I got the blue screen.

I ran sfc and DISM and they found no issues. I ran chkdsk on my new SSD and it did find some issues though... The repairs were successful but it did delete two files, msvcp40.dll and another msvcp dll; I had to reinstall the C++ packages to get them back. CrystalDiskInfo says my drive is perfect with 100% health so I hope those two things were just flukes and not another headache I'll have to worry about... Anyway I'm not sure if that's related but I thought I'd mention it.

Any ideas what the issue here might be? I appreciate any help.
 
download autoruns64 and use it to remove this driver:

C:\Program Files (x86)\EVGA\Kernel\driver-x64.sys Mon Jul 20 04:45:05 2020

reboot and see if you still have a problem with graphics subsystem.

Notes:
  • symbols not available for the windows insider build
  • 3 modified windows core files
You should run a malware scan, then run cmd.exe as an admin and run
dism.exe /online /cleanup-image /restorehealth
to fix the modified files.
 
Huh. That's Precision X1 again. That's the second time you've found it in my crash dumps so I guess this is a good sign that I should get rid of it.

I'll follow your suggestions and see how things go. Thank you.
 
