Question System BSODs and crashes


AspV

Hi, I sincerely need help with the constant crashes and BSODs I have been having after upgrading my build's mobo, CPU, PSU and RAM.

Since the build went live, I have been experiencing the following BSODs:
  • Kmode exception not handled (most common one)
  • Driver IRQL not less or equal (second most common one)
  • *latest one: Kernel security check failure
and occasionally these as well, mostly during startup:
  • Critical process died
  • Unexpected Kernel mode trap
  • System service exception
I have been trying to isolate where the issue(s) might be, and these are the following steps I have gone through so far:
  • Reseated each hardware component.
  • Checked that bios is up to date.
  • Removed two of the 4 ram modules, to see if there were compatibility issues between modules.
  • memtest86 on all 4 ram modules (Passed test)
  • Removed gpu drivers with DDU and reinstalled newest.
  • Used the "Reset this pc" option in windows 10 while keeping files.
  • In cmd prompt, did DISM.exe /Online /Cleanup-image /Restorehealth
  • In cmd prompt, did sfc /scannow
  • In cmd prompt did chkdsk and chkdsk /r until completion. (Seemed to have restored some partitioned parts on the C: Drive).
  • Used whocrashed program to check dump files and removed a program that had a .sys file that was associated with a "Kmode exception not handled" BSOD.
  • Removed a bit of seemingly extra thermal paste on cpu.
  • Switched the present ram modules, for two other ram modules and slotted them in a2 and b2 dimms, as advised by mobo manual.
  • Switched ram frequency to 6000mhz in bios. Reverted to auto, as problems persisted.
  • Tried other games/programs, namely Subnautica worked flawlessly until getting a memory related error.
  • Restarted the bios update procedure and flashed bios following some advice from another thread but still getting crashes.
  • Re-launched system from a clean windows install.
  • Removed Nvidia Drivers in safe mode using DDU.
  • *(last) Opened up case again to reseat cpu, switched the two ram modules for the two others and slotted them in a2 & b2, and removed the wifi adapter that came with the motherboard. Also set xmp setting in bios for ram.
Additional Info:

- I tried to launch benchmarks in 3D Mark, the tests begin to run and then stop and crash while the demo is presenting, ending in a saturated pixelated stopped frame of the demo.

- Also, I tried to launch Overwatch game on battlenet to see if it would launch , and the program launches a black screen, at the same time a window pops up with the following message: " An external program has caused Overwatch to crash. Please add Overwatch to your anti-virus exception list and close any streaming, recording or overlay software." BLZOWCLI00000006.

- And, while using chrome, I randomly got the status_access_violation a few times.

Latest dumpfiles:

https://we.tl/t-gtPuIATpQo
*most recent one:
https://we.tl/t-NpIx711hJW
(mostly IRQL error now).

*Here is a recent crash report:
Source
Subnautica.exe

Summary
Stopped working

Date
‎12/‎5/‎2022 9:48 AM

Status
Report sent

Description
Faulting Application Path: C:\Program Files (x86)\Steam\steamapps\common\Subnautica\Subnautica.exe

Problem signature
Problem Event Name: BEX64
Application Name: Subnautica.exe
Application Version: 2019.2.17.441
Application Timestamp: 5ed2782e
Fault Module Name: StackHash_fc3f
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 00000000
Exception Offset: PCH_4E_FROM_ntdll+0x000000000009DC14
Exception Code: c0000005
Exception Data: 0000000000000008
OS Version: 10.0.19045.2.0.0.768.101
Locale ID: 1033
Additional Information 1: fc3f
Additional Information 2: fc3f37b96f365cc38a68eb46be07153d
Additional Information 3: 2020
Additional Information 4: 2020226c285db419bf84b223def456bc

Extra information about the problem
Bucket ID: 23dd974a79173b3c3723213b52185597 (1667212825721329047)
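
For anyone reading a report like the one above, here is a minimal sketch of pulling the fields into a dictionary and decoding the exception code. It assumes the "Key: value" layout shown above; `REPORT`, `KNOWN_CODES` and `parse_signature` are hypothetical names made up for this example. c0000005 is the standard access-violation status, the same error class as the status_access_violation seen in Chrome.

```python
# Field names and values copied from the crash report above (subset only).
REPORT = """\
Problem Event Name: BEX64
Application Name: Subnautica.exe
Fault Module Name: StackHash_fc3f
Exception Code: c0000005
"""

# Small lookup of well-known NTSTATUS values; extend as needed (illustrative).
KNOWN_CODES = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def parse_signature(text):
    """Split 'Key: value' lines into a dict, splitting on the first colon only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


fields = parse_signature(REPORT)
code = int(fields["Exception Code"], 16)  # the report prints the code in hex
summary = f"{fields['Application Name']}: 0x{code:08X} {KNOWN_CODES.get(code, 'unknown')}"
print(summary)
```

An access violation on its own does not say which part is at fault; it only confirms that the process touched memory it should not have.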

Current build:
Motherboard: ROG Strix Z690-F Gaming WiFi
CPU: I9-13900KF
RAM: Kingston DDR5 6000Mhz 16gb (x2) *(a2 & b2)
GPU: RTX 3090
PSU: Seasonic 1300w G
SSds: Samsung SSD 840 evo 1tb & 960 PRO 1tb

Any help would be extremely appreciated, something I didn't understand from the dumpfiles maybe ? In any case please help if possible, thank you.

*updates
 
I would ask on the Asus forums, as the ram is on their list, right?
I think it's at the top of the list - https://rog.asus.com/motherboards/rog-strix/rog-strix-z690-f-gaming-wifi-model/helpdesk_qvl_memory/

So they tested it, it should work.
 
Maybe that could be a conflict with the mobo if it doesnt support lower speeds than 4400mhz.

According to Asus, the lowest speed your MoBo supports is 4800 MHz. And if you do have one of the earlier DDR5 RAM sticks that were made to run at 4000 MHz, then there are compatibility issues.

Though, according to wiki, the lowest JEDEC standard for DDR5 sits at 4400 MHz. So, I have no idea how your RAM can even sit at 4000 MHz. Yours looks like an engineering sample or similar in this case. 🤔
DDR5 wiki: https://en.wikipedia.org/wiki/DDR5_SDRAM

I either need new set of ram, or need to find the correct timing/values in bios for them.

If you have any ideas about what I could modify in bios settings for the ram, any input is welcomed.

You might get lucky with manual adjustment of memory timings, to make your RAM stable enough. But going with new RAM could be the only fix, in the end.

As far as memory timings go, and what to do with those; you have two options:
  1. Tighten the timings.
  2. Loosen the timings.

Tightening the timings is best done when you have stable RAM and you want to squeeze every bit of performance out of the RAM. Loosening the timings is best done when default XMP doesn't hold and you need to get to the frequency target. (E.g. when XMP timings are 40-40-40-80, then loosening would mean 40-42-42-86. Tightening would be 38-38-38-78.)
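
To put those numbers in perspective, here is a rough sketch that converts a timing from clock cycles into nanoseconds. It assumes a 6000 MT/s kit (the rated speed of the RAM in this thread) and uses the usual DDR relation that the memory clock runs at half the data rate; `cycles_to_ns` is a made-up helper name.

```python
def cycles_to_ns(cycles: int, data_rate_mts: int) -> float:
    """Convert a timing in clock cycles to nanoseconds.

    DDR transfers data twice per clock, so the clock period in ns
    is 2000 / data rate (in MT/s).
    """
    return cycles * 2000 / data_rate_mts


DATA_RATE = 6000  # MT/s, assumed from the kit's rated speed

# First two values (CL and tRCD) of the example timing sets quoted above.
for label, cl, trcd in [("XMP", 40, 40), ("loosened", 40, 42), ("tightened", 38, 38)]:
    print(f"{label:9s} CL {cycles_to_ns(cl, DATA_RATE):.2f} ns, "
          f"tRCD {cycles_to_ns(trcd, DATA_RATE):.2f} ns")
```

Going from 40 to 42 cycles adds well under a nanosecond per access, which is why loosening is a cheap thing to try when stability is the goal.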

Then there are frequency increase and voltage adjustments as well, that go along with the manual RAM OC,
further reading: https://www.tomshardware.com/how-to/overclock-ddr5-ram

So, if you have willpower and time, to put into manual RAM OC (RAM tweaking), you can put in the effort for it. Though, keep in mind that it all could be in vain and if you'd rather not waste your effort in doing all this, then going with new RAM set would be better.

I tried Unigine superposition tests:

Your GPU seems to be running fine, since it was able to complete Superposition. And it wasn't bottom of the barrel either (compared to the submitted results),
link: https://benchmark.unigine.com/leaderboards/superposition/1.x/1080p-extreme/single-gpu

Just put 3090 into the search bar and select Nvidia RTX 3090 to view results. Out of all submitted 3090 results in 1080p Extreme, yours is around page 8. (Then again, your GPU isn't OC'd either.) So, I think we can assume your GPU is not the culprit in all of this.

As for why 3D Mark doesn't want to work, well, perhaps your RAM is interfering with it. 🤔
 

Thanks for your help.

- Regarding the ram I am not sure. I read that when 4 sticks are inserted, some users have reported the frequency listed as 4000mhz in bios.
Interestingly, in that system in the end I had only two sticks in a2 & b2, and in auto/JEDEC the bios would intermittently list the frequency as 4800mhz, then 4000mhz after a reboot, and then back to 4800mhz etc.
Very odd behaviour to say the least.

- Thank you for taking the time to give me some memory timing basics. I would like to use it to OC at some point, but as you say, It would be best to start from a healthier base with sticks that actually function properly.

- Yes I was quite happy with the benchmarking results, which makes it even more frustrating because when the system actually works it feels great, and then it crashes or bsods.

Today I got frustrated by the 72+ hours of non-stop troubleshooting. I took the z690-F, the i9-13900KF and the 4 sticks of ddr5 16gb and went to the component store where I had gotten the parts and explained my issues and the developments.
We sort of concluded that since memtests passed, and the cpu diags passed, and based on the technician's experience as well, the issue just had to be the motherboard. So I left them the z690-F to RMA, and took a z690 Torpedo (msi) and an M.2 ssd to make a fresh install of windows, to definitively rule out the current ssd I use for the OS, just in case.

And here I am, with the updated bios on the new mobo, on a clean install of windows that has updated, on a new drive, with just 1 set of 2 sticks of ram that are QVL for that mobo (same ones) in a1 and b2 dimm slots.
And guess what, I get the exact same crashes and bsods as with the old system with the z690-F.
I am cursed, that's just it I guess. I mean, could it be that the two separate sets of ram are bad ? The probability would be insane. That leaves the cpu, and the ram. And the psu ???

I am about to run some memtest on these two sticks. Maybe the downclocking from 4 sticks corrupted the first test I did I don't know.
 
I mean, could it be that the two separate sets of ram are bad ?

Not bad, per se, but not fully optimized is my guess.

Since you only switched out MoBo, while keeping the same CPU and RAM, and IF the issue is with RAM, then you'd also have same issue with new MoBo.

It could come down between different RAM OEMs. Samsung, Micron and SK Hynix. Your Kingston DIMMs are SK Hynix, how about trying Samsung (e.g G.Skill)? Since that would be biggest change in RAM and most likely give you the biggest difference as well.



A bit further reading between the 3 RAM OEMs,
link: https://www.eetasia.com/comparing-ddr5-memory-from-micron-samsung-sk-hynix/
 

Yes I really wonder what the issue is by now. Regarding the G.Skill ram, it could be a solve, although the technician I spoke with yesterday from the component store mentioned a system he is building for a client with a z790 board and Gskill ddr5 that is currently failing memtest with tons of errors everywhere.
In the meantime, I’m running memtest on the Kingston sticks right now and there are still no errors just like with my initial testing a few days ago.

ddr5 doesn't seem ready yet. Half baked, it shouldn't drop speeds randomly on restarts, and it has problems running 4 sticks.

Honestly, I wish I had a ddr4 compatible board at this point. But then again, there seem to be a lot of reports of ddr5 ram and boards working extremely well, so it's all very confusing.
 
mentioned a system he is building for a client with a z790 board and Gskill ddr5 that is currently failing memtest with tons of errors everywhere.
But then again, there seems to be a lot of reports of ddr5 ram and boards working extremely well, so its all very confusing.

What you mostly see on the net are failures, since people love to post failures. But you rarely see success stories. This can skew your outlook regarding PC hardware (or any item, for that matter).

Sometimes, even i get that illusion, that none of the PC hardware ever works, since what i see here in TH forums, on daily basis, are only failures. (Since people post their issues here to get help.) But at that moment, i need to remind myself that TH forums is the place where failures are posted, while the bulk of PCs out there, work just fine, and i shouldn't worry about this on a grand scale. This gives me willpower to continue on in here, to help those in need. :)

I don't know if you've followed PC news, but lately, the Nvidia PCI-E 16-pin adapter (aka 12VHPWR), for RTX 40-series, has melted during usage. Internet blew up when first reports came in, since it melted with very expensive RTX 4090.

At the beginning of the saga, it felt like "all" Nvidia adapters shipped with RTX 4090 GPUs are at fault and do melt at some point. 😱 It took some time for press (GamersNexus) and OEM (Nvidia) to get to the bottom of this and find out why the new 16-pin adapter melted.
Conclusion:

View: https://www.youtube.com/watch?v=_QmKYJzJhB4


Out of all RTX 4090 units sold, ~125000 units, only 50 produced the issue. That's 0.04% of all RTX 4090 GPUs sold. So, the failure rate is actually very low. And most of them (if not all) were found to be user error (not plugging in the adapter fully).
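
The rate follows directly from the two counts quoted above. A quick sketch of the arithmetic (the figures are the ones cited in this post, not independently verified, and the variable names are only for illustration):

```python
units_sold = 125_000  # approximate RTX 4090 units sold, as quoted above
failures = 50         # reported melted-adapter cases, as quoted above

rate_percent = failures / units_sold * 100
print(f"{rate_percent:.2f}% of units affected")  # 50 / 125000 = 0.0004 = 0.04%
```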

So...yeah. Need to keep in mind that there are far more success stories than failures. But due to the human nature, you're bound to see failures far more often than success stories.
 
People with no problems generally don't come to help forums to say everything is fine... unless they answer questions on them regularly. It can make you think everything is bad... that and good news doesn't get clicks.

Funny, I watched this earlier and thought of this thread

I have one argument with that video. Okay, maybe on Intel you can mix sticks that aren't from the same set and it works fine, but on AMD if you mix sticks that aren't tested with each other, you're more likely to get a BSOD. Mixing sets is never a good idea.

btw, 4000 is the base speed of DDR5, he hits it in that video, as well as 4800.
 

FINAL UPDATE !

First of all, thanks a lot to both of you for your help in troubleshooting the issues I had with my system. I am very thankful for your help.

This morning I went back to the component store, to refund the z690 Torpedo, since the mobo wasn't the issue, and while speaking with the technician, we assumed that the ram was the problem. So I left them the two kits of kingston ram for testing, and took a set of T-Force ddr5 2x16gb 6000Mhz, because I needed ram urgently for the system and the technician was confident that that ram worked well on other pc's he worked on. And I grabbed the z690-F back that I had left for RMA.
I set up the new system with the z690-F and the T-force ram, on a clean windows install etc. And the second that the computer booted windows after installation, I get a kmode_exception_not_handled BSOD. Needless to say I was about to lose it.

At this point, I had the phone number of the technician, and showed him the new errors, and he basically said, try it with xmp off, which I did, and I got the same BSODs. That's when he said, well come back to the store and we will exchange your i9-13900KF for a new one.

And there you go... a little while later, after rebuilding that setup for the umpteenth time I finally (FINGERS CROSSED) have a stable system. No BSODs yet, and 3DMARK was able to proceed to completion, which is the first time since I changed my system's parts.

So, to anyone interested, it very well seems that the issue was a faulty cpu since the beginning.

Thanks again Aeacus and Colif for your patience and help !
 
it very well seems that the issue was a faulty cpu since the beginning.

CPUs are very reliable and it is extremely rare to see a CPU work (boot to OS) but still not work right. Now, if the CPU had been OC'd, that would've given us more suspicion about the CPU. But running stock clocks and still getting stability issues... 🤔 - I think this is the 1st time I've seen it.

RAM issues are far more common and hence why i (we) focused more on the RAM.

That's when he said, well come back to the store and we will exchange your i9-13900KF for a new one.

Process of elimination eventually led to that. :)
We tested your GPU and found it to work fine. You got MoBo replacement and issues remain. Then you tried 2nd RAM (T-Force) and still no dice. Only thing to swap was the CPU. And it's nice to hear that you got your issues solved. 👍
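
The elimination described above can be written down as simple bookkeeping. This is only a sketch: the list is limited to the parts named in this reply, and `remaining_suspects` is a hypothetical helper, not a diagnostic tool.

```python
def remaining_suspects(suspects, cleared):
    """Return the suspects that have not been cleared, keeping their order."""
    return [part for part in suspects if part not in cleared]


# Parts named in the reply above.
suspects = ["GPU", "motherboard", "RAM", "CPU"]

# Cleared because a test passed (GPU) or a swap did not stop the crashes.
cleared = {"GPU", "motherboard", "RAM"}

remaining = remaining_suspects(suspects, cleared)
print(remaining)
```

Swapping one part at a time is what makes this work; changing several parts at once would leave more than one suspect standing.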

Thanks again Aeacus and Colif for your patience and help !

You're welcome. :)
 