Question: I've had my PC since last November, but I've had 30-40 BSODs in the last few months, all for different reasons?
Oct 16, 2023
23
0
10
EDIT (17/10/23): I think this issue may be becoming semi-reproducible. After playing a game for a bit, the crash usually happens within 10-20 minutes of closing it. Could it be temperature related? Would that explain the inconsistency of time/crash code?

EDIT (19/10/23): I have now reinstalled Windows and run Memtest on both sticks (at the same time) with 0 errors. I will carry out individual tests when I have time.

Man.. where do I start?

The first crash, I suppose? A quick Google of the code turned up generic bad-driver advice. Right, so.. drivers. So I had a look. NVIDIA GPU drivers up to date, Windows up to date, checked iCUE and updated all the firmware and hardware on there. Still crashing. During games, during Firefox usage, and it often crashes on the sign-in page before even signing in if I leave it for a while!

I have pored over loads of minidumps with 0 prior knowledge, and I've learned a lot about what sort of things to look for. However, I am yet to see any similarities between the dump files. ntkrnlmp.exe is a frequent visitor in my minidumps. No idea what it is or what it means.

I should note that I have a custom-built PC, and it definitely isn't my first. I was careful with the installation and have treated the PC well since. It is a small form factor mini-ITX PC, so it's pretty damn small and compact, and I've noticed the idle temperatures creeping up over the months. I'm wondering if this could be a factor. Perhaps it just needs a thorough clean (which it's had before, again, carefully).

Many other unresolved cases of this type have pointed towards faulty RAM, or maybe the PSU, and I am yet to test those. I have also so far NOT completed a full Windows install and reset, though I am tempted to completely blitz and sterilize the PC as much as possible and reinstall, maybe even to Windows 11 if that would help.

I'd like to upload the 5 most recent minidumps but I'm not sure how on here, unless I just use a third-party host and add a link to this.

LINK to MINIDUMPS: https://www.mediafire.com/folder/qdxufhsve929e/Minidumps (FOUR in one day!)

LINK TO MINIDUMPS (19/10/23): https://www.mediafire.com/folder/xdhfvyendfbjx/Minidumps

I do have more specific information, but that's mostly from the content of the dumps, so I will let you draw your own conclusions so as not to bias your responses. Feel free to ask any questions after! If anyone could provide their help and expertise it would mean a lot. I really am at the end of my knowledge and capabilities with this one.

Specs:
CM NR200P case
i5-13600k (Kraken X53 240mm cooler)
ASRock Intel Z690M-ITX/ax mini-ITX
Corsair SF750 PSU
MSI RTX 3080
2 x 16GB Corsair Vengeance RS RGB 3200 RAM
1x Samsung 980 1TB PCIe Gen3 M.2 SSD (Windows on C partition, games on the rest of it).
2x SanDisk SSD PLUS 1TB
1x SanDisk Ultra II 500GB

Many thanks.
 
What are these reasons that make a very basic part of PC ownership a last resort? It would help if we understood the complete situation.
Well, living and working away from home with poor internet is definitely a factor. Not sure how my personal situation is relevant though, this is clearly a technical problem! Yes, I understand a Windows reset is a valuable troubleshooting option, but I have to admit I really want to find out what the cause is before I do that.
 

DSzymborski

Curmudgeon Pursuivant
Moderator
Knowing *why* a crucial corner is being cut is crucial to understanding the underlying problems that stem from that shortcut. The more we understand, the better the advice given. There's been a lot of advice given in this thread that's very good, but all of it is meaningless if the basic stuff isn't done first.

The fact is, this is like going to the ER with leg pain and a knife sticking out of your leg and wanting them to rule out all possible other causes of leg pain before considering it might be the stab wound. You skipped something crucial that can cause bad things to happen. Bad things are happening.

There could be something else going on. Maybe that stabbed leg also has a torn ACL from an unrelated injury. But the first thing you have to do before checking that out is to remove the dang knife.
 
Well, I can definitely say it's not Windows. Reset it today, and it crashed on setup 🤣 IRQL_NOT_LESS_OR_EQUAL.

Also crashed earlier today after changing RAM frequency to 3200 as suggested before.
 

kira-faye

Notable
Oct 11, 2023
387
167
890
Strongly recommend you bump RAM voltage up by about 0.05 V from whatever your sticks are rated at and reinstall Windows fully. Ideally boot from a USB installer and wipe the drive before install, rather than using the reset from within Windows.
 
Upped the voltage and reinstalled Windows properly, still the same. Got another USB stick today so going to load it up with Memtest and give it a go.

After 2 Windows reinstalls, I would suspect it's a hardware issue now.
 
As long as you installed all Windows Updates and all necessary drivers, if you still have the issue after a reinstall then it is a hardware problem.
Is there a way to check what drivers I need exactly? I've gone through all my components and downloaded everything I might need driver-wise from manufacturer websites, and updated Windows. Not sure what else there might be?
 
Open Device Manager. Do any entries have a yellow triangle with a black exclamation mark in it next to them?
None.

I also ran Memtest with both sticks in, came back with a pass, 0 errors. I will run it with each stick separately tomorrow, but I'm not sure whether to keep them in the slots that they are in now or test each one in the first slot?
 
Most recent minidumps. Could someone take a look at these for me please?

I assume using driver verifier is the next course of action?
 

ubuysa

Distinguished
I don't think Driver Verifier is an appropriate tool here; I think this is more likely to be a hardware problem.

With Memtest86 you really want to run it twice, one after the other, so you do 8 iterations of the 13 different tests, to give it the best chance of detecting RAM errors. Even then it can never detect 100% of errors, which is why I'm an advocate of removing one RAM stick and running on just one for a time. Then swap sticks and run on just the other for a time. That will show whether one is flaky or not.
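As a rough sketch of the test plan described above (not an official MemTest86 procedure), the runs could be laid out like this. The figure of 4 passes per run is an assumption inferred from "run it twice ... 8 iterations", and the stick labels and `build_plan` name are purely illustrative.

```python
# Sketch of the RAM test plan: both sticks together, then each stick alone.
# PASSES_PER_RUN = 4 is an assumption inferred from "run it twice ... 8 iterations".
PASSES_PER_RUN = 4
RUNS = 2
TESTS_PER_PASS = 13  # "the 13 different tests"


def build_plan(sticks):
    """Return (configuration, passes, individual test executions) tuples."""
    configs = [tuple(sticks)] + [(s,) for s in sticks]
    passes = PASSES_PER_RUN * RUNS
    return [(cfg, passes, passes * TESTS_PER_PASS) for cfg in configs]


plan = build_plan(["stick A", "stick B"])  # hypothetical labels
for cfg, passes, executions in plan:
    print(f"{' + '.join(cfg)}: {passes} passes, {executions} test executions")
```

Writing it out this way mostly makes the time cost visible: three configurations, each needing two full back-to-back runs.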

These two dumps are interesting however, because they both occurred during a storage drive access. That might give us a different line of enquiry...

The 0xA IRQL_NOT_LESS_OR_EQUAL bugcheck, which is usually a third-party driver memory allocation foul-up, occurs in this dump in a Microsoft function, which strongly suggests a hardware cause. The call stack shows calls to the Microsoft stornvme.sys driver, which of course manages access to an NVMe storage drive...
Code:
5: kd> !dpx
Start memory scan  : 0xffffc08b3806f638 ($csp)
End memory scan    : 0xffffc08b38070000 (Kernel Stack Base)

               rsp : 0xffffc08b3806f638 : 0xfffff8041b610e29 : nt!KiBugCheckDispatch+0x69
0xffffc08b3806f638 : 0xfffff8041b610e29 : nt!KiBugCheckDispatch+0x69
0xffffc08b3806f660 : 0xfffff8041b44c8ed : nt!KiRetireDpcList+0x2fd
0xffffc08b3806f778 : 0xfffff8041b60c9e3 : nt!KiPageFault+0x463
0xffffc08b3806f780 : 0xffffe781ea7c0180 :  Trap @ ffffc08b3806f780
0xffffc08b3806f790 : 0xffffc08b3806f8a0 : 0xffffc08b3806f7e8 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
0xffffc08b3806f798 : 0xfffff8041b44d650 : nt!KiExecuteAllDpcs+0x460
0xffffc08b3806f7e8 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
0xffffc08b3806f8a0 : 0xffffc08b3806f7e8 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
0xffffc08b3806f8e8 : 0xfffff8041b44c8ed : nt!KiRetireDpcList+0x2fd
0xffffc08b3806f9d8 : 0xfffff8041b52a8aa : nt!HalPerformEndOfInterrupt+0x1a
0xffffc08b3806f9f0 : 0xfffff8041b200000 : "nt!VrpRegistryString <PERF> (nt+0x0)"
0xffffc08b3806fa08 : 0xfffff8041b5feda5 : nt!KiInterruptDispatch+0x85
0xffffc08b3806fa28 : 0xfffff8041b443b28 : nt!PoIdle+0x3a8
0xffffc08b3806fa60 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
0xffffc08b3806fb98 : 0xfffff8041b60166e : nt!KiIdleLoop+0x9e
The Windows stornvme.sys driver is not at fault of course, but some NVMe drives have a third-party driver between stornvme.sys and the NVMe drive itself and that could be at fault. On the other hand, it could be the NVMe drive that's a bit flaky.
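For anyone following along, here is a minimal sketch of how the modules in a `!dpx` listing like the one above could be tallied. It assumes the text was copied out of WinDbg and that symbols appear as `module!Function`; the `count_modules` name and the shortened sample are illustrative only.

```python
import re
from collections import Counter

# Matches "module!Function" symbols such as stornvme!NVMeCompletionDpcRoutine.
SYMBOL_RE = re.compile(r"\b([A-Za-z_][\w.]*)!([A-Za-z_]\w*)")


def count_modules(dpx_text):
    """Count how many stack lines reference each module (one count per line)."""
    counts = Counter()
    for line in dpx_text.splitlines():
        match = SYMBOL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
0xffffc08b3806f660 : 0xfffff8041b44c8ed : nt!KiRetireDpcList+0x2fd
0xffffc08b3806f778 : 0xfffff8041b60c9e3 : nt!KiPageFault+0x463
0xffffc08b3806f7e8 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
0xffffc08b3806f8a0 : 0xffffc08b3806f7e8 : 0xfffff8041f0f30f0 : stornvme!NVMeCompletionDpcRoutine
"""
counts = count_modules(sample)
print(counts.most_common())
```

A module that keeps turning up across several dumps is worth a closer look, though a count alone doesn't say whether the driver or the hardware behind it is to blame.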

The 0x21 QUOTA_UNDERFLOW bugcheck indicates a drive storage quota issue. These can be caused by bad third-party storage drivers, third-party anti-malware products, or hardware issues with the storage drive. The problem in this dump happens with the nt!PspReturnQuota function, where unused quota is returned to the pool...
Code:
5: kd> knL
 # Child-SP          RetAddr               Call Site
00 fffff080`db286ef8 fffff801`08a23cd7     nt!KeBugCheckEx
01 fffff080`db286f00 fffff801`08816e24     nt!PspReturnQuota+0x20d457
02 fffff080`db286f60 fffff801`08fb80b9     nt!ExFreeHeapPool+0x464
03 fffff080`db287040 fffff801`089f136a     nt!ExFreePool+0x9
04 fffff080`db287070 fffff801`08817a73     nt!IopProcessBufferedIoCompletion+0x7a
05 fffff080`db2870b0 fffff801`08854d26     nt!IopCompleteRequest+0xd3
06 fffff080`db287180 fffff801`088544f7     nt!IopfCompleteRequest+0x816
07 fffff080`db287260 fffff801`90246883     nt!IofCompleteRequest+0x17
08 fffff080`db287290 ffffc184`00000000     cpuz157_x64+0x6883
09 fffff080`db287298 00000000`00000000     0xffffc184`00000000
However, just prior to that function is another Windows function (nt!ExFreeHeapPool) freeing heap memory (a heap is a memory allocation unit), and that points us back at RAM of course.

There is a third-party driver referenced here too (cpuz157_x64.sys) and they are always suspect, but I think that because both BSODs reference a storage device in different ways, that might suggest a potential NVMe drive issue?
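Spotting third-party frames in a `kn`-style stack like the one above can be sketched roughly as follows. The allow-list of Microsoft modules is a hand-picked assumption limited to modules mentioned in this thread, and `third_party_frames` is a hypothetical helper, not a WinDbg feature.

```python
import re

# Assumption: a tiny, hand-picked allow-list of Microsoft modules seen in this
# thread; a real check would inspect each driver's vendor details instead.
MICROSOFT_MODULES = {"nt", "stornvme", "ntfs", "fltmgr"}

# A "kn"-style frame: index, Child-SP, RetAddr, then the call site.
FRAME_RE = re.compile(r"^\s*([0-9a-f]{2})\s+\S+\s+\S+\s+(\S+)\s*$", re.IGNORECASE)


def third_party_frames(stack_text):
    """Return (frame_index, module) for frames outside the allow-list."""
    hits = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue  # header line or anything that is not a frame
        index, call_site = match.groups()
        if call_site.startswith("0x"):
            continue  # raw address with no symbol, nothing to attribute
        module = re.split(r"[!+]", call_site, maxsplit=1)[0]
        if module.lower() not in MICROSOFT_MODULES:
            hits.append((index, module))
    return hits


sample = """\
 # Child-SP          RetAddr               Call Site
06 fffff080`db287180 fffff801`088544f7     nt!IopfCompleteRequest+0x816
07 fffff080`db287260 fffff801`90246883     nt!IofCompleteRequest+0x17
08 fffff080`db287290 ffffc184`00000000     cpuz157_x64+0x6883
09 fffff080`db287298 00000000`00000000     0xffffc184`00000000
"""
flagged = third_party_frames(sample)
print(flagged)
```

On this sample the only frame flagged is the cpuz157_x64 one, which matches what was picked out by eye.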

I've seen many M.2 drives that were poorly seated; removing and re-seating them solved all sorts of niggly problems. Try removing and firmly re-seating the NVMe drive(s). If that doesn't help, then download Samsung Magician and run a full diagnostic on the 980 NVMe drive. Also check for firmware and/or driver updates with Magician.

None of this excludes RAM from being the potential root cause, and we can never be fully certain that this isn't some sort of strange software issue until you either do a clean install of Windows or, alternatively, run Windows in Safe Mode for several hours.
 
Really, really fantastic input, thank you! I had seen some crashes to do with storage when I Googled some older crashes, particularly the ones that referenced VolMgr. I ran a Magician sweep and it came back clean, though I haven't reseated the drive yet. I shall give that a try.

Does it matter that Windows is installed on the M.2 drive? Would that increase the likelihood of foul play between Windows and drivers/the drive itself?
 

kira-faye

Notable
No, not really. Windows should always be installed on your fastest drive, so nothing wrong there. If the drive is bad, it's bad, and you're going to have problems. Yes, you might see them more frequently if it's your OS drive, but it needs to be fixed regardless.
 
Thank you. Will try to reinstall drivers for the M.2 drive and try for an RMA. Could temperatures cause instability? Either way, I will reseat the drive and see how it goes.
 
Been away for a few days, came back to a new crash. New error this time: SYSTEM_SERVICE_EXCEPTION (3b).


PROCESS_NAME: firefox.exe

STACK_TEXT:
fffffa02`e66ef000 fffff807`6087a08c : ffffcd8b`5b4e1000 00000266`b6585cc8 fffffa02`00000000 00000000`00000338 : nt!MmCopyToCachedPage+0x22f
fffffa02`e66ef0d0 fffff807`608c734a : ffffac0d`6d974da0 00000266`b6585cc8 fffffa02`e66ef2c8 00000000`00000000 : nt!CcMapAndCopyInToCache+0x41c
fffffa02`e66ef270 fffff807`626fc8ac : fffffa02`e66ef3e0 00000000`00001000 00000000`00021338 00000000`00001000 : nt!CcCopyWriteEx+0xea
fffffa02`e66ef2f0 fffff807`5c4b783c : 00000000`00000000 fffffa02`e66ef708 fffffa02`e66ef6c8 00000266`b6585000 : Ntfs!NtfsCopyWriteA+0x5fc
fffffa02`e66ef620 fffff807`5c4b464a : fffffa02`e66ef730 fffffa02`e66ef6c8 ffffac0d`6e368b70 ffffac0d`6e368a70 : FLTMGR!FltpPerformFastIoCall+0x16c
fffffa02`e66ef680 fffff807`5c4e9595 : fffffa02`e66f0000 fffffa02`e66e9000 fffffa02`e66ef800 fffff807`60a04fa6 : FLTMGR!FltpPassThroughFastIo+0x10a
fffffa02`e66ef700 fffff807`60bcdfd3 : fffffa02`e66ef801 00000000`00000000 000000e7`115bdda0 00000000`00000000 : FLTMGR!FltpFastIoWrite+0x165
fffffa02`e66ef7b0 fffff807`60c68066 : ffffac0d`8998fc90 000000e7`115bdda0 00000000`00000000 000000e7`115bdda0 : nt!IopWriteFile+0x137
fffffa02`e66ef8b0 fffff807`60a105f5 : 00000000`00000000 00000000`00000000 00000000`00000000 000000e7`115bdda0 : nt!NtWriteFile+0x996
fffffa02`e66ef9d0 00007ff8`e66acf54 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25
000000e7`115bdcf8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ff8`e66acf54

SYMBOL_NAME: nt!MmCopyToCachedPage+22f

MODULE_NAME: nt

IMAGE_VERSION: 10.0.19041.3448

STACK_COMMAND: .cxr 0xfffffa02e66ee600 ; kb

IMAGE_NAME: ntkrnlmp.exe

BUCKET_ID_FUNC_OFFSET: 22f

FAILURE_BUCKET_ID: 0x3B_C000001D_nt!MmCopyToCachedPage

OS_VERSION: 10.0.19041.1

BUILDLAB_STR: vb_release

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {eece179b-5d7e-3c22-9fb5-b54b5caf4ef5}

Followup: MachineOwner

FLTMGR this time. Had Firefox on the second monitor, had just clicked on it from another program open on the other screen, and it crashed. Is this also storage/drive related?
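To compare dumps like this one side by side, the key fields of the `!analyze -v` text can be pulled out with a small script. This is only a sketch: `summarize` is a hypothetical helper, the bugcheck names are the three seen in this thread, and the two NTSTATUS names are added for illustration. Bucket IDs that don't follow the `0xNN_STATUS_symbol` shape are simply left undecoded.

```python
import re

# Bugcheck names taken from this thread, plus two NTSTATUS values; extend as needed.
BUGCHECKS = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x21: "QUOTA_UNDERFLOW",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
}
EXCEPTIONS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
}

# "KEY: value" lines with an upper-case key and a value on the same line.
FIELD_RE = re.compile(r"^([A-Z_]+):[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def summarize(analyze_text):
    """Pull a few 'KEY: value' fields and decode the failure bucket ID."""
    fields = dict(FIELD_RE.findall(analyze_text))
    summary = {k: fields.get(k) for k in ("PROCESS_NAME", "IMAGE_NAME", "SYMBOL_NAME")}
    bucket = fields.get("FAILURE_BUCKET_ID", "")
    match = re.match(r"0x([0-9A-Fa-f]+)_([0-9A-Fa-f]{8})_(.+)", bucket)
    if match:
        code, status = int(match.group(1), 16), int(match.group(2), 16)
        summary["bugcheck"] = BUGCHECKS.get(code, hex(code))
        summary["exception"] = EXCEPTIONS.get(status, hex(status))
    return summary


sample = """\
PROCESS_NAME: firefox.exe
SYMBOL_NAME: nt!MmCopyToCachedPage+22f
IMAGE_NAME: ntkrnlmp.exe
FAILURE_BUCKET_ID: 0x3B_C000001D_nt!MmCopyToCachedPage
"""
result = summarize(sample)
print(result)
```

Note that ntkrnlmp.exe is the Windows kernel image itself, so seeing it in IMAGE_NAME mostly says the crash surfaced inside kernel code, not which component caused it.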