Question Custom PC Windows 11 BSOD ntoskrnl.exe - DRIVER_IRQL_NOT_LESS_OR_EQUAL - DRIVER_OVERRAN_STACK_BUFFER - SYSTEM_SERVICE_EXCEPTION

Oct 31, 2024
I'm at my wits' end and hoping to find help with an issue on my PC build. My computer has been hitting a BSOD between 2:30 a.m. and 5:00 a.m. every night. The PC, which I use as a combination workstation and PLEX server, runs fine during the day. Here is a brief history of how the issue(s) started and what I did to troubleshoot. I'm not sure whether any of these are related to each other; I just thought it might help.

  • About four weeks ago, both my desktop screens went black. Assuming it was a GPU failure, I purchased a new GPU and installed it, which did not resolve the problem. I reinstalled the old GPU and began pulling RAM sticks (Corsair, 4x 16 GB). Pulling one of the sticks brought back my video signal. I contacted Corsair, received four new replacement RAM sticks, and installed them. Everything worked fine for a couple of days.
  • I then began experiencing BSOD crashes when scanning new media with my PLEX server software. I caught the brief BSOD a few times, and the stop codes were "SYSTEM_SERVICE_EXCEPTION" and "DRIVER_VERIFIER_IOMANAGER_VIOLATION".
  • I spent about a week trying to diagnose the issue. Then, suddenly, one of my internal Seagate 16 TB drives failed. I pulled it and tried to recover the data, with no luck.
  • This week I installed BlueScreenView and monitored the crash points, all of which point to an "ntoskrnl.exe" driver/filename. Screenshot
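
To check whether the crashes really cluster in the overnight window, the dump file timestamps can be tallied by hour of day. This is only a minimal Python sketch: it assumes the dumps sit in the default C:\Windows\Minidump folder (reading it usually needs an elevated prompt) and that each file's modified time is close to the crash time. The function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path

# Assumption: minidumps are in the default Windows location.
MINIDUMP_DIR = Path(r"C:\Windows\Minidump")


def crash_hours(timestamps):
    """Count crashes per hour of day (0-23) from a list of datetimes."""
    return Counter(ts.hour for ts in timestamps)


def dump_timestamps(folder=MINIDUMP_DIR):
    """Return the last-modified time of every .dmp file in the folder."""
    if not folder.is_dir():
        return []
    return sorted(
        datetime.fromtimestamp(p.stat().st_mtime) for p in folder.glob("*.dmp")
    )


if __name__ == "__main__":
    # Print one line per hour that has at least one dump file.
    for hour, count in sorted(crash_hours(dump_timestamps()).items()):
        print(f"{hour:02d}:00  {count} crash(es)")
```

If nearly every dump lands in the same few hours, that points toward something scheduled or idle-related rather than a specific daytime workload.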

Troubleshooting:
  • Updated BIOS
  • Clean uninstall and reinstall GPU drivers and software (2x)
  • Ran "Windows Memory Diagnostic" with no issues found.
  • Installed all available Windows updates.
  • Manually installed all drivers available for the ROG STRIX X570-E GAMING motherboard.
  • Installed the most up-to-date AMD Chipset drivers.
  • Ran "Chkdsk" on all drives.

Recent dump file: Minidump

Any other suggestions or help would be appreciated. Thank you in advance.

System Information:

OS: Windows 11 Pro
System Type: x64-based PC
Processor: AMD Ryzen 9 3900X 12-Core Processor, 4200 Mhz, 12 Core(s), 24 Logical Processor(s)
BIOS Version/Date: American Megatrends Inc. 5013, 3/18/2024
GPU: NVIDIA GeForce RTX 3070 Ti
Motherboard: ROG STRIX X570-E GAMING
Installed Physical Memory (RAM): 64.0 GB
Primary Drive: 2 TB NVMe SSD M.2
Secondary Drive: 4 TB NVMe SSD M.2
Internal 3.5 Drive 1: 6TB
Internal 3.5 Drive 2: 16TB (failed)
Internal 3.5 Drive 3: 14TB
 

ubuysa

Distinguished
Please make the minidumps public ('anyone with the link'). Also, why did you enable Driver Verifier? Since you had a Driver Verifier-triggered BSOD, you almost certainly have a flaky driver. We should be able to see that in the dump.
 
Oct 31, 2024
I made the minidumps accessible, my apologies.

In all honesty, I went down a bad rabbit hole searching for a resolution to this problem. The Driver Verifier was one of the suggested troubleshooting methods I read online or watched a video of. At this point my brain is fried and I don't even recall why I activated the verifier.
 
https://learn.microsoft.com/en-us/sysinternals/downloads/autoruns

Download Microsoft autoruns64.exe, run it, and disable this driver:
pwdrvio.sys Mon Jun 15 18:43:45 2009
Partition Wizard Disk I/O driver
Reboot and see if the problem just goes away.

This is just a guess. I did not see the actual cause of the bugcheck in the minidump, just a null pointer.
The system had been running for 7 hours before the bugcheck.
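
For anyone who wants to scan a longer driver list for similarly old entries, here is a rough Python sketch. It assumes each line has the form shown above (file name followed by a timestamp such as "Mon Jun 15 18:43:45 2009") and uses an arbitrary cutoff year. The function names are hypothetical, and an old build date is only a hint, not proof that the driver is at fault.

```python
from datetime import datetime


def parse_driver_line(line):
    """Split 'name.sys Mon Jun 15 18:43:45 2009' into (name, datetime)."""
    name, stamp = line.strip().split(maxsplit=1)
    return name, datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def old_drivers(lines, cutoff_year=2015):
    """Return (name, year) for drivers built before cutoff_year.

    cutoff_year is an arbitrary threshold chosen for illustration.
    """
    result = []
    for line in lines:
        name, built = parse_driver_line(line)
        if built.year < cutoff_year:
            result.append((name, built.year))
    return result


if __name__ == "__main__":
    print(old_drivers(["pwdrvio.sys Mon Jun 15 18:43:45 2009"]))
```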
 

ubuysa

Distinguished
In addition to the above from @Roland Of Gilead...

The BSOD happened because of an invalid instruction pointer. Here's the failing instruction...
Code:
TRAP_FRAME:  ffffd60f44564830 -- (.trap 0xffffd60f44564830)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000003b65cf9225 rbx=0000000000000000 rcx=0000000000000001
rdx=000000000000108c rsi=0000000000000000 rdi=0000000000000000
rip=0000000000000000 rsp=ffffd60f445649c0 rbp=ffffd60f44564ab0
 r8=8000000000000000  r9=ffffd60f44564ab0 r10=0000fffff8025a56
r11=ffffcb7cd4c00000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up di ng nz na pe nc
00000000`00000000 ??              ???
Resetting default scope
The RIP register is the instruction pointer and zero is not a valid instruction pointer!
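
If you want to run the same check on other dumps, the register lines can be parsed with a few lines of Python. This is just a sketch that assumes the debugger prints registers as name=16-hex-digits, as in the output above; the function names are made up for illustration. Keep the debugger's own caveat in mind: a trap frame may contain zeroed or incorrect values, so treat the result as a hint.

```python
import re

# Matches pairs such as "rip=0000000000000000" or " r8=8000000000000000".
REG_PATTERN = re.compile(r"\b(r[a-z0-9]{1,2})=([0-9a-fA-F]{16})\b")

# A few register lines copied from the trap frame above.
SAMPLE = """\
rax=0000003b65cf9225 rbx=0000000000000000 rcx=0000000000000001
rdx=000000000000108c rsi=0000000000000000 rdi=0000000000000000
rip=0000000000000000 rsp=ffffd60f445649c0 rbp=ffffd60f44564ab0
"""


def parse_registers(text):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_PATTERN.findall(text)}


def null_instruction_pointer(text):
    """True if the text contains a rip register whose value is zero."""
    return parse_registers(text).get("rip") == 0


if __name__ == "__main__":
    print(null_instruction_pointer(SAMPLE))
```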

I also note from the dump that this happened as the processor came out of the idle state. I have seen a small number of CPUs (mostly AMD) become unstable when transitioning from the low-power idle state to the high-power running state. To test for this, go into your BIOS setup and disable C-States for all processors. Most BIOSes have a global C-State enable/disable switch.

With C-States disabled the processors cannot enter a low power state when idle. If this is the cause of your BSODs then disabling C-States will stop the BSODs.

Note that the only downsides of disabling C-States are a slightly hotter CPU at idle, which a decent cooler can handle easily, and a slightly higher idle power draw, which isn't worth worrying about.
 
Oct 31, 2024
As stated above, I performed a CMOS clear and haven't experienced a BSOD for the first time in two weeks. Should I leave everything as is or disable the C-States as well?
 