[SOLVED] BSOD frequently. (solved)

chabongden · May 23, 2024

Hi everyone,

I've been getting multiple BSODs recently. They usually show random errors such as irql_not_less_or_equal or hypervisor error. I ran memtest on each RAM stick individually and every test passed. I also stress-tested the CPU and GPU: no errors, and temperatures stayed low. So the only thing I haven't been able to test is the motherboard (B350M Tomahawk, which I have been using for 3 years).

What I have done:

- Clean-installed the GPU drivers using DDU.
- Reinstalled all chipset drivers and the CPU driver.
- Tested each RAM stick. Initially the sticks were installed in DIMM slots 2 and 4 (crash). I switched to slots 1 and 3 (crash), then put a single stick in slot 2 (crash), then in slot 1 (crash).
- Used BlueScreenView to analyze the dump files. All the BSODs are attributed to ntoskrnl.exe (the Windows kernel), but I'm not sure how to fix that.
- Ran sfc /scannow, no errors.
- I don't overclock my device.
- Reset the CMOS.

At this point, I don't really know what else to test.

I have my dump file in case someone can analyze it.
https://www.mediafire.com/file/2yaxqhz6mtg7lem/Minidump.rar/file

beyondlogic · May 23, 2024

Double-check your power cord.

Also, what are your power supply and GPU?

Make sure the chipset driver and BIOS are both up to date as well.

chabongden · May 23, 2024

The power cord is plugged in all the way. I'm using a Rosewill Capstone 650W PSU, a Zotac RTX 3070 Twin Edge, and a Ryzen 5 3600.
I have installed the latest chipset driver and BIOS as well.

Update:
I just swapped in a new pair of RAM sticks and it still crashed, so I'm positive the RAM is OK.

181951 · May 24, 2024

Almost no information about this bug check has been released. As the name suggests, it indicates a Hyper-V exception.

Rich (BB code):

HYPERVISOR_ERROR (20001)
The hypervisor has encountered a fatal error.
Arguments:
Arg1: 0000000000000011
Arg2: 000000000023221c
Arg3: 0000000000001005
Arg4: ffffe70001405a70

We can see this in the stack. However, we should also pay attention to the NMIs (non-maskable interrupts) that precede it. These interrupts are typically raised when there is a hardware problem, which leads me to suspect a hardware fault in the system. You can try turning off virtualization, but I think there's a bad CPU here.

Rich (BB code):

3: kd> KnL
 # Child-SP          RetAddr               Call Site
00 ffffb780`50287ca8 fffff804`6fd45101     nt!KeBugCheckEx
01 ffffb780`50287cb0 fffff804`6fc6ed58     nt!HvlSkCrashdumpCallbackRoutine+0x81
02 ffffb780`50287cf0 fffff804`6fc25f82     nt!KiProcessNMI+0x261e28
03 ffffb780`50287d30 fffff804`6fc25cee     nt!KxNmiInterrupt+0x82
04 ffffb780`50287e70 fffff804`6fbc692b     nt!KiNmiInterrupt+0x26e
05 ffffc784`86c68220 00000000`00000000     nt!HalpHvCounterQueryCounter+0xb
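
If you want to scan a pasted stack for NMI frames without reading it by eye, here is a minimal Python sketch. It assumes the text is the plain output of the kd `knL` command as shown above; the function names (`parse_frames`, `nmi_frames`) are made up for illustration and are not part of WinDbg or any library. Finding NMI frames only hints at hardware, it does not prove it.

```python
import re

# Stack text copied from the HYPERVISOR_ERROR dump above (kd "knL" output).
STACK = """\
 # Child-SP          RetAddr               Call Site
00 ffffb780`50287ca8 fffff804`6fd45101     nt!KeBugCheckEx
01 ffffb780`50287cb0 fffff804`6fc6ed58     nt!HvlSkCrashdumpCallbackRoutine+0x81
02 ffffb780`50287cf0 fffff804`6fc25f82     nt!KiProcessNMI+0x261e28
03 ffffb780`50287d30 fffff804`6fc25cee     nt!KxNmiInterrupt+0x82
04 ffffb780`50287e70 fffff804`6fbc692b     nt!KiNmiInterrupt+0x26e
05 ffffc784`86c68220 00000000`00000000     nt!HalpHvCounterQueryCounter+0xb
"""

# Frame number, two address columns, then the symbol with an optional +0x offset.
FRAME_RE = re.compile(r"^([0-9a-f]{2})\s+\S+\s+\S+\s+(\S+?)(?:\+0x[0-9a-f]+)?$")


def parse_frames(text):
    """Return (frame_number, symbol) pairs; the header line is skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1), 16), match.group(2)))
    return frames


def nmi_frames(frames):
    """Return the symbols whose name mentions NMI (case-insensitive)."""
    return [symbol for _, symbol in frames if "nmi" in symbol.lower()]


if __name__ == "__main__":
    for symbol in nmi_frames(parse_frames(STACK)):
        print(symbol)
```

For this dump the script lists the three NMI-related frames (02 to 04) that sit between the interrupted code and the bug check.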

chabongden · May 24, 2024

I noticed that too, so I tried disabling virtualization in the BIOS. It still crashed, but the weird thing is that when virtualization is on, the BSOD error is Hypervisor Error. When it's off, the error is something else, most of the time irql_not_less_or_equal. Can I test the CPU somehow? I stress tested it but it was fine, no crash or anything.

181951 · May 24, 2024

Look at this:

Rich (BB code):

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000080, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000001, bitfield :
    bit 0 : value 0 = read operation, 1 = write operation
    bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80709c2ef21, address which referenced memory

The system crashed twice in this dump! During the second crash a Hyper-V call is involved again. It is probably related to CPU interrupts or to the addresses being accessed.

Rich (BB code):

8: kd> KnL
 # Child-SP          RetAddr               Call Site
00 ffffc88c`4a3a9a78 fffff807`09c2de29     nt!KeBugCheckEx
01 ffffc88c`4a3a9a80 fffff807`09c29289     nt!KiBugCheckDispatch+0x69
02 ffffc88c`4a3a9bc0 fffff807`09c2ef21     nt!KiPageFault+0x489
03 ffffc88c`4a3a9d58 fffff807`09bcb587     nt!memset+0x61
04 ffffc88c`4a3a9d60 fffff807`09d42560     nt!HvcallInitiateHypercall+0x57
05 ffffc88c`4a3a9de0 fffff807`09d6589a     nt!HvlNotifyRootCrashdump+0x68
06 ffffc88c`4a3a9e50 fffff807`09c18087     nt!KeBugCheck2+0xca
07 ffffc88c`4a3aa5c0 fffff807`09c97d4e     nt!KeBugCheckEx+0x107
08 ffffc88c`4a3aa600 fffff807`09c921bc     nt!MiCheckSystemNxFault+0x1f638a
09 ffffc88c`4a3aa640 fffff807`09a30213     nt!MiRaisedIrqlFault+0x200e38
0a ffffc88c`4a3aa690 fffff807`09c2917e     nt!MmAccessFault+0x363
0b ffffc88c`4a3aa7b0 fffff807`09af7420     nt!KiPageFault+0x37e
0c ffffc88c`4a3aa940 00000000`00000000     nt!KxReleaseQueuedSpinLock

chabongden said:
Can I test the CPU somehow? I stress tested it but it was fine, no crash or anything.

Prime95. That said, I can tell you from experience that not every faulty CPU fails a stress test. It would probably be better to try another CPU.
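
As a side note, the Arg3 bitfield in the IRQL_NOT_LESS_OR_EQUAL output above can be decoded with a few lines of Python. This is only a sketch based on the bit meanings printed in the dump text itself (bit 0 = read/write, bit 3 = execute); the function name `decode_access_bits` is made up for illustration.

```python
def decode_access_bits(arg3):
    """Decode Arg3 of bug check 0xA using the bit meanings shown in the dump text."""
    return {
        # bit 0: 0 = read operation, 1 = write operation
        "operation": "write" if arg3 & 0x1 else "read",
        # bit 3: 1 = execute operation (only on chips that report it)
        "execute": bool(arg3 & 0x8),
    }


if __name__ == "__main__":
    # Arg3 from the dump above is 0000000000000001.
    print(decode_access_bits(0x1))
```

For the value in this dump (Arg3 = 1) that means a write, not an execute, which fits the memset frame in the stack.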

chabongden · May 24, 2024

So I just tried Prime95 and OCCT, and some cores don't work correctly. Is this the cause? OCCT gave errors as soon as I ran it.

https://imgur.com/a/3hW30VV

181951 · May 24, 2024

Many factors affect Prime95 results, so let's go through them. Can you specify which test you ran? For example: Small, Smallest, Large...

chabongden · May 24, 2024

https://imgur.com/a/N3YgBTZ

I used the default option.

181951 · May 24, 2024

Blend is a heavy test that stresses multiple pieces of hardware, so its results are hard to interpret. For the CPU I recommend the Smallest and Small tests.

chabongden · May 24, 2024

Small and Smallest results:

Worker 5 always fails immediately, followed by worker 3.

https://imgur.com/a/okszCYG

181951 · May 24, 2024

Everything is clearer now. You have a CPU problem, and the crashes and BSODs are the result of it.

chabongden · May 24, 2024

Yeah, I suppose replacing it is the only solution. Thank you for your time!

Colif · May 25, 2024

Prime95 rounding errors can be the cause of all the BSODs. Try giving the RAM more voltage; that can fix them.
Raise it a little and test with Prime95 until it stops giving errors. The PC isn't stable.

chabongden · May 25, 2024

OCCT also gave errors for the cores, just like Prime95. I tested another CPU from another PC: no errors at all.

Also, how much is "a little"? I have never touched the PC's voltages. Can you give an example?

Colif · May 25, 2024

It probably was the CPU if it works fine now.

chabongden · May 25, 2024

I'm still testing, it usually crashes after a day or 2 so we'll see.

181951 · May 25, 2024

Colif said:
Prime 95 rounding errors can be the cause of all the BSOD. Try giving ram more power, that can fix them.
raise it a little, test with prime until it stops giving errors. PC isn't stable.

chabongden said:
I'm still testing, it usually crashes after a day or 2 so we'll see.

Raising the voltage or lowering the power limit may be a solution, but it's not what I recommend. It also encourages laziness in companies, and it seems that Intel is already avoiding responsibility by implementing this under the "fail safe" title. Such workarounds can also shorten CPU life. Why should we have to fix a processor that doesn't work at default settings? (It seems chabongden didn't change the processor settings; everything looked normal as far as I can see from the dump files.) If it's broken, it's broken.

chabongden · May 25, 2024

So the PC I tested the CPU in just crashed. Faulty CPU 100% confirmed. Thank you, guys.

181951 · May 25, 2024

Good to hear.
