Question PC hard-lock ups driving me insane

Dec 20, 2022
Hi all,


I am seeking some assistance from the Tom's Hardware collective, please, to stop me going insane!

First I guess the info of the system:

Windows 10 Pro, build 19044 (up to date)
I9 9900K, not overclocked - bios all default settings / auto
2x8GB Corsair DDR4 2132 w/XMP @ 3596 - non XMP profile loaded
MSI Z390 Gaming Pro Carbon
Nvidia 3090Ti FE
1000w EVGA 80+ modular PSU
Crucial CT250 MX200 SSD for Operating System
Corsair MP600 Core 2TB SSD for game installs
2x WD 2TB standard HDDs
Widescreen 2560x1080 (so not many pixels!)
Quest 2 @ 1.6x SS / 90hz

I built this system around Jan 2022 (originally with my titan XP, then upgraded to 3090ti later in the year).

My issues started not long after building the PC. I purchased BF2042, and while playing the game I experienced random lock-ups of the PC, with no minidump or system errors; only buttoning and rebooting would free up the PC.

I could play a game sometimes for 2 hours or so, other times maybe I couldn’t even play a single round and the PC would lockup.

At this time I had XMP profile loaded, and was running a titan XP. I have only experienced one other lockup, and that was after around 3-4 hours of Warzone, but mostly, all games run OK.


Fault finding:

Re-installed game
Tweaked XMP profile, as I noticed the manufacturer's timings differed from what the BIOS was recommending, but still locked up
Removed XMP and set to default speed
Reset bios / CPU settings all to default and auto
Ran numerous OCCT tests, RAM and CPU tests passed
Conducted GPU stress tests with furmark - all passed
Memcheck – passed
Thermals checked – all good (CPU rarely more than mid 60’s, GPU same)
OCCT power test (crashed / locked after around 3 hours)
Checked 3.3v, 5v and 12v for any instability (only via software OCCT) – all good
Underclocked GPU RAM and core clock / reduced power target
Removed / reinstalled GPU drivers
Scanned all HDDs / checked for errors
Fresh install of Windows 10
Added Corsair NVMe and did a fresh install of all games to the new / different drive
Swapped PSU power connectors to GPU
Have since changed GPU to 3090ti

When the machine locks up, it is often during title screen, loading, after game, although can happen during game.

During last lockup:
CPU @ 49% utilisation
CPU package temp 57c
RAM @ 12GB usage
Disk 0, 1, 2 @ 0% (disk 0 OS volume)
Disk 3 @ 5% (game SSD)
Ethernet usage high-ish 344kbps
GPU usage 11%
GPU power 27%
VRAM 6579mb
GPU temp 53c

There is never anything in the event log I can see other than unexpected shutdowns / kernel power caused by me buttoning.
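For anyone wanting to pull just those Kernel-Power entries out of the System log, here is a minimal sketch. It assumes the built-in Windows `wevtutil` tool and filters on event ID 41 (the "rebooted without cleanly shutting down" entry); `build_command` and `QUERY` are illustrative names, and the command only runs on Windows.

```python
import platform
import subprocess

# XPath filter for Kernel-Power event 41 (unexpected shutdown / reboot).
QUERY = ("*[System[Provider[@Name='Microsoft-Windows-Kernel-Power']"
         " and (EventID=41)]]")


def build_command(count=5):
    """Build a wevtutil call that lists the newest `count` matching events."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{QUERY}",
        f"/c:{count}",
        "/rd:true",   # read newest events first
        "/f:text",    # human-readable output instead of XML
    ]


if __name__ == "__main__":
    if platform.system() == "Windows":
        result = subprocess.run(build_command(), capture_output=True, text=True)
        print(result.stdout or result.stderr)
    else:
        print("wevtutil is only available on Windows")
```

Comparing the timestamps against when each lock-up happened at least confirms that the only entries are the ones caused by the forced reboot.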

I also play VR, and I run @ 1.6SS, so that is 5408x2736, this depends on title but according to EVGA my GPU usage can vary from 60ish up to 90% or so. I play Project cars for hours, no problem, and other games such as Alyx, utilising over 16GB VRAM and with high GPU utilisation with no crashing.
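As a rough sense of scale, the pixel arithmetic for the two resolutions quoted above can be sketched like this. It only counts pixels per frame and ignores refresh rate and scene complexity, so treat it as a loose proxy for GPU load rather than a measurement.

```python
# Pixel counts per frame, using the resolutions quoted in the post.
VR_RES = (5408, 2736)        # Quest 2 at 1.6x supersampling (poster's figure)
DESKTOP_RES = (2560, 1080)   # ultrawide monitor


def pixels(res):
    """Return the number of pixels in a (width, height) frame."""
    width, height = res
    return width * height


vr = pixels(VR_RES)
desktop = pixels(DESKTOP_RES)
print(f"VR frame:      {vr:,} px")
print(f"Desktop frame: {desktop:,} px")
print(f"Ratio:         {vr / desktop:.2f}x")
```

That works out to roughly 14.8 million pixels per VR frame against about 2.8 million on the monitor, a bit over five times as many.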

The only needle I can find is that last night, on one of the crashes, I sat and waited (I have before for a long time and never got a BSOD or error), one of my screens went almost black and I did get a BSOD! Now this may be a coincidence, but anyway it was ‘DPC WATCHDOG VIOLATION’ and I at least had a dump file:


On Mon 19/12/2022 21:35:01 your computer crashed or a problem was reported


Crash dump file: C:\WINDOWS\Minidump\121922-8828-01.dmp (Minidump)
Bugcheck code: 0x133(0x1, 0x1E00, 0xFFFFF805758FB320, 0x0)
Bugcheck name:DPC_WATCHDOG_VIOLATION
Driver or module in which error occurred: aswVmm.sys (aswVmm+0x28363)
File path:C:\WINDOWS\system32\drivers\aswVmm.sys
Bug check description:The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL or above. This could be caused by either a non-responding driver or non-responding hardware. This bug check can also occur because of overheated CPUs (thermal issue).
Analysis:This is likely caused by a hardware problem, but there is a possibility that this is caused by a misbehaving driver.
This bugcheck indicates that a timeout has occurred. This may be caused by a hardware failure such as a thermal issue or a bug in driver for a hardware device.
Read this article on thermal issues
A full memory dump will likely provide more useful information on the cause of this particular bugcheck.
Google query:aswvmm DPC_WATCHDOG_VIOLATION


On Mon 19/12/2022 21:35:01 your computer crashed or a problem was reported


Crash dump file: C:\WINDOWS\MEMORY.DMP (Kernel memory dump)
Bugcheck code: 0x133(0x1, 0x1E00, 0xFFFFF805758FB320, 0x0)
Bugcheck name:DPC_WATCHDOG_VIOLATION
Driver or module in which error occurred: aswVmm.sys (aswVmm+0x28363)
File path:C:\WINDOWS\system32\drivers\aswVmm.sys
Bug check description:The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL or above. This could be caused by either a non-responding driver or non-responding hardware. This bug check can also occur because of overheated CPUs (thermal issue).
Analysis:This is likely caused by a hardware problem, but there is a possibility that this is caused by a misbehaving driver.
This bugcheck indicates that a timeout has occurred. This may be caused by a hardware failure such as a thermal issue or a bug in driver for a hardware device.
Read this article on thermal issues
A full memory dump will likely provide more useful information on the cause of this particular bugcheck.
Google query:aswvmm DPC_WATCHDOG_VIOLATION


So I removed and cleaned AVG in safe mode – still had a lock up in BF2042.

When this BSOD was generated thermals were fine. I want to add that on custom settings in BF with everything on ultra and all additional settings enabled, my GPU rarely goes above 38%, so GPU temps in BF are normally in the 50s or lower, and the CPU never goes over the mid 60s.

Windows debugger tells a slightly different story though?


Symbol search path is: srv*

Executable search path is:

Windows 10 Kernel Version 19041 MP (16 procs) Free x64

Product: WinNt, suite: TerminalServer SingleUserTS

Edition build lab: 19041.1.amd64fre.vb_release.191206-1406

Machine Name:

Kernel base = 0xfffff80574c00000 PsLoadedModuleList = 0xfffff8057582a2b0

Debug session time: Mon Dec 19 21:35:01.932 2022 (UTC + 0:00)

System Uptime: 4 days 21:29:46.062

Loading Kernel Symbols

...............................................................

................................................................

................................................................

...............................

Loading User Symbols

Loading unloaded module list

......................

For analysis of this file, run !analyze -v

nt!KeBugCheckEx:

fffff80574ff92d0 48894c2408 mov qword ptr [rsp+8],rcx ss:ffffbd01de692e20=0000000000000133

3: kd> !analyze -v

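*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************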


DPC_WATCHDOG_VIOLATION (133)

The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL

or above.

Arguments:

Arg1: 0000000000000001, The system cumulatively spent an extended period of time at

DISPATCH_LEVEL or above.

Arg2: 0000000000001e00, The watchdog period (in ticks).

Arg3: fffff805758fb320, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains

additional information regarding the cumulative timeout

Arg4: 0000000000000000


Debugging Details:

------------------



*******************************************************************
* *
* *
* Either you specified an unqualified symbol, or your debugger *
* doesn't have full symbol information. Unqualified symbol *
* resolution is turned off by default. Please either specify a *
* fully qualified symbol module!symbolname, or enable resolution *
* of unqualified symbols by typing ".symopt- 100". Note that *
* enabling unqualified symbol resolution with network symbol *
* server shares in the symbol path may cause the debugger to *
* appear to hang for long periods of time when an incorrect *
* symbol name is typed or the network symbol server is down. *
* *
* For some commands to work properly, your symbol path *
* must point to .pdb files that have full type information. *
* *
* Certain .pdb files (such as the public OS symbols) do not *
* contain the required information. Contact the group that *
* provided you with these symbols if you need this command to *
* work. *
* *
* Type referenced: TickPeriods *
* *
*******************************************************************


KEY_VALUES_STRING: 1


Key : Analysis.CPU.mSec

Value: 3749


Key : Analysis.DebugAnalysisManager

Value: Create


Key : Analysis.Elapsed.mSec

Value: 4088


Key : Analysis.IO.Other.Mb

Value: 0


Key : Analysis.IO.Read.Mb

Value: 0


Key : Analysis.IO.Write.Mb

Value: 0


Key : Analysis.Init.CPU.mSec

Value: 249


Key : Analysis.Init.Elapsed.mSec

Value: 3917


Key : Analysis.Memory.CommitPeak.Mb

Value: 86


Key : Bugcheck.Code.DumpHeader

Value: 0x133


Key : Bugcheck.Code.Register

Value: 0x133


Key : WER.OS.Branch

Value: vb_release


Key : WER.OS.Timestamp

Value: 2019-12-06T14:06:00Z


Key : WER.OS.Version

Value: 10.0.19041.1



FILE_IN_CAB: 121922-8828-01.dmp

BUGCHECK_CODE: 133

BUGCHECK_P1: 1

BUGCHECK_P2: 1e00

BUGCHECK_P3: fffff805758fb320

BUGCHECK_P4: 0


DPC_TIMEOUT_TYPE: DPC_QUEUE_EXECUTION_TIMEOUT_EXCEEDED


TRAP_FRAME: ffffe403cb271ee0 -- (.trap 0xffffe403cb271ee0)

NOTE: The trap frame does not contain all registers.

Some register values may be zeroed or incorrect.

rax=0000000000000001 rbx=0000000000000000 rcx=00000000003506f8

rdx=00000000000c00e1 rsi=0000000000000000 rdi=0000000000000000

rip=fffff80574ebb970 rsp=ffffe403cb272070 rbp=0000000000000003

r8=000000000000082f r9=00000000000000e1 r10=fffff80574f376e0

r11=0000000000000000 r12=0000000000000000 r13=0000000000000000

r14=0000000000000000 r15=0000000000000000

iopl=0 nv up ei ng nz na pe nc

nt!KiIpiSendRequestEx+0xb0:

fffff80574ebb970 8b87802d0000 mov eax,dword ptr [rdi+2D80h] ds:0000000000002d80=????????

Resetting default scope

BLACKBOXBSD: 1 (!blackboxbsd)

BLACKBOXNTFS: 1 (!blackboxntfs)

BLACKBOXPNP: 1 (!blackboxpnp)

BLACKBOXWINLOGON: 1

CUSTOMER_CRASH_COUNT: 1

PROCESS_NAME: BF2042.exe

STACK_TEXT:

ffffbd01de692e18 fffff8057505bf02 : 0000000000000133 0000000000000001 0000000000001e00 fffff805758fb320 : nt!KeBugCheckEx

ffffbd01de692e20 fffff80574ed2973 : 00001796ea01adaf ffffbd01de640180 0000000000000000 ffffbd01de640180 : nt!KeAccumulateTicks+0x186d32

ffffbd01de692e80 fffff80574ed245a : ffffd30b342b6d40 ffffe403cb271f60 0000000000000000 ffff87000beb4d02 : nt!KeClockInterruptNotify+0x453

ffffbd01de692f30 fffff80574e08a45 : ffffd30b342b6d40 fffff80574f545e7 0000000000000000 0000000000000000 : nt!HalpTimerClockIpiRoutine+0x1a

ffffbd01de692f60 fffff80574ffb26a : ffffe403cb271f60 ffffd30b342b6d40 0000000000000001 0000000000000000 : nt!KiCallInterruptServiceRoutine+0xa5

ffffbd01de692fb0 fffff80574ffba37 : 00000000d79761d4 ffffbd01de643088 ffffe403cb272011 0000000000000003 : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa

ffffe403cb271ee0 fffff80574ebb970 : 0000000000000028 fffff80574ebd2e9 0000000000000001 0000000000000000 : nt!KiInterruptDispatchNoLockNoEtw+0x37

ffffe403cb272070 fffff80574eda8ac : 0000000000000000 0000000000000002 0000009000000000 ffff8a0000007000 : nt!KiIpiSendRequestEx+0xb0

ffffe403cb2720b0 fffff80574eda5eb : 0000000000000004 0000000000000000 0000000003800000 0000000000000002 : nt!KxFlushEntireTb+0x1bc

ffffe403cb272100 fffff80574e86592 : 0000000000000000 0000000000000000 0000000000000000 fffff80575850b00 : nt!KeFlushTb+0x7b

ffffe403cb272160 fffff80574e888d1 : 0000000000000000 0000000000000000 ffff87000beb4d40 ffff87000beb4d40 : nt!MiFlushEntireTbDueToAttributeChange+0x3a

ffffe403cb272250 fffff80574e3ffd5 : 8a000003f919c96b 0000000000000001 ffff87000beb4d62 0000000000000000 : nt!MiChangePageAttributeBatch+0x91

ffffe403cb2722c0 fffff80574e3cd58 : ffffe403cb272690 0000000000000000 0000000000000001 0000000000000000 : nt!MiGetPageChain+0xd75

ffffe403cb272500 fffff80574e3c2a8 : ffffe403cb272690 ffffe403cb2728e0 ffffd30b00000000 0000000000000000 : nt!MiResolvePrivateZeroFault+0x6e8

ffffe403cb272630 fffff80574e3b67d : 00000000c0000016 0000000000000002 0000000000000000 0000000000000002 : nt!MiResolveDemandZeroFault+0x208

ffffe403cb272720 fffff80574e39769 : 0000000000000111 0000000000000004 00000000c0000016 0000000000000000 : nt!MiDispatchFault+0x22d

ffffe403cb272860 fffff80575008dd8 : 000000004e71d084 ffffe403cb272a80 000000004e71d098 0000000000000001 : nt!MmAccessFault+0x189

ffffe403cb272a00 00000001422cc1fc : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiPageFault+0x358

000000004e718f60 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x00000001`422cc1fc

SYMBOL_NAME: nt!KeAccumulateTicks+186d32

MODULE_NAME: nt

IMAGE_NAME: ntkrnlmp.exe

IMAGE_VERSION: 10.0.19041.2251

STACK_COMMAND: .cxr; .ecxr ; kb

BUCKET_ID_FUNC_OFFSET: 186d32

FAILURE_BUCKET_ID: 0x133_ISR_nt!KeAccumulateTicks

OS_VERSION: 10.0.19041.1

BUILDLAB_STR: vb_release

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {65350307-c3b9-f4b5-8829-4d27e9ff9b06}

Followup: MachineOwner

---------
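To put the bugcheck arguments above into plainer terms, here is a small sketch that converts the watchdog period from ticks to seconds. It assumes the default Windows clock tick of 15.625 ms, which is not stated anywhere in the dump; `decode_0x133` and `ARG1_MEANING` are illustrative names, not debugger commands.

```python
# Assumption: default Windows clock tick of 15.625 ms (64 ticks per second).
TICK_MS = 15.625

ARG1_MEANING = {
    0x0: "a single DPC or ISR exceeded its time allotment",
    0x1: "the system cumulatively spent too long at DISPATCH_LEVEL or above",
}


def decode_0x133(arg1, arg2):
    """Return (meaning, watchdog period in seconds) for a 0x133 bugcheck."""
    meaning = ARG1_MEANING.get(arg1, "see Microsoft's bug check 0x133 reference")
    seconds = arg2 * TICK_MS / 1000
    return meaning, seconds


meaning, seconds = decode_0x133(0x1, 0x1E00)
print(meaning)
print(f"watchdog period: {0x1E00} ticks, about {seconds:.0f} s")
```

Under that assumption the 0x1E00 tick period is about two minutes, which would fit with the BSOD only showing up after sitting and waiting on a locked-up machine instead of buttoning it straight away.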

In a last ditch attempt at anything meaningful, I noticed some DPC watchdog errors related to iastora.sys, and so re-installed the driver for the OS SSD, but you guessed it, still several lockups this evening, resulting in me buttoning the PC and no dump files, nothing I can see being meaningful in event viewer.

I am now leaning towards a PSU issue, it is after all, 7 years old now! However, if BF2042 is taxing my system less overall than running 14 million pixels in VR, and drawing a hell of a lot more power, how can it be stable there and fall over when ticking over in BF2042?

The PSU is a single rail, but I am not familiar with how the CPU / mobo circuit works vs PCIE and HDD supply.

My other thoughts were HDD, but the OS HDD although also an older crucial drive from a previous build seems reliable – no vanishing files, no corrupt files or weirdness, no errors.

With power ‘appearing’ stable in VR and OCCT tests I am not sure.

Perhaps I got a dodgy stick of RAM, but memtest checks out, and OCCT memory tests always seem to work just fine using 95% of memory.

Maybe I got a dodgy CPU? But a CPU burn or stress test at 100% also seems stable (I’ve done more memory and CPU tests this eve on OCCT and all good, no errors).

So then motherboard? But again, seems to be OK?

I will check my motherboard bios, but I am pretty sure when these crashes started it was also up to date at the time, although another release may be out now.
So here I am, I have a PC with all ‘new’ parts in this build other than OS HDD and old PSU. But it would be great to actually pin point the issue without throwing money at something that doesn’t need replacing.

My only other thought is to perhaps play with the CPU voltage, it runs at 1.286v max on auto, with package power around 117w. Temperatures aren’t an issue, so did I get a bad one that just needs more juice to run stable at the stock 4.68? Again if that is the case wouldn’t this manifest in VR?

Pretty much stumped guys, and if you’ve made it this far well done, and thanks for reading.

Seriously welcome any ideas / testing I can do to pin this one down.



Thanks.
 
At this juncture I think it's essential to stick with running it only barebones with only your nvme drive, mobo, cpu & cooler, gpu, and ram attached, no other devices whatsoever, until you trace the problem.

Switch out your power supply for the Cooler Master MWE when it arrives, and remember to remove all the old modular cables from your EVGA, lest it be a bigger pain when it fries. I stress: only use the cables that ship with and belong to each respective power supply.

Keep running it barebones with no other devices attached when you have installed your cooler master power supply.

Give it a while 'til you have played it enough without crashing that it seems stable and add the devices back one at a time and test them in each step while you go.

So re-attach your 2tb hdd after a few days, run it a few more days, see if it's all still stable. Then you can think about attaching other devices. Do the same for them - run them a while too before adding another. You need to detect the device that crashes.

You just want to be sure the core system is stable before adding anything is all. Then be sure that each device you add isn't the cause.
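The one-device-at-a-time approach can be written down as a simple schedule, sketched below. The part names and the three-day soak period are placeholders for illustration only; `staged_plan` is a hypothetical helper, not a tool.

```python
def staged_plan(core, extras, soak_days=3):
    """Return a list of (start_day, attached_parts) test stages.

    Stage 0 is the barebones core; each later stage adds one more device
    and runs for `soak_days` before the next device is attached.
    """
    plan = [(0, list(core))]
    attached = list(core)
    for i, device in enumerate(extras, start=1):
        attached = attached + [device]
        plan.append((i * soak_days, attached))
    return plan


# Placeholder part names, loosely based on the build in this thread.
core = ["mobo", "cpu + cooler", "gpu", "ram", "nvme", "new psu"]
extras = ["2TB HDD #1", "2TB HDD #2", "Crucial SSD", "Quest 2"]

for day, parts in staged_plan(core, extras):
    print(f"day {day:>2}: {', '.join(parts)}")
```

If a crash shows up during a stage, the device added at the start of that stage is the first suspect; a crash in stage 0 points back at the core parts.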

The EVGA is a seven-year-old power supply, so it has a question mark over it.

If it still crashes with your Cooler Master power supply, you might turn attention to the mobo. That's why you need to keep running it barebones. If you just add all the devices back at once thinking it was only the power supply, you still won't know which one crashed.

So go easy, take it one step at a time and see if it crashes with your new cooler master installed next. It might be another tier b power supply, but it has one thing going for it - it's brand new.
 
Well, bad news.

The new PSU has less voltage fluctuation, but even with USB devices and HDDs unplugged, at the 3hr point the PC locked up.

I guess now I order RAM and try that.

Seriously getting pissed off with throwing money at a PC build that's only 1yr old and giving me so many problems 🙁

just getting fed up now. I’ve got a £3000 pile of crap at the moment, sigh.
 
I downloaded MemTest (Windows executable). It will only run up to 3.5GB at a time, so I opened five sessions and ran each with 3GB; this used 100% RAM and tested writing to the page file. Within around 50 mins the PC restarted with a BSOD.

WhoCrashed reports:

On Sat 24/12/2022 21:54:28 your computer crashed or a problem was reported


Crash dump file: C:\Windows\Minidump\122422-10250-01.dmp (Minidump)
Bugcheck code: 0x1A(0x3F, 0xC086A, 0x6E900F4, 0x4611337F)
Bugcheck name:MEMORY_MANAGEMENT
Bug check description:This indicates that a severe memory management error occurred.
Analysis:This is possibly a software problem. This is likely a case of memory corruption.
This bugcheck is often associated with overheating problems. Read this article on memory corruption. Read this article on thermal issues


Windows debug analyzer shows following:


Code:
MEMORY_MANAGEMENT (1a)
    # Any other values for parameter 1 must be individually examined.
Arguments:
Arg1: 000000000000003f, An inpage operation failed with a CRC error. Parameter 2 contains
    the pagefile offset. Parameter 3 contains the page CRC value.
    Parameter 4 contains the expected CRC value.
Arg2: 00000000000c086a
Arg3: 0000000006e900f4
Arg4: 000000004611337f

Debugging Details:
------------------


KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 1562

    Key  : Analysis.DebugAnalysisManager
    Value: Create

    Key  : Analysis.Elapsed.mSec
    Value: 2985

    Key  : Analysis.IO.Other.Mb
    Value: 10

    Key  : Analysis.IO.Read.Mb
    Value: 0

    Key  : Analysis.IO.Write.Mb
    Value: 30

    Key  : Analysis.Init.CPU.mSec
    Value: 233

    Key  : Analysis.Init.Elapsed.mSec
    Value: 19402

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 98

    Key  : Bugcheck.Code.DumpHeader
    Value: 0x1a

    Key  : Bugcheck.Code.Register
    Value: 0x1a

    Key  : Dump.Attributes.AsUlong
    Value: 1008

    Key  : Dump.Attributes.DiagDataWrittenToHeader
    Value: 1

    Key  : Dump.Attributes.ErrorCode
    Value: 0

    Key  : Dump.Attributes.KernelGeneratedTriageDump
    Value: 1

    Key  : Dump.Attributes.LastLine
    Value: Dump completed successfully.

    Key  : Dump.Attributes.ProgressPercentage
    Value: 0

    Key  : Memory.System.Errors.PageHashErrors
    Value: 1


FILE_IN_CAB:  122422-10250-01.dmp

DUMP_FILE_ATTRIBUTES: 0x1008
  Kernel Generated Triage Dump

BUGCHECK_CODE:  1a

BUGCHECK_P1: 3f

BUGCHECK_P2: c086a

BUGCHECK_P3: 6e900f4

BUGCHECK_P4: 4611337f

ADDITIONAL_DEBUG_TEXT:  Memory Manager detected corruption of a pagefile page while performing an in-page operation.
The data read from storage does not match the original data written.
This indicates the data was corrupted by the storage stack, or device hardware.


BLACKBOXBSD: 1 (!blackboxbsd)


BLACKBOXNTFS: 1 (!blackboxntfs)


BLACKBOXPNP: 1 (!blackboxpnp)


BLACKBOXWINLOGON: 1

CUSTOMER_CRASH_COUNT:  1

PROCESS_NAME:  MemCompression

PAGE_HASH_ERRORS_DETECTED: 1

TRAP_FRAME:  ffffe1872c59efc0 -- (.trap 0xffffe1872c59efc0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffff80103ba20f0 rbx=0000000000000000 rcx=ffffb1000a7e7000
rdx=ffffb1000a7e7000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff80103ba2132 rsp=ffffe1872c59f150 rbp=ffffb1000a7e8000
 r8=00000155cbdb73c0  r9=0000000000000713 r10=fffff80103ba20f0
r11=00000155cbdb7ad3 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
nt!RtlDecompressBufferXpressLz+0x42:
fffff801`03ba2132 418b00          mov     eax,dword ptr [r8] ds:00000155`cbdb73c0=????????
Resetting default scope

STACK_TEXT: 
ffffe187`2c59ea58 fffff801`03c5421b     : 00000000`0000001a 00000000`0000003f 00000000`000c086a 00000000`06e900f4 : nt!KeBugCheckEx
ffffe187`2c59ea60 fffff801`03c5319a     : 000c086a`00000002 00000000`00000000 ffffc280`041491e0 00000000`00000000 : nt!MiPageHashBugCheck+0x4f
ffffe187`2c59eaa0 fffff801`03a706ff     : ffffc280`041491e0 00000000`00000000 00000000`00000000 fffff801`07e18e21 : nt!MiArePagefileContentsCorrupted+0x1ca
ffffe187`2c59eb20 fffff801`03a997cd     : 00000000`00000001 ffffffff`ffffffff ffff800c`4985e610 ffffe187`2c59eee0 : nt!MiValidatePagefilePageHash+0x1a3
ffffe187`2c59eca0 fffff801`038d5a06     : 00000000`00000000 ffffe187`0000d000 ffffe187`2c59ee48 fffff801`00000000 : nt!MiWaitForInPageComplete+0x1c390d
ffffe187`2c59eda0 fffff801`03853d2a     : 00000000`c0033333 00000000`00000000 00000155`cbdb73c0 00000000`00000000 : nt!MiIssueHardFault+0x246
ffffe187`2c59eea0 fffff801`03a39329     : 00000000`00000000 00000000`00000000 ffffe187`2c59f3d8 0000001c`00000000 : nt!MmAccessFault+0x31a
ffffe187`2c59efc0 fffff801`03ba2132     : ffff800c`531f1000 ffffb100`0a7e7000 00000000`00000713 fffff801`039285c0 : nt!KiPageFault+0x369
ffffe187`2c59f150 fffff801`039285c0     : ffffb100`0a7e7000 ffff800c`51b15050 00000155`cbdb73c0 ffffb100`0a7e7000 : nt!RtlDecompressBufferXpressLz+0x42
ffffe187`2c59f170 fffff801`03a66e31     : ffff800c`51b16788 ffff800c`51b15050 00000155`cbdb73c0 ffff800c`4ef62698 : nt!RtlDecompressBufferEx+0x60
ffffe187`2c59f1c0 fffff801`03bb5e48     : 00000000`00000004 fffff801`03baf49e 00000000`00000000 00000000`00000001 : nt!ST_STORE<SM_TRAITS>::StDmSinglePageCopy+0x22b
ffffe187`2c59f290 fffff801`03bb6aa8     : 00000000`00000001 00000000`000044d3 ffff800c`51b15050 ffff800c`00001000 : nt!ST_STORE<SM_TRAITS>::StDmSinglePageTransfer+0xa0
ffffe187`2c59f2e0 fffff801`03bb5635     : ffffe187`ffffffff ffff800c`531f1000 ffffe187`2c59f3c0 ffff800c`5d126190 : nt!ST_STORE<SM_TRAITS>::StDmpSinglePageRetrieve+0x1c4
ffffe187`2c59f380 fffff801`03baee9b     : ffffe187`2c59f5b9 00000000`00000001 00000000`00000000 ffff800c`5d126030 : nt!ST_STORE<SM_TRAITS>::StDmPageRetrieve+0xc9
ffffe187`2c59f430 fffff801`03baed71     : ffff800c`51b15000 ffff800c`5d126190 ffff800c`531f1000 ffff800c`51b169c0 : nt!SMKM_STORE<SM_TRAITS>::SmStDirectReadIssue+0x93
ffffe187`2c59f4b0 fffff801`038cf93a     : ffff800c`5387b0f4 ffff800c`51b15000 00000000`00000000 ffff800c`531f1000 : nt!SMKM_STORE<SM_TRAITS>::SmStDirectReadCallout+0x21
ffffe187`2c59f4e0 fffff801`03a6694e     : fffff801`03baed50 ffffe187`2c59f580 00000000`00000002 ffff800c`00000000 : nt!KeExpandKernelStackAndCalloutInternal+0x7a
ffffe187`2c59f550 fffff801`03bb0e18     : ffffe187`00000001 ffff800c`49ae41c8 00000000`000003ff ffff800c`5d126190 : nt!SMKM_STORE<SM_TRAITS>::SmStDirectRead+0xd6
ffffe187`2c59f620 fffff801`03bae583     : 00000000`000003ff 00000000`000003ff ffffe187`2c59f6d0 ffff800c`5d126190 : nt!SMKM_STORE<SM_TRAITS>::SmStWorkItemQueue+0x48
ffffe187`2c59f670 fffff801`03a663e2     : 00000000`0000000c ffff800c`5d126190 00000000`00000001 00000000`00000001 : nt!SMKM_STORE_MGR<SM_TRAITS>::SmIoCtxQueueWork+0x1d7
ffffe187`2c59f700 fffff801`03bb8122     : ffff800c`5cb8ad70 ffffe187`2c59f7c0 00000000`00000001 ffffe187`00000000 : nt!SMKM_STORE_MGR<SM_TRAITS>::SmPageRead+0x1cc
ffffe187`2c59f780 fffff801`03a54898     : ffff800c`2071cc47 00000000`00000001 ffff800c`54c0b700 fffff801`038c19f9 : nt!SmPageRead+0x42
ffffe187`2c59f7c0 fffff801`038d59c7     : 00000000`00000002 ffffe187`2c59f900 ffff800c`5cb8ac60 ffff800c`5cb8ac60 : nt!MiIssueHardFaultIo+0x17e578
ffffe187`2c59f810 fffff801`03853d2a     : 00000000`c0033333 00000000`00000001 00000190`b9e40320 00000000`00000000 : nt!MiIssueHardFault+0x207
ffffe187`2c59f8c0 fffff801`03a39329     : ffff800c`5387b080 00007ffc`c334a804 00000000`00000001 ffff800c`00000000 : nt!MmAccessFault+0x31a
ffffe187`2c59f9e0 00007ffc`c53a180d     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiPageFault+0x369
0000008b`5c1ff8d8 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffc`c53a180d


SYMBOL_NAME:  PAGE_HASH_ERRORS_INPAGE

MODULE_NAME: Unknown_Module

IMAGE_NAME:  Unknown_Image

STACK_COMMAND:  .cxr; .ecxr ; kb

FAILURE_BUCKET_ID:  PAGE_HASH_ERRORS_0x1a_3f

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {6a2d4548-0eec-578d-e8f1-9e2239aa9a00}

Followup:     MachineOwner
---------

 *** Memory manager detected 1 instance(s) of corrupted pagefile page(s) while performing in-page operations.
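To make the four bugcheck parameters easier to read, here is a minimal sketch that splits the WhoCrashed bugcheck string and labels each value. The labels follow the debugger text quoted above for Arg1 = 0x3F; `parse_bugcheck` and `LABELS_0x3F` are illustrative names, not part of any tool.

```python
import re

# Parameter labels for MEMORY_MANAGEMENT (0x1A) with Arg1 == 0x3F,
# taken from the debugger text quoted above.
LABELS_0x3F = ("subtype", "pagefile offset", "page CRC read back", "expected CRC")


def parse_bugcheck(line):
    """Split 'CODE(a, b, c, d)' into (code, [a, b, c, d]) as integers."""
    match = re.fullmatch(r"\s*(0x[0-9A-Fa-f]+)\((.*)\)\s*", line)
    if match is None:
        raise ValueError(f"unrecognised bugcheck string: {line!r}")
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


code, args = parse_bugcheck("0x1A(0x3F, 0xC086A, 0x6E900F4, 0x4611337F)")
for label, value in zip(LABELS_0x3F, args):
    print(f"{label:>20}: {value:#x}")
print("CRC mismatch:", args[2] != args[3])
```

The two CRC values differ, which is all the bugcheck is reporting: the page read back from the page file did not match what was written. It does not say on its own whether the drive, the controller, the RAM or the CPU changed the data along the way.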

Could again be coincidence, as I hammered all physical RAM and was trying to use a couple of GB extra. Page file would be on the NVME as this is the only volume connected on the new build.

Still no HDDs or extra USB devices connected, and new PSU with all new cables.

I am running another now but only at 12GB physical RAM, total use with windows is still 15GB.

Games typically use 11GB total system RAM and crash.

I am not sure if this really still proves anything if I can get another BSOD, after all, CPU and MOBO still control the RAM...
 
2x8GB Corsair DDR4 2132 w/XMP @ 3596 - non XMP profile loaded

Why the RAM? This looks a bit gnarly. Why doesn't it say 2133 / 3600, and why's it a 'non XMP' profile?

The manual doesn't say a lot about the BIOS memory settings, only that it has a 'Memory Try It!' feature, whatever that means.


You could also update the bios to the latest version.

'
ADDITIONAL_DEBUG_TEXT: Memory Manager detected corruption of a pagefile page while performing an in-page operation.
The data read from storage does not match the original data written.
This indicates the data was corrupted by the storage stack, or device hardware.'

Storage stack or device hardware? In english 'the drives' or 'the controller' which is situated on the mobo, 'presumably'.

The original memtest86+ is back and refurbed or you could also use passmark memtest86 which is a fork of the original memtest.

So you could run memtest off a usb flash drive which doesn't write to the page file. Whether to remove the nvme or not in that process - don't think memtest needs a page file.

So that should tell you if the memory has a problem.

However, I still can't really find a reason to blame the NVMe for this, since it was crashing while your Crucial SSD was hosting Windows.

Mobo has been there all along, so I wonder if it's got a problem with its drive controllers.

Update the bios, try again, and try memtest.
 
Hi, just using the standard JEDEC, non-XMP profile, since I figured XMP was one of the first things that may cause a crash, but neither made any difference to stability.

It is on the latest BIOS for my board.

I only have free version of memtest, limited to 4 passes, but ran 4 passes, then another 4, then 2, then another 4 - no errors.

I can also try moving gpu to another slot if it would fit, but don’t think it will. I can’t recall a crash in windows only when gaming.

Maybe I should try processor, I still can’t understand why stress tests are perfect though. I could underclock CPU / disable turbo, or up voltage maybe to 1.3 fixed to see if it helps stability.

Only other things are I haven’t switched my keyboard and mouse, but doubt that would be causing it, and I run three monitors, not sure if one malfunctioning could cause gpu to lock the system. Really clutching at straws now before just giving up and getting a new proc, ram and mobo bundle.
 
Any reason you can't simply RMA your mobo?

Tests are taking a long time, but it seems you have tested most things and only got a flag on the page file error. That was a drive read or write error rather than a memory error. Since it was crashing with both your Crucial drive and NVMe drive, maybe it's not the drives but the controller on the mobo.

It might've been an idea to run all these tests with only 1 monitor attached. Try to stick to a methodology of running it as 'barebones' as possible until you've found the problem.

When you say you have only the free version of memtest you mean the passmark trial version? So you haven't installed the open source memtest86+ on a usb and tried it?

I think when he said many hours he probably meant one long test rather than several short concurrent tests. Well, if you only have the trial version of PassMark, you're limited in what you can do; it's not worth frittering $50 on MemTest Pro if you only have 1 machine to test.

Well I don't suspect the ram or the cpu anyway but that's only opinion. You can only just keep trying, with only one monitor at a time.

It's a complex system with a lot to do.

I was thinking, try Prime95 to stress your CPU if you want to do that. There's a thread with advice on how to do it for the 9900K, and the guide in the overclocking section, towards the end, explains a bit of how to use Prime95 with AVX disabled. That got me a bit stumped, though: Prime95 has changed since I last used it, and I just downloaded the latest one and it doesn't even load for me, so idk either. You aren't overclocking, but the test is the same anyway.

You used to be able to download earlier versions from mersenne.org but don't know what they've now done with that option.

I'd expect the p95 test to crash and show errors before 5 minutes of a stress test if it really had any actual problem. Remember you must only run it with avx disabled because with avx it's an unrealistic test and load for the 9900k so if that doesn't work for you

Seems like you've hit a wall for the moment. Take a break and see if you get any more clues or suggestions. My last gasp is practically, why can't you rma the mobo if it's less than a year old and still within warranty?

*Oh, I know what I did: I downloaded the wrong P95 service build instead of the one at the top, and the disable AVX option is on the Options menu / Torture Test dialogue page.

Got the system and chipset drivers installed? And all the mobo-specific drivers, and have you let Windows Update decide if it wants to download any more recent drivers?

What if it's a crash related to the specific games you're running? You said most games were OK, but game-specific crashes are a whole other kettle of fish. They could have patches, or as yet undetected bugs and problems with new video drivers.

All you can practically do is read their patch notes and keep the games up to date or hit their forums with a question.
 