Question: Mystery crashes on new machine?

Apr 8, 2024
First post, admittedly making the account to figure out what is going on here.
This is a new machine, barely outside of the Newegg return window. Crashes started about a week ago. Crashes were 'fixed' by reseating the CPU, but crashes still persist in some pieces of software (Most notably, certain games. Ready or Not is consistent, but I've found no documentation on RoN causing total system failures.)

Affected software tends to crash quite quickly, within two to ten minutes.
BlueScreenView calls out ntoskrnl.exe and PSHED.dll. Bug check code 0x00000124. Event Viewer reports "Component: Memory, Error Source: Machine Check Exception".
All components have up-to-date drivers. BIOS was updated. All components were reseated.

In short, I've gone and done everything I can think of and had IT roommates trying to figure it out, and I've just hit one wall after another. So I'm hoping someone has ideas on what this machine wants before I have to Ship of Theseus it through fighting customer service.

Specs:
AMD Ryzen 9 5900X
ASUS ROG Strix B550-F Gaming AMD
ASUS TUF Gaming NVIDIA GeForce RTX™ 4070 Ti Super OC
G.SKILL Ripjaws V Series (Intel XMP) DDR4 RAM 32GB
SAMSUNG 980 PRO SSD 1TB
Vetroo V240 White Liquid CPU Cooler

Edit: Dump file. https://www.dropbox.com/scl/fi/hr9h...4-01.dmp?rlkey=97trcjbiw8zviywa4okkwx6ud&dl=0
 
Ready-made PC? It was likely set up just well enough to work, but many changes take place once you start using it.
Some changes may be in order:
Clean install the OS and make sure it's fully updated.
Reset CMOS to factory defaults.
Update BIOS to the latest version.
Run a program like OCCT (https://www.ocbase.com/download) to test components.
 
At a glance I see you're using an Intel XMP Ripjaws kit, but that may not mean anything. Please post the exact kit model.
- Open command prompt (hit winkey, search for "cmd").
- Type in or paste this: "wmic memorychip get devicelocator, partnumber"
The part number is what I would like to know.
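If the wmic output needs to be compared across several machines or pasted into a report, it can be turned into a slot-to-part-number map. This is a minimal sketch, assuming the output is a header row followed by one whitespace-separated row per module; the function name `parse_wmic_memorychip` and the column spacing in the sample are illustrative assumptions, and the sample values are the ones reported later in this thread.

```python
def parse_wmic_memorychip(output: str) -> dict[str, str]:
    """Map each DeviceLocator (slot label) to its PartNumber."""
    rows = [line.strip() for line in output.splitlines() if line.strip()]
    modules = {}
    for row in rows[1:]:  # rows[0] is the "DeviceLocator  PartNumber" header
        slot, _, part = row.partition(" ")
        modules[slot] = part.strip()
    return modules


# Sample text shaped like the wmic output; the exact spacing is an assumption.
sample = """DeviceLocator  PartNumber
DIMM_A1        F4-3200C16-16GVK
DIMM_B1        F4-3200C16-16GVK
"""

modules = parse_wmic_memorychip(sample)
for slot, part in modules.items():
    print(slot, part)
```

On Windows builds where wmic is no longer installed, the same two properties are exposed by the Win32_PhysicalMemory class through PowerShell's Get-CimInstance.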

You left out the PSU from your specs.
 
If you have more dumps upload them and we can take a look.

Re: "Crashes were 'fixed' by reseating the CPU, but crashes still persist in some pieces of software (Most notably, certain games."
Did you clean and reapply thermal paste? Is the cooler properly seated on the CPU? Are the screws evenly tightened?

Are the latest BIOS and chipset drivers, sourced from the Asus website, installed?

Have you monitored the temps? Especially CPU temps under load/gaming when these crashes happen? Do they always end in a BSOD or sometimes crash to desktop?
 
Part numbers, as requested:
DIMM_A1 F4-3200C16-16GVK
DIMM_B1 F4-3200C16-16GVK
I've already disabled XMP, as I was under the impression that it can cause instability.
As for the PSU, my mistake. Thermaltake Toughpower 750W

All drivers are, as best I can tell, up to date. Clean new thermal paste was applied and screws tightened down adequately and evenly. All temperatures are well within normal specs and have shown no strange fluctuations.
I'll add more dump logs but they've all appeared almost identical from what I can tell.
https://www.dropbox.com/scl/fi/hr9h...4-01.dmp?rlkey=97trcjbiw8zviywa4okkwx6ud&dl=0
https://www.dropbox.com/scl/fi/ucc6...8-01.dmp?rlkey=7x25l6jjamrgtkmhse9ax2qwg&dl=0
No, this is not a ready-made PC; it was built here by myself and IT roommate.
It has had three fresh, fully updated Windows 11 installs. I haven't touched CMOS at all. BIOS is up to date as of yesterday. OCCT was run and a full stress test completed with no errors found.
 
You said XMP is off, no CPU OC of any kind? Reset CMOS?

Also, are you running any anti-virus application/software?

What is your storage, just the one SSD drive? You're not RAIDing any other storage by any chance?

DIMM_A1 F4-3200C16-16GVK
DIMM_B1 F4-3200C16-16GVK
You mean the modules are in the A1 and B1 slots? According to the board manual here, when populating two slots the modules should be in B2 and A2, the second and fourth from the CPU socket. Confirm that's actually your board.

44tJKT5.png
 
Correct. XMP (Or as my board manufacturer calls it, DOCP...) is disabled. Zero overclocking. I have not yet reset CMOS. I'll have to try that tonight when IT people get home.
AV software is the bundled Windows Defender, nothing outside that.
Storage is one SSD and the little NVMe Samsung thing. No RAID storage.
That is the correct board, and that is how my memory is arranged. I previously had the cards in the 2 and 4 slots, but IT roommate moved them to 1 and 3 for a test. I'm not sure why but it didn't really change much if anything. It's a simple matter to put them back in 2 and 4 though. I'll swap those back in here.
 
Yes, DOCP on AMD systems. Resetting the CMOS might not be a bad idea.

No AV except Defender and no RAID.

Asked to make sure. If the RAM was in B2 and A2 when the BSODs/crashes happened, and they still happen with it in B1 and A1, then slot placement is not the problem. It's still recommended to populate the slots the manufacturer specifies (B2 and A2).

Will go through the dumps later on and in the meantime others may have other input.
 
And I will add the suggestion to look in Reliability History/Monitor and Event Viewer for any error codes, warnings, or informational events being captured just before or at the time of the crashes.

Reliability History/Monitor presents a timeline format so look for entries that started appearing when the problems began "a week ago".

Also look in Update History for any failed or problem updates.
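The Event Viewer part of this check can also be done from a script with the built-in wevtutil tool. The sketch below only builds the command and leaves running it to a separate helper, so it can be inspected first; the function names are illustrative, and the provider name is the one that appears in the WHEA event posted later in this thread.

```python
import subprocess


def whea_query_command(max_events: int = 10) -> list[str]:
    """Build a wevtutil command that lists the newest WHEA-Logger events."""
    xpath = "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        "/f:text",           # human-readable output instead of XML
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # reverse direction: newest events first
    ]


def recent_whea_events(max_events: int = 10) -> str:
    """Run the query (Windows only) and return wevtutil's text output."""
    result = subprocess.run(
        whea_query_command(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


cmd = whea_query_command(5)
print(" ".join(cmd))
```

Calling `recent_whea_events()` from an elevated prompt on the affected machine should print the same hardware-error records that Event Viewer shows under the System log.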
 
Re: "DIMM_A1 F4-3200C16-16GVK, DIMM_B1 F4-3200C16-16GVK"

Interesting. But not conclusive. Those are two 1x16 RAM modules - https://www.gskill.com/product/165/184/1535687484/F4-3200C16S-16GVK
That is not the same as using a 2x16 kit - https://www.gskill.com/product/165/184/1536110922/F4-3200C16D-32GVK

Mixing modules, even if they're the same model, is known to be unreliable.

Try your system with just one RAM module. As recommended, stick it into B2 slot.
 
Update history shows nothing failed. Reliability monitor is a new tool to me and I have to say I feel like I'll be getting very well acquainted with it... However no, no strangeness in there. I unfortunately suspect there'll be nothing either, since it doesn't show anything before the last reinstall of Windows.
Re: "Try your system with just one RAM module."
I can give that a go, though I'm not sure I see where you're getting that it's 1x16 vs 2x16. Here's the exact listing I ordered from Newegg: https://www.newegg.com/g-skill-32gb-288-pin-ddr4-sdram/p/N82E16820232091?
 
Re: "I can give that a go, though I'm not sure I see where you're getting that it's 1x16 vs 2x16."

The partnumber response in post #5 is F4-3200C16-16GVK

What is confusing me is that there is no S or D after the C16 in the code, which is how it's listed on G.Skill's website. S is single and D is double/dual. So I can't tell for sure whether it's a 1x16 or a 2x16.
If they came together in a package, it's pretty much certainly a 2x16. The code is throwing me off is all.
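To make the S/D question concrete, here is a rough sketch that splits a part number into its fields. The pattern is an assumption inferred only from the three part numbers quoted in this thread (it is not G.Skill's official naming scheme), and reading "3200" as the speed grade and "C16" as the CAS latency is likewise an interpretation.

```python
import re
from typing import Optional

# Assumed layout: F4-<speed>C<cas><S|D|nothing>-<capacity>G<series>
PART_RE = re.compile(r"^F4-(\d{4})C(\d{2})([SD]?)-(\d+)G([A-Z]+)$")
KIT_MARKERS = {"S": "single module", "D": "dual-module kit", "": "no kit marker"}


def decode_part_number(part: str) -> Optional[dict]:
    """Split a part number into fields, or return None if it does not fit."""
    match = PART_RE.match(part.strip())
    if match is None:
        return None
    speed, cas, marker, capacity, series = match.groups()
    return {
        "speed": int(speed),
        "cas_latency": int(cas),
        "kit": KIT_MARKERS[marker],
        "capacity_gb": int(capacity),
        "series": series,
    }


for part in ("F4-3200C16-16GVK", "F4-3200C16S-16GVK", "F4-3200C16D-32GVK"):
    print(part, "->", decode_part_number(part))
```

The string the modules report has no marker at all, so on its own it cannot settle whether they shipped as a matched 2x16 kit; the retail listing or packaging is the better evidence.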
 
Yeah that's what had me so confused too. They both came together in a single package so I'd doubt they're two mispackaged single cards.
 
Re: "Nothing failed".

Makes sense but the objective is to discover anything that is a predecessor to the crashes.

Or some change just before the crashes began.

In any case, RAM is certainly something to be looked at and followed up on.

Noted that you did not mention Event Viewer per se. Understandable because Event Viewer requires more time and effort to navigate and understand.

To help:

How To - How to use Windows 10 Event Viewer | Tom's Hardware Forum (tomshardware.com)

Take your time, explore Event Viewer and you will soon get a sense of it all.

= = = =

And, overtaken by events....

Re: "Yeah that's what had me so confused too. They both came together in a single package so I'd doubt they're two mispackaged single cards."

I am more cynical. With production and profit requirements being as they are in so many industries, I can certainly see the production line workers/managers just tossing in the next available module: right, wrong, or indifferent...

I.e., Need 12 screws with only 10 screws available - grab two screws from the next bin - toss them in the bag. Ship.
 
Oh I had checked Event Viewer, I've gone through it quite thoroughly. Here's one of the entries from immediately after the crash. I found nothing abnormal preceding the crash.
<Event xmlns=" ">
<System>
<Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{c26c4f3c-3f66-4e99-8f8a-39405cfed220}" />
<EventID>46</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2024-04-08T02:43:38.2252637Z" />
<EventRecordID>4339</EventRecordID>
<Correlation ActivityID="{c8ef6e68-d9df-43ff-9e3a-56651d1f808e}" />
<Execution ProcessID="5728" ThreadID="2668" />
<Channel>System</Channel>
<Computer>CDavis-LT1</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="ErrorSource">3</Data>
<Data Name="FRUId">{00000000-0000-0000-0000-000000000000}</Data>
<Data Name="FRUText" />
<Data Name="ValidBits">0x2</Data>
<Data Name="ErrorStatus">0x0</Data>
<Data Name="PhysicalAddress">0x100659800</Data>
<Data Name="PhysicalAddressMask">0x0</Data>
<Data Name="Node">0x0</Data>
<Data Name="Card">0x0</Data>
<Data Name="Module">0x0</Data>
<Data Name="Bank">0x0</Data>
<Data Name="Device">0x0</Data>
<Data Name="Row">0x0</Data>
<Data Name="Column">0x0</Data>
<Data Name="BitPosition">0x0</Data>
<Data Name="RequesterId">0x0</Data>
<Data Name="ResponderId">0x0</Data>
<Data Name="TargetId">0x0</Data>
<Data Name="ErrorType">0</Data>
<Data Name="Extended">0</Data>
<Data Name="RankNumber">0</Data>
<Data Name="CardHandle">0</Data>
<Data Name="ModuleHandle">0</Data>
<Data Name="Length">1019</Data>
<Data Name="RawData">435045521002FFFFFFFF04000100000002000000FB0300001A2B0200080418140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB57131FE6FF5E89C91C54CBA8865ABE14913BBB96A3A875E89DA01020000000000000000000000000000000000000000000000A00100005000000000030000010000001411BCA5646FDE4EB8633E83ED7C83B100000000000000000000000000000000010000000000000000000000000000000000000000000000F0010000C00000000003000000000000ADCC7698B447DB4BB65E16F193C4F3DB00000000000000000000000000000000010000000000000000000000000000000000000000000000B0020000240100000003000000000000011D1E8AF94257459C33565E5CC3F7E800000000000000000000000000000000010000000000000000000000000000000000000000000000D4030000270000000003000000000000A13248C3C302524CA9F19F1D5D7723FC0000000000000000000000000000000003000000000000000000000000000000000000000000000002000000000000000000000000000000009865000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007F010000000000000002040300010000120FA2000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000A0000000000000000000000000000000000000000000000000000000000000000000000000000000300000002000000173B6E885E89DA010A000000000000000000000000000000000000000100000059080C06000880BC009865000100000000000000FE0F12D00A0000000A00000000000000B00001000000000000000000F9010000030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003B0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000
0000000FF00000000000000000000000000000000000000000000000000</Data>


</EventData>
</Event>
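A record like this is easier to compare across crashes once the named fields are pulled into a dictionary. This is a minimal sketch, assuming the XML is saved from Event Viewer's Details tab; the function name `event_data_fields` is illustrative, and the sample is a trimmed copy of the record above with the all-zero fields and RawData left out.

```python
import xml.etree.ElementTree as ET


def event_data_fields(event_xml: str) -> dict[str, str]:
    """Collect <Data Name="..."> values from an Event Viewer XML record."""
    root = ET.fromstring(event_xml)
    fields = {}
    for element in root.iter():
        # Drop any "{namespace}" prefix so this works with or without xmlns.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Data" and "Name" in element.attrib:
            fields[element.attrib["Name"]] = element.text or ""
    return fields


# Trimmed copy of the WHEA-Logger record; omitted fields are not needed here.
sample = """<Event>
<EventData>
<Data Name="ErrorSource">3</Data>
<Data Name="ValidBits">0x2</Data>
<Data Name="PhysicalAddress">0x100659800</Data>
<Data Name="Length">1019</Data>
</EventData>
</Event>"""

fields = event_data_fields(sample)
address = int(fields["PhysicalAddress"], 16)
print(f"physical address: {address:#x} ({address / 2**30:.3f} GiB)")
```

The address lands just above the 4 GiB mark, but memory interleaving means a physical address does not map straight to one DIMM, so this is more useful for checking whether repeat crashes report the same address than for picking a module.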

Here's another, where it's just saying it rebooted and gives a bugcheck code.
<Event xmlns=" ">
<System>
<Provider Name="Microsoft-Windows-WER-SystemErrorReporting" Guid="{abce23e7-de45-4366-8631-84fa6c525952}" />
<EventID>1001</EventID>
<Version>1</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2024-04-08T02:43:32.9800025Z" />
<EventRecordID>4306</EventRecordID>
<Correlation />
<Execution ProcessID="1236" ThreadID="1240" />
<Channel>System</Channel>
<Computer>CDavis-LT1</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="param1">0x00000124 (0x0000000000000000, 0xffffb406ec95b028, 0x00000000bc800800, 0x00000000060c0859)</Data>
<Data Name="param2">C:\Windows\Minidump\040724-8484-01.dmp</Data>
<Data Name="param3">749ce0cc-7de6-480d-a582-7e2dfa2db188</Data>
</EventData>
</Event>
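The bug check line carries more than the 0x124 code. As I understand Microsoft's bug check reference, when the first parameter is 0 (machine check exception) the third and fourth parameters are the high and low 32 bits of the MCi_STATUS register for the bank that reported the error. The sketch below joins them and reads the architectural flag bits; the function names are illustrative, and the model-specific bits of the register are deliberately left undecoded.

```python
import re

# Architectural MCi_STATUS flag bits (positions 63 down to 57).
STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MCi_MISC holds additional information
    "ADDRV": 58,  # MCi_ADDR holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def parse_bugcheck(text: str) -> tuple[int, list[int]]:
    """Split 'code (p1, p2, p3, p4)' into the code and its four parameters."""
    values = [int(v, 16) for v in re.findall(r"0x[0-9a-fA-F]+", text)]
    if len(values) != 5:
        raise ValueError("expected a bug check code and four parameters")
    return values[0], values[1:]


def decode_mci_status(params: list[int]) -> dict:
    """Join parameters 3 and 4 into MCi_STATUS and read its flag bits."""
    status = (params[2] << 32) | params[3]
    return {
        "mci_status": status,
        "mca_error_code": status & 0xFFFF,  # low 16 bits
        "flags": {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()},
    }


param1 = ("0x00000124 (0x0000000000000000, 0xffffb406ec95b028, "
          "0x00000000bc800800, 0x00000000060c0859)")
code, params = parse_bugcheck(param1)
decoded = decode_mci_status(params)
print(f"bug check {code:#x}, MCi_STATUS {decoded['mci_status']:#018x}")
print("flags set:", [name for name, on in decoded["flags"].items() if on])
```

For the values in this thread, UC comes out set and PCC clear, which fits an uncorrected hardware error being reported rather than a driver fault. Identifying the failing unit from the remaining bits needs the processor vendor's documentation.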
 
