Question Potential memory controller failure 3600x

Aravind92

Apr 1, 2014
Hi All,

My PC specs are in the signature.

So, I've had intermittent instability with the PC for a while now. Today the PC suddenly crashed and restarted while I was playing BF1. I opened Event Viewer right away and the PC froze: the display stayed on but nothing responded, so I had to switch off the power supply to shut it down. When I turned the power supply back on, the PC boot looped a couple of times and then got stuck on the DRAM LED.

I removed the stick from the 4th slot and it worked fine; after a BIOS reset, turning on XMP also worked with one stick. So I tried another stick in the 4th slot again after resetting the BIOS and clearing CMOS, and it got stuck on either the CPU or DRAM LED again. I switched the RAM sticks around and both sticks ran fine in the 1st slot.

My first 3600x failed under similar circumstances in 2022, memory controller issue as far as I could tell. I figured it had happened again and left just one stick in the PC. After leaving it for a while, I gave it another shot and plugged the 2nd stick back into the 4th slot, and guess what, it decided to work fine again, with XMP turned on too. I had a similar experience a couple of months ago when I cleaned the CPU cooler and had to remove the RAM sticks; then too it just randomly decided to work.

Also, Event Viewer shows "Dump file creation failed due to error during dump creation" before the first crash.

Edit:
Ryzen 5 3600X
MSI B450 Tomahawk Max
Zotac 3070 white
Corsair Vengeance LPX 2x8 GB 3000 MHz
CM MWE 650W
WD SN550 1 TB
Acer VG240YP
 
My PC specs are in the signature.
Sig space specs can and will change over time, and when that happens this thread and any relevant suggestions will end up moot to the end user in the same boat as you're in now. This is why we ask users to include their specs in their thread's body as opposed to saying "specs in my sig". I've shoehorned your specs into your thread's body.

How old is your PSU? What BIOS version are you currently on for your motherboard?

Perhaps try reflashing the BIOS on your motherboard to the latest version and see if that helps. To rule out your motherboard being faulty or failing (or the BIOS chip being corrupt), drop your processor into another known working motherboard with the right BIOS version to see if the issue persists. If it does, you've narrowed the root of the issue down to your processor's integrated memory controller.
 
Noted. I am on the latest stable BIOS and the PSU is barely 2 months old. I am at a point where swapping CPUs is not going to confirm what the issue is, as the PC is working fine again; I've been playing the same game again for a while now. Not sure where to go from here. I would be able to figure it out if it were consistent, and I thought it was going to be this time, as it failed to boot on numerous attempts, but then it suddenly worked.
 
My first 3600x failed under similar circumstances in 2022, memory controller issue as far as I could tell.
If you want to check both your CPU's integrated memory controller channels and RAM at the same time, have you tried MemTest86?
https://www.memtest86.com/

You should be OK running two 3000MT/s DIMMs on a 3600X (which is rated at DDR4-3200), but it's possible one of your DIMMs may be failing.

Have you been overclocking the 3600X long term with PBO? Have you manually increased any CPU voltages? After many years, electro-migration effects may be appearing if the CPU has been overclocked really hard.
 
Yeah, I never enabled PBO or even tried overclocking with either of my 3600Xs, and yeah, I ran MemTest last month for four passes and everything came out fine.
 
Hmmm. There go my obvious candidates. Next on my checklist would be the Windows drive (run CHKDSK, SFC and DISM). Then I'd slowly change components: PSU, RAM, GPU, boot drive, motherboard, cables. That's fine if you have an infinite spares box or a similar PC to swap bits over.
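
For anyone following along, the three Windows checks mentioned above could be scripted roughly like this. Treat it as a minimal sketch rather than an official procedure: `build_repair_commands` and `run_all` are names made up for illustration, it assumes Windows is installed on C:, and it would need to be run from an elevated (administrator) prompt on Windows.

```python
import subprocess


def build_repair_commands(drive="C:"):
    """Return the CHKDSK, SFC and DISM command lines as argument lists.

    `drive` is an assumption: change it if Windows lives on another volume.
    """
    return [
        ["chkdsk", drive, "/scan"],  # online scan of the volume
        ["sfc", "/scannow"],  # verify protected system files
        ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],  # repair the component store
    ]


def run_all(commands):
    """Run each command in order and return (command line, exit code) pairs."""
    results = []
    for cmd in commands:
        completed = subprocess.run(cmd)
        results.append((" ".join(cmd), completed.returncode))
    return results
```

Calling `run_all(build_repair_commands())` would run the three tools one after another. A non-zero exit code only tells you that a tool reported something; you'd still want to read its output.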
 
Haha, unfortunately I don't. I did run SFC a little while ago and it didn't find anything. I was planning to change my CPU to a 5700X anyway; short of the current PC consistently failing to boot, I am not sure swapping components will tell me what the root cause is right away. The fact that the boot was stuck in either DRAM or CPU LED before it decided to work again leads me to believe it's either the motherboard or CPU, seeing as the RAM passed MemTest and both sticks booted successfully in the 2nd DRAM slot.

I was planning to swap the CPU for a 5700X and the RAM sticks for 2x16 GB anyway, so I guess I will do that in a bit and see how it goes. If the issue happens after all that, and does so consistently, then I can narrow it down by swapping components at a store. What do you reckon?

Well, unless the PC starts crashing consistently again before I get around to swapping the parts. In that case, the inevitable change is going to tell me if it was the CPU or the RAM. Yeah?
 
The fact that the boot was stuck in either DRAM or CPU LED before it decided to work again leads me to believe it's either the motherboard or CPU
You're probably correct. Changing the motherboard is often the last thing I try, because most of the other components are cheaper or easier to swap.

Have you tried (carefully) removing the CPU from the socket and looking for slightly bent pins, foreign objects, or discoloured areas on the gold pads (under side of CPU)? Repaste the CPU when replacing the cooler.

CM MWE 650W
If you can access a different PSU (beg, borrow, buy) I'd give that a try. Some problems vanish when you fit a new PSU.

Is your PSU a "fully wired" unit or "modular" with removeable cables? If you've had it several years, the warranty might have expired. Some people use the warranty period as a guide to quality. 2 years or less, avoid. 5 years, better. 7 years, good. 10 to 12 years excellent. Cheap PSUs use cheap electrolytics which eventually dry out and ripple voltage on the DC rails increases to unacceptable levels.

I'm still using some very old PSUs in old unimportant systems, but as a result, I've had the occasional explosion (luckily no damage to anything else). For all my normal builds, I buy good quality PSUs with decent warranties.

In summary, unless it happens consistently enough for me to swap and test I have no way of narrowing it down, Yeah?
There's a vague chance you might find something in the Event Logs, but often all you see is "Unexpected Shutdown" which isn't very helpful.
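
If you'd rather pull those entries out from the command line than click through Event Viewer, something like the sketch below could build a `wevtutil` query for the System log. The event IDs are my assumptions for the kind of messages described in this thread (41 for Kernel-Power, 6008 for an unexpected shutdown, 161 for the volmgr dump failure), and `build_event_query` is a made-up helper name, so adjust as needed.

```python
import subprocess


def build_event_query(event_ids=(41, 6008, 161), count=20):
    """Build a wevtutil command listing recent System log events.

    The default event IDs are assumptions, not confirmed from the poster's log:
    41 (Kernel-Power), 6008 (unexpected shutdown), 161 (volmgr dump failure).
    """
    conditions = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    xpath = f"*[System[({conditions})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",  # XPath filter on the event ID
        f"/c:{count}",  # maximum number of events to return
        "/rd:true",  # newest events first
        "/f:text",  # plain-text output
    ]


def query_events(command):
    """Run the query (Windows only) and return its text output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.stdout
```

On a Windows machine, `print(query_events(build_event_query()))` would dump the matching entries, which makes it easier to line up timestamps with each crash.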
 
You're probably correct. Changing the motherboard is often the last thing I try, because most of the other components are cheaper or easier to swap.

Have you tried (carefully) removing the CPU from the socket and looking for slightly bent pins, foreign objects, or discoloured areas on the gold pads (under side of CPU)? Repaste the CPU when replacing the cooler.
Yep, the motherboard is a hassle. But no, I haven't removed the CPU yet. I have already placed an order for a 5700X and RAM sticks; once they arrive I will have to swap anyway, so I will check for those and post here.

If you can access a different PSU (beg, borrow, buy) I'd give that a try. Some problems vanish when you fit a new PSU.

Is your PSU a "fully wired" unit or "modular" with removeable cables? If you've had it several years, the warranty might have expired. Some people use the warranty period as a guide to quality. 2 years or less, avoid. 5 years, better. 7 years, good. 10 to 12 years excellent. Cheap PSUs use cheap electrolytics which eventually dry out and ripple voltage on the DC rails increases to unacceptable levels.

I'm still using some very old PSUs in old unimportant systems, but as a result, I've had the occasional explosion (luckily no damage to anything else). For all my normal builds, I buy good quality PSUs with decent warranties.
I can try and do that, but the PSU is only 2 months old; I just bought it along with the 3070 (used). I used to have a 550, which I've given to a friend of mine. And it is non-modular. I could probably borrow a PSU, but currently the issue is that the PC is working fine, so I won't be able to tell if that was the issue even if I swap it. I could probably RMA the PSU and get a new one, but I doubt it is the issue: it wasn't an immediate reset when the PC crashed, but one with the usual keeech sound before restarting, and the fact that the PC froze after the restart and then refused to boot with the DRAM LED lit leads me to not suspect the PSU at this point.
There's a vague chance you might find something in the Event Logs, but often all you see is "Unexpected Shutdown" which isn't very helpful.
Yep, I did check the event log right away. All I got was an unexpected shutdown along with a volmgr message saying it couldn't create a dump for that shutdown.