Question RAM errors out, but works (and tests) perfectly after being unplugged for a while

howtobeironic

Honorable
Jun 16, 2018
395
23
11,115
So, I have a pair of Kingston HyperX Fury 8 GB RAM sticks (see the end of the post for full specs). About eight months ago, I started getting MEMORY_MANAGEMENT BSODs and process crashes, getting worse by the day. No XMP enabled, by the way. So I decided to test the sticks rigorously. I ran Memtest86+ on both sticks together (in dual-channel AND non-dual-channel slots), and the results were riddled with one-bit errors (the value read back differs from the expected value by one bit). I pulled one stick and tested each on its own: one stick completed 4+ passes without errors, while the other failed with 1500+ errors in a single pass. I tried different slot combinations, but it seemed obvious: one stick had bitten the dust. So I sent both sticks in for service. The service returned them, saying they had tested both sticks and found no issue. In disbelief, I tested again with a fresh Memtest86+ USB, and both sticks passed 4 runs without any issue. I chalked it up to an eerie problem paired with a bad USB and kept using them.

Fast forward 6 months: I saw the same issues, the tests again showed errors on one stick, and I pulled the "broken" stick once more, thinking it was really gone this time, and put it away until I could afford a new pair. Two months later, while shopping for a new kit, I decided to test the stick once more, and surprise, surprise, it was spotless for 6 passes. I plugged it back in. Now, 7 days later, I got my first MEMORY_MANAGEMENT BSOD, and the test is showing errors again.

I can't pinpoint the source of the issue at all. It seems like the stick, but it passes tests at times; it even passed the manufacturer's tests. It could be the motherboard, but the system is completely stable with only the working stick installed, and periodic testing reports no errors (without XMP, though; with XMP, the system is unstable with either 1 or 2 sticks). Now that I have the budget to replace a part, I need help finding the problem.

What I have tried:
*Simply replugging the sticks
*Swapping channels, using non-dual channels
*Contact cleanup using 99% isopropyl alcohol
*Testing the sticks with different USB/computers (service counts as a different computer, right?)

Specs:
CPU: AMD Ryzen 5 2600
GPU: Gigabyte RX 580 8 GB
MB: Gigabyte AX370M Gaming 3 (on BIOS version F41)
RAM: Kingston HyperX Fury 2x8 GB (CL16, 3200 MHz)
PSU: Cooler Master Masterwatt Lite 500W
Disk: WD Green NVMe 240 GB + Seagate Barracuda 2 TB 7200 RPM HDD
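
For readers unfamiliar with the "one-bit error" wording above: a memory tester writes a known pattern, reads it back, and compares. A minimal Python sketch of that comparison follows. The function names and the sample values are made up for illustration and are not taken from the poster's Memtest86+ log.

```python
from __future__ import annotations


def flipped_bits(expected: int, actual: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = expected ^ actual  # XOR leaves a 1 wherever the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def is_single_bit_error(expected: int, actual: int) -> bool:
    """True when exactly one bit differs between the two values."""
    diff = expected ^ actual
    # A nonzero value with exactly one bit set satisfies diff & (diff - 1) == 0.
    return diff != 0 and (diff & (diff - 1)) == 0


# Hypothetical pattern and read-back value (assumption, not real log data).
expected = 0xFFFFFFFF
actual = 0xFFFFFFFB  # bit 2 was read back as 0

print(flipped_bits(expected, actual))
print(is_single_bit_error(expected, actual))
```

If the same bit position keeps showing up across many failing addresses, that is often worth noting when describing the fault, although it does not by itself say whether the stick, the slot, or the memory controller is responsible.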
 
Not all RAM works with all mobos.
You might try updating the BIOS.
What speed does the RAM run at with XMP off?
 

howtobeironic

Thank you for the reply.
The RAM is on the motherboard's compatibility list and had been working without issues for 2+ years before this started. The problem seems to persist across BIOS updates, because I remember updating at least once in this timeframe. Also, I don't think any of the patch notes on the Gigabyte page mentioned RAM issues in general. The RAM runs at 2400 MHz CL16 in non-XMP mode.
 
There is a newer BIOS; perhaps try it.

A new BIOS battery is a cheap test.

Reseating the CPU might be worth a shot, or just start swapping parts.
 

howtobeironic


There are newer BIOSes. I originally held off on them because Gigabyte hasn't taken them out of beta and people keep reporting issues with them. I should give it a shot.

I did the part swapping (tested about 5-6 combinations of dual-channel/non-dual-channel seating, and each stick on its own in every slot) and a full reseat during the initial testing phase. I might try again, though.
 

howtobeironic


Today I managed to grab a memtest log while the stick was in its erroneous state; I'll upload the summary part here. I couldn't see any pointers in it, but a more trained eye could probably make more sense of it.
I'm holding off on BIOS updates for now because the power grid has been unreliable; we had regular brownouts yesterday. I didn't want to risk it.
If I can find a computer shop with a DDR4 mobo (surprisingly, none of the shops within a 4 km range has one), I'll have the sticks tested there as well. Whether or not the errors can be replicated there should give a clear-cut conclusion about whether it's the sticks or something else.

Screenshot-1.png
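
The stick/slot swapping described in this thread boils down to checking whether the errors follow a stick or a slot. A rough Python sketch of that bookkeeping is below. The stick labels, slot names, and error counts are placeholders invented for illustration, not results from the poster's tests, and the helper names are hypothetical.

```python
from collections import defaultdict


def summarize(runs):
    """Group error counts by stick and by slot.

    runs: iterable of (stick, slot, errors) tuples from single-stick tests.
    Returns two dicts mapping each stick / slot to its list of error counts.
    """
    by_stick, by_slot = defaultdict(list), defaultdict(list)
    for stick, slot, errors in runs:
        by_stick[stick].append(errors)
        by_slot[slot].append(errors)
    return dict(by_stick), dict(by_slot)


def always_failing(groups):
    """Return the keys whose every recorded run showed at least one error."""
    return sorted(k for k, counts in groups.items() if all(e > 0 for e in counts))


# Placeholder results (assumption): each stick tested alone in two slots.
runs = [
    ("A", "slot1", 0),
    ("A", "slot2", 0),
    ("B", "slot1", 1500),
    ("B", "slot2", 1320),
]

by_stick, by_slot = summarize(runs)
print("sticks failing everywhere:", always_failing(by_stick))
print("slots failing everywhere:", always_failing(by_slot))
```

With these placeholder numbers the errors follow stick B rather than any slot. Because the fault in this thread comes and goes over weeks, a table like this only describes one testing session, so repeating it on a second DDR4 machine while the stick is misbehaving would be the stronger check.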