Question Files get corrupted while being copied, I'm getting random BSODs, yet memory tests show no errors.

Oct 14, 2024
3
0
10
Hello. Desperately need some fresh ideas here. Here's the story. I have a somewhat old desktop PC: an i7-3770 with four DDR3 Kingston ValueRAM (I think) modules, 8 GB each, 32 GB of RAM total (naturally). About a month ago I started noticing random BSODs, especially after, like, a couple of days without a reboot and Chrome with, like, a million open tabs (it's a work PC I rarely shut down). I run some games on this thing too from time to time, but the BSODs happen mostly when I use Chrome or sort my files. I don't think I've ever gotten a BSOD while playing things like Diablo 4 or Final Fantasy XIV. So I didn't think much of it. I do all sorts of things on this PC, and there are more reasons for Windows to BSOD than there are stars in the night sky. In other words, I just ignored the problem. That turned out to be a stupid thing to do (duh), since I've now noticed that my files get corrupted while being copied.

Doesn't look HDD / SSD related. I have one WD Black HDD and two different SSDs. All three pass tests perfectly, and files get corrupted even when I simply copy them to another folder on the same disk. And it happens a lot. Like, one file in twenty is guaranteed to become corrupted / get a different checksum after being copied. Big video files can show noticeable problems when played, but it doesn't look like size matters. Things like tiny eBooks get corrupted too.
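To put a number on the "one file in twenty" observation, the copy-and-compare check can be scripted. The following is a minimal sketch, not a diagnostic tool: the function names and the folder arguments are hypothetical, and it only uses Python's standard `hashlib`, `shutil`, and `pathlib` modules. One caveat: right after a copy, the operating system may serve the re-read from its file cache rather than from the disk, so a mismatch here points at the copy path as a whole (RAM, controller, cables, disk) and not at one component.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination and report whether both hashes match."""
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)


def find_bad_copies(source_dir, destination_dir):
    """Copy every file in source_dir (non-recursive) and return the names
    of the files whose copy has a different checksum."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    bad = []
    for source in sorted(Path(source_dir).iterdir()):
        if source.is_file() and not copy_and_verify(source, destination_dir / source.name):
            bad.append(source.name)
    return bad
```

Running `find_bad_copies` several times over the same folder and comparing which names come back would show whether the same files fail every time or the failures move around.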

Naturally, I suspected RAM. I don't have enough time to check every module individually (the work PC is, well, for work), so I just started MemTest86 and went through the default test. Four passes and stuff. Guess what? The test found nothing. Not a single error. Like... what? How is that even possible? BSODs? Check. Files getting corrupted painfully often? Check. Yet both MemTest86 and Windows' default memory test find nothing. I'm totally out of ideas. I was ready to just replace all four modules to save time, but now that the tests didn't find a single error, I'm totally confused. Any help? Thanks.

The specs are:

ASUS P8Z77-V LX2, i7-3770 3.40 GHz, four sticks of Kingston's KVR1333D3N9/8G (same exact model in all four slots, bought from the same store on the same day back in the day), Corsair TX850M PSU, GeForce GTX 1080 Ti, PLEXTOR PX-365M5Pro SSD, Samsung SSD 850 PRO 512 GB, WD2002FAEX007BA0 (WD Black HDD), Win 10 Pro.
 
You can run memtest86+ overnight. This software is basically its own o/s running in RAM, so you just have to boot to it, e.g., from a thumb drive. There is no dependency on the actual o/s during the test, and the software goes away when you reboot without the thumb drive (or whatever media you chose for memtest86+). Did you run memtest86+ for a day? More on why follows, and then at the end, some comments on other sources of error.

In memtest86+, if a particular location in RAM is bad, then that location will show up more than once, and you'll know the RAM is bad (if that happens, you could test again with single sticks; the stick(s) responsible would show up). It would not be unusual for the location/address to change each time there is an error. In that case no single stick is likely to show an error when tested on its own, and the issue would be that the sticks don't match closely enough in timing. If you have a random address failure issue on just one stick, then that stick is probably bad; if you get random address failures on all sticks, then likely either (a) the motherboard itself has an issue with the slot(s) used (many mobos don't support a single stick in just any slot, so you might not be able to test any slot except the first one), or (b) you got unlucky enough to have all RAM sticks bad.
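The reasoning above can be written down as a small decision rule. This is only a sketch of that logic: the `(address, stick_label)` pair format and the stick labels are hypothetical and are not something memtest86+ produces in this form; you would have to note the failing addresses yourself and work out which stick each one belongs to by testing sticks individually.

```python
from collections import Counter


def classify_errors(errors):
    """Classify memory test errors given as (address, stick_label) pairs.

    The pair format is a hypothetical convenience for this sketch.
    """
    if not errors:
        return "no errors recorded"
    address_counts = Counter(address for address, _ in errors)
    sticks = {stick for _, stick in errors}
    # The same address failing more than once points at a bad location.
    if any(count > 1 for count in address_counts.values()):
        return "repeating address: suspect the stick holding that location"
    # Addresses that never repeat, but always on the same stick.
    if len(sticks) == 1:
        return "random addresses on one stick: that stick is probably bad"
    # Addresses that never repeat, spread over several sticks.
    return ("random addresses on several sticks: suspect timing mismatch, "
            "motherboard slots, or power")


print(classify_errors([(0x1F400, "A2"), (0x9C000, "B1")]))
```

An empty error list, as in a clean overnight run, simply returns "no errors recorded"; the rule says nothing about faults that the test never triggers.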

Incidentally, if all RAM sticks are bad due to random address failures, then it could also be from power supply noise. Power supply issues can make the entire mobo look bad. If a power supply is marginal under load, then there are times when the noise will corrupt memory or any data communications, but only when the noise is injected. Any part of the system can appear bad due to the power supply being marginal, and this will be nearly indistinguishable from actual part failure.

Also, the reason for running overnight is that each pass takes significant time, and in the case of random issues, this might need several passes for that issue to show up; or it might need temperature changes from running over time.

Note that there are other reasons for corrupted files. Simply crashing and not shutting down correctly will likely cause this. The disk storage might also be failing. Even a cable might need reseating after years of heating and cooling cycles ("thermal creep" causes connectors to slightly change each time they temperature cycle). Thermal creep of a connector can make the power supply essentially appear to be bad for that cable. Data cables to hard drives can do similar.

If you were to get replacement modules, then you'd need to be certain those modules are a matched set. This is very difficult to do if the RAM sticks are not shipped together in an original sealed package. If memtest86+ actually runs for a full day without issue, then I would rule out RAM as the issue; if running memtest86+ for only a few hours does not show a problem, then I'd run it for at least 24 hours (one of my old jobs related to Beowulf clusters required burn-in with memtest86 for 48 hours).
 
Oct 14, 2024
So. As suggested, I've left MemTest overnight. Still no errors at all.

Test Start Time: 2024-10-14 22:05:08
Elapsed Time: 5:50:30
Memory Range Tested: 0x0 - 81F000000 (33264MB)
CPU Selection Mode: Parallel (All CPUs)
CPU Temperature Min/Max/Ave: 44C/51C/48C
Lowest memory speed: 666 MT/s (9-9-9-24)
Highest memory speed: 666 MT/s (9-9-9-24)
# Tests Completed: 48/48 (100%)
# Tests Passed: 48/48 (100%)

Test | # Tests Passed | Errors
Test 0 [Address test, walking ones, 1 CPU] | 4/4 (100%) | 0
Test 1 [Address test, own address, 1 CPU] | 4/4 (100%) | 0
Test 2 [Address test, own address] | 4/4 (100%) | 0
Test 3 [Moving inversions, ones & zeroes] | 4/4 (100%) | 0
Test 4 [Moving inversions, 8-bit pattern] | 4/4 (100%) | 0
Test 5 [Moving inversions, random pattern] | 4/4 (100%) | 0
Test 6 [Block move, 64-byte blocks] | 4/4 (100%) | 0
Test 7 [Moving inversions, 32-bit pattern] | 4/4 (100%) | 0
Test 8 [Random number sequence] | 4/4 (100%) | 0
Test 9 [Modulo 20, ones & zeros] | 4/4 (100%) | 0
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 4/4 (100%) | 0
Test 13 [Hammer test] | 4/4 (100%) | 0
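As a quick sanity check on the report, the end address of the "Memory Range Tested" line can be converted to megabytes to confirm it matches the figure in parentheses. This is just arithmetic on the numbers shown above; the helper name is made up for the example.

```python
def end_address_to_mib(end_address_hex):
    """Convert a hexadecimal end address to whole mebibytes (1 MiB = 2**20 bytes)."""
    return int(end_address_hex, 16) // (1024 * 1024)


print(end_address_to_mib("81F000000"))  # 33264
```

The result agrees with the 33264MB in the report. That is slightly above 32768 MB (32 GB) because it is the highest address tested, not a count of installed memory.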

Except I've noticed that the PASS screen is yellow and it says "RAM may be vulnerable to high frequency row hammer bit flips". Googled it. MemTest's website says "This warning appears when errors are detected during the first pass (maximum hammer rate) but no errors are detected during the second pass (lower hammer rate)." In other words, MemTest doesn't treat that as a problem, and I've seen many forums (this one included) claiming that it's OK and can be ignored. But. In my case we clearly see BSODs and file corruption. Where does that leave us? Faulty sticks? Power? The mention of power actually made me suspicious. As I said, it's not a new PC we're talking about here, and both PSUs and UPSes are known to start misbehaving with time. Can that be the problem? I don't know a thing about power. One way or another, I'm even more confused now.
 
This is exactly why we suggest including your full system specs in the question.

Otherwise we are stuck looking at tiny clues and guessing. i7-3770 fits into Sandy Bridge motherboards too, and the RAM running at only DDR3-1333 instead of DDR3-1600 suggests that may be what you have. Why would this be relevant? Early Sandy Bridge chipsets were recalled because Intel identified a flaw and predicted up to 15% of the SATA-300 controllers on them would fail within 3 years, and those are now 13 years old. The SATA-600 ports work fine.

So if you just want to rant and not actually solve the problem, I'm sure someone here will be happy to keep guessing for you.
 
Oct 14, 2024
BFG-9000, I'm sorry, I'm new here and not really that good / experienced with hardware. Didn't want to offend anyone or anything, just being noobish. Anyway, the specs are as follows: ASUS P8Z77-V LX2, i7-3770 3.40 GHz, four sticks of Kingston's KVR1333D3N9/8G (same exact model in all four slots, bought from the same store on the same day back in the day), Corsair TX850M PSU, GeForce GTX 1080 Ti, PLEXTOR PX-365M5Pro SSD, Samsung SSD 850 PRO 512 GB, WD2002FAEX007BA0 (WD Black HDD), Win 10 Pro.

You may be onto something with DDR3-1600, since I'm pretty sure my current BIOS settings say 1333... It's currently in ASUS' auto mode. Should I switch to manual and make some changes? It's kind of a dark forest to me...