Question: Strange RAM problem with 13900K + 64GB 6000MHz?

PC Tailor

Illustrious
Ambassador
Hi All, hope you're all keeping well :)

A friend of mine very recently ordered a new high-end, custom-spec PC from PC Specialist. They are already going through the warranty process; I am posting so I can help diagnose the issue and see if anyone else has other ideas. The spec is here: PC Specialist Build. They bought a pre-built system because severe mobility issues leave them unable to build or upgrade their own PCs.

TL;DR: The build worked fine for 2 days, then stopped detecting 32GB of RAM. We found 1 module was not working and it was replaced with a new one under warranty. The PC ran fine for a day, then stopped detecting the new memory module. PC Specialist then undertook a warranty repair and replaced the motherboard and both memory modules. They sent the PC back, it worked fine for a day, then stopped detecting 32GB again.



I was helping them diagnose the issue while PC Specialist handled the warranty work in the background, to try and speed up any repairs. Some things to note first:
  • They contacted me to help install some HDDs from their previous PC. I also disabled the HDDs in the boot order list.
  • The system took an abnormally long time to boot for the spec, around 15 to 20 seconds.
  • The PC was fine for a few days, then suddenly the system was only detecting 32GB of RAM (in both Task Manager and the BIOS). I was called back to help follow some steps from PC Specialist.
  • Worth noting that temperatures had all been checked and drivers had been updated by this point.
  • The buyer also paid extra for an overclock on the CPU and GPU.
  • Delivery of the order was also delayed, as PC Specialist apparently had overclock issues that they had to fix before shipping.
Now is when the troubleshooting started:
  • Checked which module was being detected: it was the one in slot A2. The memory in B2 was not detected. I also found in the BIOS that no overclocks of any kind had been enabled.
  • I swapped the modules around and the same issue appeared: slot A2 was detected, B2 was not. Only 32GB was detected in the BIOS.
  • I then swapped them back and the same issue repeated, so I enabled the memory overclock to 6000MHz. The PC then failed to boot.
  • I reset the BIOS and the PC booted normally. I then moved the memory modules to slots A1 and B1, and the PC failed to boot.
  • I tried 1 memory module in slot A2. The PC booted normally and detected the memory.
  • I swapped the memory in A2 for the other memory module and the PC failed to boot.
  • I repeated this twice to confirm which module worked and whether the other one still failed. The result was the same each time, so I suggested that the failing memory module was likely faulty.
  • PC Specialist sent a single replacement memory module (despite requests to send a full pack) and this was installed into the machine by someone else. All seemed to work fine.
  • The PC worked with 64GB for a day, then the following day it stopped detecting the second 32GB and only detected 32GB in total.
  • Someone else then effectively tried the same steps, moving the modules between slots etc., but still only 32GB was detected.
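For anyone wanting to repeat the single-module swap test above, here is a rough Python sketch of the reasoning. The labels "stick_1" and "stick_2" and the function name are made up for illustration; the idea is simply that a module is suspect if it never boots in a slot where another module does boot, and a slot is suspect in the reverse case. It assumes the fault is consistent between trials, which may not hold for an intermittent problem like this one.

```python
def find_suspects(trials):
    """trials: list of (module, slot, booted) results from single-module tests.

    Returns (suspect_modules, suspect_slots) as sorted lists.
    """
    worked_modules = {m for m, s, ok in trials if ok}
    worked_slots = {s for m, s, ok in trials if ok}

    # A module that never booted, tried in a slot that is known to work.
    suspect_modules = sorted(
        {m for m, s, ok in trials
         if not ok and m not in worked_modules and s in worked_slots}
    )
    # A slot that never booted, tried with a module that is known to work.
    suspect_slots = sorted(
        {s for m, s, ok in trials
         if not ok and s not in worked_slots and m in worked_modules}
    )
    return suspect_modules, suspect_slots


# Hypothetical labels for the two sticks; slot A2 as in the steps above.
swap_test = [
    ("stick_1", "A2", True),
    ("stick_2", "A2", False),
    ("stick_1", "A2", True),
    ("stick_2", "A2", False),
]
```

With those four trials, `find_suspects(swap_test)` points at the second stick and no slot, which matches the conclusion reached at the time. It says nothing about why a replacement stick later dropped out as well.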
At this point the PC was sent back to PC Specialist for warranty repair, and this is what was reported to have happened:
  • They found the motherboard was causing the memory problems, so they replaced the motherboard and both memory modules.
  • They retried the overclock but then said they were unable to achieve it with the 13th gen CPUs (the buyer had asked for a refund for the OC).
  • They updated all drivers in Gigabyte Control Centre.
  • They confirmed it was now working and sent the PC back.
  • The PC worked fine for a day, then - shock horror - the next day only 32GB was being detected again, and the memory was running at 4800MHz. No changes had been made to the BIOS at all since they sent it back.
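Since the module keeps vanishing "the next day", it might help to record what Windows reports at each boot so there is a dated trail for the warranty claim. Below is a minimal Python sketch, assuming Windows with PowerShell available. It reads the `Win32_PhysicalMemory` WMI class; the helper names are my own, and the `DeviceLocator` strings a given board reports are not something I can confirm for this motherboard.

```python
import json
import subprocess

# PowerShell query against the Win32_PhysicalMemory WMI class.
PS_QUERY = (
    "Get-CimInstance Win32_PhysicalMemory | "
    "Select-Object DeviceLocator, Capacity, ConfiguredClockSpeed | "
    "ConvertTo-Json"
)


def parse_modules(json_text):
    """Turn the PowerShell JSON output into a list of module dicts."""
    data = json.loads(json_text)
    # ConvertTo-Json can emit a bare object (not a list) for a single module.
    return [data] if isinstance(data, dict) else list(data)


def read_modules():
    """Windows only: query the installed memory modules via PowerShell."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_QUERY],
        capture_output=True, text=True, check=True,
    )
    return parse_modules(result.stdout)


def summarize(modules):
    """Module count, total capacity in GiB, and the configured speeds seen."""
    total_gib = sum(int(m["Capacity"]) for m in modules) / 2**30
    speeds = sorted({m["ConfiguredClockSpeed"] for m in modules})
    return {"modules": len(modules), "total_gib": total_gib, "speeds": speeds}


def capacity_dropped(before, after):
    """True if the second summary shows less memory than the first."""
    return after["total_gib"] < before["total_gib"]
```

Calling `summarize(read_modules())` once a day and saving the result with a timestamp would show exactly when the second module disappears and what speed the remaining one is configured at.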


I have some theories, but wanted to check if anyone else had any ideas too. They of course are contacting PC Specialist again but they are growing incredibly stressed from the whole situation. I am going to withhold my theories at this point to remove any bias from any responses on here!

Long one (sorry) but any thoughts are welcome :)
 
Lutfij

Titan
Moderator
Build worked fine for 2 days, then stopped detecting 32GB of RAM, found 1 module was not working, replaced this with a new one under warranty.
It would be worthwhile to take a look at the PCB revision of the sticks of RAM and see if there are any anomalies (as in, they are different). If so, they are based on different ICs, which could be the cause of your instabilities.

It could also be an issue with your BIOS being corrupt. Try and reflash with the latest BIOS version, clear CMOS by removing the battery for 30 mins while disconnected from the wall and display, then replace.

Last thing that does come to mind is the CPU being faulty with a bad IMC...or the mounting pressure on the CPU's socket is causing things to be wonky. Try relieving some stress on the mount.
 
Solution

PC Tailor

Illustrious
Ambassador
Build worked fine for 2 days, then stopped detecting 32GB of RAM, found 1 module was not working, replaced this with a new one under warranty.
It would be worthwhile to take a look at the PCB revision of the sticks of RAM and see if there are any anomalies (as in, they are different). If so, they are based on different ICs, which could be the cause of your instabilities.
Yes, this is partly why I told the buyer to request modules from the same pack. I am not sure whether they fitted a whole new pack when they took the PC back. Checking the PCB revision would clarify this, so I'll see if we can check.
It could also be an issue with your BIOS being corrupt. Try and reflash with the latest BIOS version, clear CMOS by removing the battery for 30 mins while disconnected from the wall and display, then replace.
I did wonder about this, but I doubt it would happen on 2 separate boards, as they replaced the entire motherboard during repairs.
Last thing that does come to mind is the CPU being faulty with a bad IMC...or the mounting pressure on the CPU's socket is causing things to be wonky. Try relieving some stress on the mount.
I absolutely thought this too. It was one of the first things that came to mind after everything that has happened. Again, I was a bit surprised that this would happen with a brand new motherboard (although I can entirely see them putting too much stress on the mount twice in a row). I am wondering if they have potentially damaged the CPU in the process of it all.
 
I cannot imagine what would have to be wrong for a 13th gen part to not do 6000 on the memory. That's a giant red flag that something is extremely wrong, as RPL (Raptor Lake) is rated at 5600 out of the box and it's basically impossible to not be able to run a CL40 6000 kit at XMP settings. If it's not CPU mounting tension then there's something wrong with the motherboard (meaning the model) or the CPU.
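To put the speeds mentioned in this thread side by side, here is a small Python sketch of the usual back-of-envelope figures. It assumes a standard dual-channel desktop setup with a 64-bit data path per channel, and the numbers are theoretical peaks rather than anything measured on this machine. The function names are just for illustration.

```python
def peak_bandwidth_gbs(mt_per_s, bytes_per_transfer=8, channels=2):
    """Theoretical peak bandwidth in GB/s for a given transfer rate in MT/s.

    Assumes 8 bytes (64 bits) per transfer per channel and 2 channels.
    """
    return mt_per_s * bytes_per_transfer * channels / 1000


def cas_latency_ns(cl, mt_per_s):
    """CAS latency in nanoseconds.

    The memory clock runs at half the transfer rate, so one clock cycle
    lasts 2000 / MT/s nanoseconds.
    """
    return 2000 * cl / mt_per_s


if __name__ == "__main__":
    for rate in (4800, 5600, 6000):
        print(rate, "MT/s:", peak_bandwidth_gbs(rate), "GB/s peak")
    print("CL40 at 6000 MT/s:", round(cas_latency_ns(40, 6000), 2), "ns")
```

On those assumptions, dropping from 6000 to 4800 costs roughly a fifth of the theoretical bandwidth, which is noticeable but minor next to losing half the installed memory.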
 

PC Tailor

Illustrious
Ambassador
did they run this? https://www.intel.com/content/www/us/en/download/15951/19792/intel-processor-diagnostic-tool.html?

wonder if Intel CPU suffer rounding errors. Could try prime, might be a power thing
Just done. All passed (although I know this doesn't rule out the CPU).
TXT output below if anyone wants to peruse.

--- IPDT64 - Revision: 4.1.8.40
--- IPDT64 - Start Time: 07/11/2023 11:42:58

CPU1
Genuine Intel CPU Test
Module Version: 1.0.25.64b.W
Start Time: Tue Nov 7 11:42:58 2023
Test Result - PASS
Expected: GenuineIntel
Detected: GenuineIntel
End Time: Tue Nov 7 11:42:58 2023
Total Time: in seconds: 0
--------------------------------------------------------------------
CPU1
CPU Brand String Test
Module Version: 1.0.27.64b.W
Start Time: Tue Nov 7 11:42:58 2023
Test Result - PASS
Expected: Intel(R) Core(TM)
Detected: 13th Gen Intel(R) Core(TM) i9-13900K
Intel(R) Core(TM) processor detected..!!..
End Time: Tue Nov 7 11:42:58 2023
Total Time: in seconds: 0
--------------------------------------------------------------------
CPU1
Cache Test
Module Version: 1.0.23.64b.W
Start Time: Tue Nov 7 11:42:58 2023
Test Result - PASS

--- Reading Cache Size ---

- Detected L1 Data Cache Size --> 48
- Detected L1 Inst Cache Size --> 32

- Detected L2 Cache Size --> 2048
- Detected L3 Cache Size --> 36864

Cache Size Test Passed!!!

End Time: Tue Nov 7 11:42:58 2023
Total Time: in seconds: 0
--------------------------------------------------------------------
CPU1
MMXSSE Test
Module Version: 1.0.29.64b.W
Start Time: Tue Nov 7 11:42:58 2023
Test Result - PASS

DetectUtils64 DLL Version - 1.1.8

--- Determining MMX - SSE capabilities ---
..MMX is supported on this CPU..
..SSE is supported on this CPU..
..SSE2 is supported on this CPU..
..SSE3 is supported on this CPU..
..SSSE3 is supported on this CPU..
..SSE4.1 is supported on this CPU..
..SSE4.2 is supported on this CPU..

MMX Test Result --- PASS
SSE Test Result --- PASS
SSE2 Test Result --- PASS
SSE3 Test Result --- PASS
SSSE3 Test Result --- PASS
SSE4.1 Test Result --- PASS
SSE4.2 Test Result --- PASS

End Time: Tue Nov 7 11:42:58 2023
Total Time: in seconds: 0
--------------------------------------------------------------------
CPU1
IMC Test
Module Version: 1.0.25.64b.W
Start Time: Tue Nov 7 11:42:58 2023
Test Result - PASS
--- Reading Memory Size ---
Detected Memory Size is --> 32.00GB
--- Subtest - Memory Size Test Passed!!! ---
--- Integrated Memory Controller Stress Test ---
Memory to be allocated = 1048576 bytes
Memory Allocated.
Test 1 Ones and Zeros Moving Inversions write operations - Passed
Test 1 Ones and Zeros Moving Inversions verification operations - Passed
Test 2 32Bits Sliding Ones write operations - Passed
Test 2 32Bits Sliding Ones verification operations - Passed
Test 3 32Bits Sliding Zero write operations - Passed
Test 3 32Bits Sliding Zero verification operations - Passed
Memory Deallocated.
--- Subtest - Memory Stress Test Passed!!! ---
Integrated Memory Controller Test Passed!!!
End Time: Tue Nov 7 11:42:59 2023
Total Time: in seconds: 1
--------------------------------------------------------------------
CPU1
Parallel_PrimeNum Test
Module Version: 1.0.0.12
Start Time: 07/11/2023 11:42:59
Test Result - PASS

Module Math_PrimeNum.exe Completed - Pass


Prime Number Generation Test
Module Version: 1.0.28.64b.W
Start Time: Tue Nov 7 11:42:59 2023
DetectUtils64 DLL Version - 1.1.8
AVX is supported in your OS
Max AVX supported = AVX2
Test Result - PASS
Operation Per Second: 3444325
Error: 0
--- Prime Number Generation Test Passed!!!---
End Time: Tue Nov 7 11:43:44 2023
Total Time: in seconds: 45

Parallel_PrimeNum
End Time: 07/11/2023 11:43:44
--------------------------------------------------------------------
CPU1
Parallel_FP Test
Module Version: 1.0.0.12
Start Time: 07/11/2023 11:43:44
Test Result - PASS

Module AVX.exe Completed - Pass
Module Math_FP.exe Completed - Pass


AVX Test
Module Version: 2.0.30.64b.W
Start Time: Tue Nov 7 11:43:44 2023
Test Result - PASS

DetectUtils64 DLL Version - 1.1.8

--- CPU Features Detection ---
..AVX is supported by this CPU..
..AVX2 is supported by this CPU..
..AVX512BW is NOT supported by this CPU..
..AVX512CD is NOT supported by this CPU..
..AVX512DQ is NOT supported by this CPU..
..AVX512ER is NOT supported by this CPU..
..AVX512F is NOT supported by this CPU..
..AVX512IFMA52 is NOT supported by this CPU..
..AVX512PF is NOT supported by this CPU..
..AVX512VBMI is NOT supported by this CPU..
..AVX512VL is NOT supported by this CPU..
..AES is supported by this CPU..
..PCLMULQDQ is supported by this CPU..

..AVX is supported by this Operating System..

Most Advanced AVX Feature Detected.. AVX2
AVX2 Test Result --- PASS
AES Test Result --- PASS
PCLMULQDQ Test Result --- PASS

End Time: Tue Nov 7 11:44:29 2023
Total Time: in seconds: 45


Floating Point Test
Module Version: 1.0.27.64b.W
Start Time: Tue Nov 7 11:43:44 2023
DetectUtils64 DLL Version - 1.1.8
AVX is supported in your OS
Max AVX supported = AVX2
FMA3 supported
Test Result - PASS
Million Floating Points per Second, MFLOPS: 123.565
Error: 0
--- Floating Point Test Passed!!!---
End Time: Tue Nov 7 11:44:29 2023
Total Time: in seconds: 45

Parallel_FP
End Time: 07/11/2023 11:44:29
--------------------------------------------------------------------
CPU1
Parallel_Math Test
Module Version: 1.0.0.12
Start Time: 07/11/2023 11:44:29
Test Result - PASS

Module Math_PrimeNum.exe Completed - Pass
Module FMA3.exe Completed - Pass


Prime Number Generation Test
Module Version: 1.0.28.64b.W
Start Time: Tue Nov 7 11:44:29 2023
DetectUtils64 DLL Version - 1.1.8
AVX is supported in your OS
Max AVX supported = AVX2
Test Result - PASS
Operation Per Second: 3366107
Error: 0
--- Prime Number Generation Test Passed!!!---
End Time: Tue Nov 7 11:45:14 2023
Total Time: in seconds: 45


FMA3 Test
Module Version: 1.0.28.64b.W
Start Time: Tue Nov 7 11:44:29 2023
Test Result - PASS

DetectUtils64 DLL Version - 1.1.8

--- CPU Features Detection ---
..FMA3 is supported by this CPU..
..FMA3 is supported by this Operating System..

FMA3 Test Result --- PASS

End Time: Tue Nov 7 11:45:14 2023
Total Time: in seconds: 45

Parallel_Math
End Time: 07/11/2023 11:45:14
--------------------------------------------------------------------
CPU1
Parallel_GPUStressW Test
Module Version: 1.0.0.12
Start Time: 07/11/2023 11:45:14
Test Result - PASS

Module Math_PrimeNum.exe Completed - Pass
Module Math_FP.exe Completed - Pass


Prime Number Generation Test
Module Version: 1.0.28.64b.W
Start Time: Tue Nov 7 11:45:14 2023
DetectUtils64 DLL Version - 1.1.8
AVX is supported in your OS
Max AVX supported = AVX2
Test Result - PASS
Operation Per Second: 3444099
Error: 0
--- Prime Number Generation Test Passed!!!---
End Time: Tue Nov 7 11:45:44 2023
Total Time: in seconds: 30


Floating Point Test
Module Version: 1.0.27.64b.W
Start Time: Tue Nov 7 11:45:14 2023
DetectUtils64 DLL Version - 1.1.8
AVX is supported in your OS
Max AVX supported = AVX2
FMA3 supported
Test Result - PASS
Million Floating Points per Second, MFLOPS: 1.885
Error: 0
--- Floating Point Test Passed!!!---
End Time: Tue Nov 7 11:45:44 2023
Total Time: in seconds: 30

Parallel_GPUStressW
End Time: 07/11/2023 11:45:44
--------------------------------------------------------------------
CPU1
DGEMM Stress Test
Module Version: 1.0.16.64b.W
Start Time: Tue Nov 7 11:45:44 2023
Test Result - PASS

DetectUtils64 DLL Version - 1.1.8

--- CPU Features Detection ---
..AVX is supported by this Operating System..

Most Advanced AVX Feature Detected.. AVX2
DGEMM AVX2 Test Result --- PASS

End Time: Tue Nov 7 11:46:44 2023
Total Time: in seconds: 60
--------------------------------------------------------------------
CPU1
Frequency Check
Module Version: 1.0.7.64b.W
Start Time: Tue Nov 7 11:46:44 2023
Test Result - NO COMPARE OPTION USED
Processor Name: 13th Gen Intel(R) Core(TM) i9-13900K
Measured Processor Frequency: 2.994793
End Time: Tue Nov 7 11:46:45 2023
Total Time: in seconds: 1
--------------------------------------------------------------------

System Information
------------------
Processor Name: 13th Gen Intel(R) Core(TM) i9-13900K
Processor Information: Family 6 Model B7 Stepping 1
Number of Physical Cores: 24
Number of Logical Cores: 32
Operating System: Microsoft Windows 11 Home 64-bit
Graphics Information: NVIDIA GeForce RTX 4090
--------------------------------------------------------------------

--- IPDT64 - End Time: 07/11/2023 11:46:45
--- IPDT64 - Result: Pass
--------------------------------------------------------------------
 

PC Tailor

Illustrious
Ambassador
In the end, they are now returning the PC for a refund. Unless anyone else has anything to add, I'd narrow the potential causes down to these:
  • Excess cooler mounting pressure on the CPU.
  • Faulty/damaged CPU.
  • Memory instability at 6000.
  • PCSpecialist not being totally honest with the repairs they've made and either the memory or motherboard are still at fault.
If anyone else thinks differently, I am all ears! If not, I will mark the thread as resolved.