Question Help finding what HW is faulty

Mar 10, 2021
Hi!

I recently purchased parts to build a new system, including:

i7-10700KF
ASUS PRIME Z590M-PLUS MB
16G (2 x 8G) DDR4-3600 G-Skill Ripjaws V
Rosewill Glacier 600W PSU

I'm running Linux on it, and the only BIOS change I made was to put the MB into XMP for the 3600 RAM. I started noticing frequent crashes in the Chrome browser and in games. I ran memtest86 and it threw some RAM errors pretty quickly, so I called up the store and started an RMA process. Before shipping them back, though, I wanted to take a closer look, as there were some other oddities.
I have another system, also running Linux, that's an:
i5-7600K
GIGABYTE GA-H270N-WIFI MB
16G (2 x 8G) DDR4-2400 Ballistix

I pulled a RAM stick out of that system, put it in the new system, and got some errors. I then tried all sorts of combinations of one or two sticks of the new RAM in various slots and eventually concluded that the new RAM was faulty, since I was able to run memtest on the old RAM in the proper single-stick slot for a few hours with no errors. I attributed the one fault I did see with the old RAM to wrong BIOS settings from when I was messing with the new RAM, but I can't rule out that it was an actual fault that just shows up more rarely at the lower RAM speed.

The other odd things I found are:

- The kernel reports an MCE consistently on boot, but I can't seem to find info on what exact problem that is:
Code:
Mar 10 12:30:45 server kernel: [    0.227974] mce: [Hardware Error]: Machine check events logged
Mar 10 12:30:45 server kernel: [    0.227979] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: ee0000000040110a
Mar 10 12:30:45 server kernel: [    0.227983] mce: [Hardware Error]: TSC 0 ADDR fef20200 MISC 43880000086
Mar 10 12:30:45 server kernel: [    0.227991] mce: [Hardware Error]: PROCESSOR 0:a0655 TIME 1615408240 SOCKET 0 APIC 0 microcode e2
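
For anyone who wants to pick the status word apart, here is a rough Python sketch that splits `ee0000000040110a` into its flag bits and error codes. The bit positions and the "cache hierarchy" code layout (`000F 0001 RRRR TTLL`) are my reading of the machine-check chapter of the Intel SDM, so treat them as an assumption to verify; the function names are made up for illustration. It only decodes the bits and says nothing about whether the error is harmful.

```python
# Rough decoder for an IA32_MCi_STATUS value as printed in the kernel log.
# Bit positions are an assumption taken from my reading of the Intel SDM;
# double-check them before relying on the output.

STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context corrupt
}

REQUESTS = ["ERR", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]
TYPES = ["instruction", "data", "generic", "reserved"]
LEVELS = ["L0", "L1", "L2", "generic"]


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status word into flag names and the two 16-bit codes."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
    }


def describe_cache_error(code: int):
    """Decode the cache-hierarchy form 000F 0001 RRRR TTLL, or return None."""
    if code & 0xEF00 != 0x0100:  # bit 12 (F) is ignored for the match
        return None
    request = (code >> 4) & 0xF
    return {
        "request": REQUESTS[request] if request < len(REQUESTS) else "reserved",
        "type": TYPES[(code >> 2) & 0x3],
        "level": LEVELS[code & 0x3],
    }


status = decode_mci_status(0xEE0000000040110A)
print(status["flags"], hex(status["mca_error_code"]))
print(describe_cache_error(status["mca_error_code"]))
```

If that layout is right, the logged value reads as a generic cache hierarchy error at the L2 level, flagged as uncorrected, with the EN bit clear.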

- memtest86 indicates horrible cache and RAM speeds on the new i7 compared to the old i5, but I don't know what other factors are involved or whether that's reliable info.

Old system, with 1 stick of the old RAM
Code:
PassMark MemTest86 V9.0 Free
Intel Core i5-7600K @ 3.80GHz
Clk/Temp: 3793 MHz / 76C
L1 Cache : 64K 247.4 GB/s
L2 Cache : 256K 111.0 GB/s
L3 Cache : 6144K 61.6 GB/s
Memory: 8085M 14.9 GB/s
RAM Info : PC4-19200 DDR4 XMP 2400MHZ / 16-16-16-39 / Cruc
CPUS Found: 4
CPUS Started: 4
CPUS Active: 4

New system with 1 stick of the old RAM
Code:
PassMark MemTest86 V9.0 Free
Intel Core i7-10700KF @ 3.80GHz
Clk/Temp: 3832 MHz / 24C
L1 Cache : 64K 42.8 GB/s
L2 Cache : 256K 18.2 GB/s
L3 Cache : 16384K 9.8 GB/s
Memory: 8073M 6288 MB/s
RAM Info : PC4-19200 DDR4 XMP 2400MHZ / 16-16-16-39 / Cruc
CPUS Found: 16
CPUS Started: 8
CPUS Active: 8
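
To put a number on the gap with the same stick of old RAM in both machines, here is a small Python sketch that divides the old-system figures by the new-system ones. The values are copied from the two memtest screens above; converting 6288 MB/s at 1000 MB per GB is an assumption about how memtest86 reports units, and the dictionary and function names are just for illustration.

```python
# Bandwidth figures in GB/s, copied from the two MemTest86 screens above
# (one stick of the old DDR4-2400 RAM in each machine).
old_i5 = {"L1": 247.4, "L2": 111.0, "L3": 61.6, "RAM": 14.9}
new_i7 = {"L1": 42.8, "L2": 18.2, "L3": 9.8, "RAM": 6288 / 1000}  # MB/s -> GB/s (assumed decimal)


def slowdown(reference: dict, candidate: dict) -> dict:
    """Return how many times slower the candidate is for each level."""
    return {level: round(reference[level] / candidate[level], 2) for level in reference}


ratios = slowdown(old_i5, new_i7)
for level, ratio in ratios.items():
    print(f"{level}: old i5 is {ratio}x faster")
```

The three cache levels all come out around 6x slower while RAM is closer to 2.4x. My guess is that a near-uniform cache ratio points at the cores themselves running slowly rather than at the memory, since cache bandwidth follows core clock, but that is an inference and not something memtest reports directly.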

Then I took the new RAM and put it in the old system and it ran a bit better, but did find errors in test 6 after a half-hour or so:

Old system with 2 sticks of new RAM:
Code:
PassMark MemTest86 V9.0 Free
Intel Core i5-7600K @ 3.80GHz
Clk/Temp: 3793 MHz / 65C
L1 Cache : 64K 245.7 GB/s
L2 Cache : 256K 111.2 GB/s
L3 Cache : 6144K 63.5 GB/s
Memory: 15.8G 20.2 GB/s
RAM Info : PC4-28800 DDR4 XMP 3602 MHZ / 16-19-19-39 /G Sk RAM Temp
CPUS Found: 4
CPUS Started: 4
CPUS Active: 4

New system with 2 sticks of new RAM:
Code:
PassMark MemTest86 V9.0 Free
Intel Core i7-10700KF @ 3.80GHz
Clk/Temp: 3832 MHz / 28C
L1 Cache : 64K 43.1 GB/s
L2 Cache : 256K 18.2 GB/s
L3 Cache : 16384K 9.8 GB/s
Memory: 15.8G 6403 MB/s
RAM Info : PC4-28800 DDR4 XMP 3602 MHZ / 16-19-19-39 / G Sk
CPUS Found: 16
CPUS Started: 8
CPUS Active: 8

So my conclusion now is that at least one stick of the new RAM is faulty, as it's reporting errors on both the new and old system. However, what I don't get is:
  • Why there's an MCE error on boot
  • What that error means and if it's a concern at all
  • Why my cache and RAM speeds are so horrible on the new i7 compared to the old i5
  • Are there any BIOS settings that would affect cache/RAM speeds? I tried both Auto and XMP I; both seem to perform the same and configure the RAM for 3600MHz, 16-19-19-19 1.35V, which is exactly as spec'd on the sticks.
  • Are there any other tests I can do (from Linux or from a boot stick) that could help pinpoint MB or CPU issues if it's not already evident from the above?
Can anyone with more experience provide some insight?
 
Mar 10, 2021
I tried a bench test out of the case to see if there were any grounding issues and saw similar results to in-case.

After that, however, I decided to go digging through the BIOS settings, flipping a bunch of stuff around, and saw a huge performance increase. By process of elimination, it seems to be the Intel Speed Shift setting that's causing the issue. I reloaded the BIOS optimized defaults and then disabled only Speed Shift (left SpeedStep on Auto and Turbo Mode enabled), and now the Geekbench score is far closer to being in line with other results for the same CPU.

The memtest cache and RAM speeds have gone way up as well:
Code:
L1 Cache: 64K 200.3 GB/s
L2 Cache: 256K 89.2 GB/s
L3 Cache: 16384K 45.0 GB/s
Memory: 15.8G 17.0 GB/s

Still not exactly where I'd expect them to be, considering my much older i5-7600K system gets a fair bit better results, but light years better than with Speed Shift enabled.
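
Since the fix turned out to be a frequency-scaling setting, one check I can do from Linux is to read what the kernel reports for each core under load. Below is a minimal Python sketch that reads the standard cpufreq sysfs files (`scaling_driver`, `scaling_governor`, `scaling_cur_freq`, `cpuinfo_max_freq`); the function name `read_cpufreq` is made up, and which files exist depends on the kernel and scaling driver, so missing ones are returned as `None`.

```python
from pathlib import Path

# Standard cpufreq location in sysfs; values in the *_freq files are in kHz.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(root: Path = CPU_ROOT) -> dict:
    """Collect driver, governor, and current/max frequency (MHz) per CPU."""
    result = {}
    for cpu_dir in sorted(root.glob("cpu[0-9]*")):
        freq_dir = cpu_dir / "cpufreq"
        if not freq_dir.is_dir():
            continue  # core is offline or cpufreq is not available

        def read(name: str):
            path = freq_dir / name
            return path.read_text().strip() if path.exists() else None

        cur = read("scaling_cur_freq")
        top = read("cpuinfo_max_freq")
        result[cpu_dir.name] = {
            "driver": read("scaling_driver"),
            "governor": read("scaling_governor"),
            "cur_mhz": int(cur) / 1000 if cur else None,
            "max_mhz": int(top) / 1000 if top else None,
        }
    return result
```

Calling `read_cpufreq()` while a benchmark is running and printing the result should show whether `cur_mhz` ever gets near `max_mhz`; cores that stay pinned at a low clock would match the slow cache numbers seen with Speed Shift enabled.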

Does anyone know whether this is indicative of a bad CPU or if there's anything further I should do to debug it?