Hi everyone! I'm facing quite a bizarre issue with an Intel 11th gen platform I built some time ago for my dad. The build initially consisted of:
- Intel Core i5 11600k CPU (no OC so far - all settings are left stock)
- Gigabyte Z590 UD AC Motherboard
- Crucial 16GB (2x8GB) 3000MHz CL15 Ballistix Sport LT Gray (BLS2K8G4D30AESBK) Memory Kit (a reliable leftover from my own AMD Ryzen build - I'll refer to it as RAM kit #1)
- Samsung 980 1TB SSD
- Thermalright Macho Rev.C Plus CPU Cooler
- XPG Core Reactor 750W 80 Plus Gold PSU
- MSI Geforce 1660 Super Gaming X GPU
- Crucial Ballistix 16GB (2x8GB) DDR4 3200MHz (BL2K8G32C16U4B) - I'll refer to it as RAM kit #2
Now here's where things get interesting. After some time my dad complained again, this time that the game he plays most (World of Tanks) had started to crash randomly. I initially believed it was a game-related issue introduced by some patch, but after trying Cyberpunk 2077, I confirmed that it also crashed randomly for me. I reinstalled the graphics drivers using DDU and checked again, but the issue was still present. I ran OCCT (for the CPU) and Furmark (for the GPU) separately, and everything seemed to work just fine, including the temperatures. It was only after I decided to run both benchmarks simultaneously that, within the first few minutes, OCCT reported multiple errors on the CPU's physical cores and began to crash. From then on I knew how to reliably trigger the issue: just test under full system load. Running Intel's Processor Diagnostic Tool alongside Furmark even caused a BSOD with a service exception in dxgkrnl.sys (which I believe points to the GPU driver, though it's definitely not the source of this whole issue), and later OCCT combined with Furmark proved able to trigger BSODs as well.
I was pretty mad about this because I assumed that the CPU was somehow damaged (I know that's hard to do, but I suspected that maybe the motherboard had applied some stupid voltage on the auto preset and degraded it within these few months). Surprisingly, after messing with the PC some more, it turned out that disabling XMP fixes the crashing issue completely. However, the weirdest thing is that turning XMP back on doesn't trigger the issue immediately. The PC will work perfectly fine for a while, only to start crashing again after some more time. Moreover, so far I have been unable to trigger this issue using RAM kit #1 with XMP enabled. It's just rock solid when testing and gaming, and the only difference I could see in the sensor readings was a slightly higher bus frequency that the motherboard sets for kit #1. I've checked kit #2 using memtest and it seems to be completely issue-free on its own. I've even updated the BIOS to the latest F7b version. Still, no luck.
At this point I'm quite at a loss as to what to do with all of this, and I'm humbly asking for your advice. Currently the PC works fine with kit #1 (aside from the long POST issue), but I feel like I should RMA something; I'm just not sure whether it should be the CPU or the motherboard. I know XMP is not 100% guaranteed to run on every system, but I own a few builds and this is the first time I've seen a reliable platform like Z590, with a popular and recognized kit (which these Ballistix sticks surely are), have such a ridiculous stability problem at stock CPU settings. Am I missing something obvious here? Or should I suspect some other component as well? For reference, here are the screenshots of test runs with both kits #1 and #2 with XMP enabled, along with as many sensor readings as I could fit on screen:
Tl;dr version:
- The PC takes quite a long time to POST (I specifically mean POST, Windows startup is pretty quick); the LED indicates long memory checks, regardless of memory kit, number of sticks, slot used, or XMP profile
- With memory kit #2, applications or even the whole system can crash, but only under full system load (both the CPU and GPU need to be under load)
- Disabling XMP fixes the issue, and re-enabling it does not bring the issue back immediately (but it will eventually come back)
- With memory kit #1 and XMP enabled, there's no crashing
- Memory kit #2 seems to be just fine when checking for errors
- The GPU drivers and BIOS are already up to date
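
For anyone comparing the two kits on paper, here is a rough sketch of the rated XMP numbers. Kit #1's 3000MHz CL15 is taken from the parts list above; CL16 for kit #2 is only an assumption read off the "32C16" in its part number, and the function names are made up for illustration. It uses the standard DDR conversion: real memory clock = data rate / 2, and CAS latency in ns = CAS cycles / clock.

```python
# Rough comparison of the two kits' rated XMP timings.
# Kit #1 values come from the parts list; CL16 for kit #2 is an ASSUMPTION
# based on the "32C16" fragment of its part number (BL2K8G32C16U4B).

def memory_clock_mhz(data_rate_mts: float) -> float:
    """DDR transfers data twice per clock, so the real clock is half the data rate."""
    return data_rate_mts / 2


def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Absolute CAS latency in nanoseconds: cycles divided by clock in MHz, times 1000."""
    return cas_cycles * 1000 / memory_clock_mhz(data_rate_mts)


# (data rate in MT/s, CAS latency in cycles)
KITS = {
    "kit #1": (3000, 15),
    "kit #2": (3200, 16),  # CL16 is assumed, see note above
}


def compare_kits(kits: dict) -> dict:
    """Return the real memory clock and absolute CAS latency for each kit."""
    return {
        name: {
            "clock_mhz": memory_clock_mhz(rate),
            "cas_ns": cas_latency_ns(rate, cl),
        }
        for name, (rate, cl) in kits.items()
    }


if __name__ == "__main__":
    for name, stats in compare_kits(KITS).items():
        print(f"{name}: {stats['clock_mhz']:.0f} MHz clock, {stats['cas_ns']:.2f} ns CAS")
```

If the CL16 assumption holds, both kits come out at the same absolute CAS latency (10 ns), so on paper the main difference is the higher memory clock kit #2 asks for (1600 MHz vs 1500 MHz). That doesn't tell me which component is at fault; it only narrows down what actually differs between the two XMP profiles.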