Question Problems with Overclocking RAM, is it even worth it?

Minaz

Commendable
Sep 20, 2021
118
4
1,585
I have a Gigabyte Aorus mobo with 64 GB DDR4 4000MHz RAM. It came with 32GB, and I added another 32GB of the exact same mfg, make, timing (I know this is the first question I'll be asked, but I checked and double checked - it's the exact same 4 pieces of RAM).

The mobo defaults to 2400MHz. Since I have 4000MHz RAM, I thought to enable the XMP profile. There is only one profile in BIOS when I enable XMP, called Profile 1. It lists the RAM as 4000MHz. But after enabling it, I would get random BSOD (clock watchdog timeout). I am going to be fair and say that I did encounter BSOD before installing the RAM, but it seemed to go away with a BIOS update. I cannot be sure of this as the computer is relatively new and BSOD is tricky to reproduce. If I just leave the computer on, it can go hours (days) without BSOD. DisplayCal seems to cause it to BSOD more often, but I cannot be sure.

I read somewhere that I could try lowering the clock speed of the RAM. At last count, 3600 still caused BSOD or hard crash without BSOD (particularly with DisplayCal).

I have just lowered it further to 3400.

But my question here is: is it even worth it? I mean, I really would love to get the performance I paid for with the expensive 4000MHz RAM, but if I keep getting BSOD, I would stick with disabling XMP and just living with the reduced speed. Is it even that much reduced for that matter, how much would I notice the difference? OTOH I will surely notice the difference when a BSOD occurs!

If it is worth it, then there seems to be a lot to do, with timings and whatnot, and all the while the only way to know if it worked is to wait for a BSOD to maybe or maybe not eventually happen. Seems like a pain, so I'd do it only if it's worth the gain.

Any help or suggestions would be appreciated, thanks!
 
Full system specs please including make and model of psu.

Have you made sure to install the mother board drivers themselves?

Have you tried running memtest?
 

Minaz

Full system specs please including make and model of psu.
See below, PSU is the NZXT 1000W (C1000 Gold).

Have you made sure to install the mother board drivers themselves?
With regards to drivers, I've updated the bios to the latest version, and the audio drivers. There are a few more drivers available (https://www.gigabyte.com/Motherboard/Z590-AORUS-PRO-AX-rev-10/support#support-dl-utility) but they don't seem related to the issue at hand, I'd be okay to install them if there is a chance to help, but I am wary of installing random drivers especially if I have no idea what they do (for example, there is for some reason a Norton Internet Security patch on the list, and I don't use that). Some require me to install Intel APP center. Do you recommend this?

Edit: I've gone ahead and installed all the driver updates for the mobo except for the Utilities (which require app center).

Have you tried running memtest?
Yes, I ran memtest and also some other memory testing utilities (exact name escapes me right now, but it came highly recommended, I'll update this post when I find it again). It passed. I should make note that when I first got the system with 32GB, it already had the instability, and I ran memtest and the other tools back then, where it passed. I lowered the timing and the problem went away for a while so I thought I fixed it. Turns out I wasn't patient enough. When I added the extra 32GB, it still continued working fine, but eventually it BSOD again. Most notably, it would quite often BSOD right after a Windows update.

Here are the specs I got from System Information in Windows:

OS Name Microsoft Windows 11 Pro
Version 10.0.22000 Build 22000
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name
System Manufacturer Gigabyte Technology Co., Ltd.
System Model Z590 AORUS PRO AX
System Type x64-based PC
System SKU Default string
Processor 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz, 3504 Mhz, 8 Core(s), 16 Logical Processor(s)
BIOS Version/Date American Megatrends International, LLC. F9a, 1/24/2022
SMBIOS Version 3.3
Embedded Controller Version 255.255
BIOS Mode UEFI
BaseBoard Manufacturer Gigabyte Technology Co., Ltd.
BaseBoard Product Z590 AORUS PRO AX
BaseBoard Version Default string
Platform Role Desktop
Secure Boot State Off
PCR7 Configuration Elevation Required to View
Windows Directory C:\WINDOWS
System Directory C:\WINDOWS\system32
Boot Device \Device\HarddiskVolume5
Locale United States
Hardware Abstraction Layer Version = "10.0.22000.778"
User Name
Time Zone Pacific Daylight Time
Installed Physical Memory (RAM) 64.0 GB
Total Physical Memory 63.9 GB
Available Physical Memory 56.2 GB
Total Virtual Memory 73.4 GB
Available Virtual Memory 62.6 GB
Page File Space 9.50 GB
Page File C:\pagefile.sys
Kernel DMA Protection Off
Virtualization-based security Running
Virtualization-based security Required Security Properties
Virtualization-based security Available Security Properties Base Virtualization Support, UEFI Code Readonly, SMM Security Mitigations 1.0, Mode Based Execution Control, APIC Virtualization
Virtualization-based security Services Configured Hypervisor enforced Code Integrity
Virtualization-based security Services Running Hypervisor enforced Code Integrity
Windows Defender Application Control policy Enforced
Windows Defender Application Control user mode policy Off
Device Encryption Support Elevation Required to View
A hypervisor has been detected. Features required for Hyper-V will not be displayed.
 
I would first try to get the system stable with one kit of RAM, and only then add another. Even though the new kit is the same model as the original, you are mixing kits, and that can cause problems even with identical models; it's why RAM is sold in matched kits. See the "odd man out" section here: https://forums.tomshardware.com/faq...y-ram-and-xmp-profile-configurations.3398926/

While you have 2 kits installed you don’t know if it’s the mixing kits or something else causing your stability issues.

Does the motherboard compatibility list include any 4-DIMM kits at 4000MHz? When running 4 DIMMs, the maximum stable speed can be lower.
 

Minaz

I previously posted a thread in overclocking, but I've come to strongly suspect overclocking is not the issue, so I think this is a better subforum for the post.

I include here a more complete history of everything that happened since I bought the PC. I bought the system new from NZXT last year December-ish. It was meant to replace a dying PC (which is now dead), but since the other PC lasted a bit longer than expected, it mostly just sat idle most of the time, although I did test it with various minor tasks to see that it worked, but never for longer than an hour or so at a time. It gave a BSOD about a week into delivery, and so I called support and they suggested that I update the BIOS. That seemed to solve the problem for a while. However, I never could be quite sure because the PC was mostly idle.

At some point, I realized that the memory was not clocked at its full potential. My RAM was rated at 4000MHz, but the default was set at 2400MHz. I therefore enabled the XMP profile, after which I got a BSOD. On the forums, I was told to test the memory and lower the timings. I didn't really understand what "lowering the timings" meant, and so far the way I have been doing it is to adjust the system memory multiplier, bringing it down from 4000 to 3800, etc.
This seemed to solve the issue for a while. Then there was a sale of memory on NewEgg so I added another 32GB of RAM. I got the exact same type of RAM from the same Manufacturer (I checked the model and specs carefully). I got more BSOD, but because of the infrequent use, I wasn't sure if it was a continuation of the old problem or a new one from installing more RAM. Again the suggestion was to test the RAM, which I did, but with no errors found.

I then completely reset the BIOS to "optimized defaults" so all previous overclocking is erased. No luck - I still get the CWT (Clock Watchdog Timeout) BSOD. On advice, I then raised the Vcore to 1.4 (which Google suggests as a conservative "safe" max), but I still got the CWT BSOD. (This is with default BIOS other than VCore being raised.)

Actually, I am a bit desperate now because some of the things I assumed would fix the error didn't. I had always imagined that resetting everything as a last resort would solve the BSOD in a pinch, and I am mystified that it didn't.

One other thing someone suggested which I am beginning to regret a little was to update the motherboard drivers. I updated all the drivers listed on my specific mobo from the manufacturer website. Not only did that not solve the problem, I am wondering if it could have caused new ones.
Besides the CWT BSOD, I also sometimes get WHEA Uncorrectable and Kernel Security BSODs, but mostly CWT. Sometimes the system even hard crashes or hard reboots without a BSOD.
Although the BSOD timing is quite unpredictable, I found that one thing that causes a BSOD (CWT) almost 100% of the time is copying a large number of files. I am trying to restore my old files from a backup of the old (dead) PC. These files are both on an external HDD and on a network server in my home. Each time I copy the files, my PC crashes a few minutes into the copying process. I can sometimes play games on the PC all day without crashing, but copying those files guarantees a crash within 30 mins. I first thought it was an HDD or USB issue, but copying the same files from the network server causes the crash too, and changing the destination (from an external HDD to an internal SSD, for instance) doesn't solve the problem. So regardless of whether I am copying from the backup HDD or the network HDD, and whether the destination is an external HDD or an internal SSD, the system will BSOD within, say, 15-30 mins.
Another thing that seems to crash the system is running DisplayCal to calibrate the monitor color. About half the time this causes a BSOD. Other than that, watching YouTube videos seems to be able to cause a BSOD as well (although that could be a coincidence), and sometimes it can BSOD just sitting there doing nothing (probably some resident program running in the background is my guess). But it is the first two that I can safely say will reliably trigger the BSOD.

Playing games does not seem to crash the system. As mentioned, I have run at least 4 different recommended memory diagnostic tools, but none have crashed the system or turned up any errors. So if memory is the culprit, it is something the memory tests must all be missing or not testing for.
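Since the file copy is the one trigger that reproduces reliably, it may help to turn it into a repeatable test. Below is a minimal Python sketch, not anything from the thread itself: it copies a folder one file at a time, prints each file as it finishes (so the last line on screen shows roughly where a crash hit), and compares SHA-256 hashes of source and copy to flag silent corruption. The function names and the example paths are made up for illustration. One caveat: the re-read of the copy may be served from the OS file cache, so a clean result does not prove the data on disk is good.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src_dir: Path, dst_dir: Path) -> list[Path]:
    """Copy every file under src_dir into dst_dir.

    Returns the source files whose copy does not hash to the same value.
    """
    mismatches = []
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        if sha256_of(src) != sha256_of(dst):
            mismatches.append(src)
        # Flush so the progress line is visible even if the machine dies next.
        print(f"copied {src.relative_to(src_dir)}", flush=True)
    return mismatches


# Example call (hypothetical paths, adjust to your own drives):
# bad = copy_and_verify(Path(r"E:\backup"), Path(r"D:\restore"))
# print("hash mismatches:", bad)
```

If the crash always lands after a similar amount of data regardless of which files are involved, that would point away from any particular file and toward the platform under sustained I/O load.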


Here is the config for my system, including CPU:

OS Name Microsoft Windows 11 Pro
Version 10.0.22000 Build 22000
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name MARKS-NZXT
System Manufacturer Gigabyte Technology Co., Ltd.
System Model Z590 AORUS PRO AX
System Type x64-based PC
System SKU Default string
Processor 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz, 3504 Mhz, 8 Core(s), 16 Logical Processor(s)
BIOS Version/Date American Megatrends International, LLC. F9a, 1/24/2022
SMBIOS Version 3.3
Embedded Controller Version 255.255
BIOS Mode UEFI
BaseBoard Manufacturer Gigabyte Technology Co., Ltd.
BaseBoard Product Z590 AORUS PRO AX
BaseBoard Version Default string
Platform Role Desktop
Secure Boot State Off
PCR7 Configuration Elevation Required to View
Windows Directory C:\WINDOWS
System Directory C:\WINDOWS\system32
Boot Device \Device\HarddiskVolume5
Locale United States
Hardware Abstraction Layer Version = "10.0.22000.778"
User Name MARKS-NZXT\markw
Time Zone Pacific Daylight Time
Installed Physical Memory (RAM) 64.0 GB
Total Physical Memory 63.9 GB
Available Physical Memory 55.8 GB
Total Virtual Memory 73.4 GB
Available Virtual Memory 62.4 GB
Page File Space 9.50 GB
Page File C:\pagefile.sys
Kernel DMA Protection Off
Virtualization-based security Not enabled
Device Encryption Support Elevation Required to View
Hyper-V - VM Monitor Mode Extensions Yes
Hyper-V - Second Level Address Translation Extensions Yes
Hyper-V - Virtualization Enabled in Firmware Yes
Hyper-V - Data Execution Protection Yes

As I said, I am quite desperate for any clue right now, if there is anything I can provide that might help, please let me know! Thank you again!
 

Karadjgne

Titan
Ambassador
I added another 32GB of the exact same mfg, make, timing (I know this is the first question I'll be asked, but I checked and double checked - its the exact same 4 pieces of RAM).

And that is why you fail. You are incorrect. Ram underneath the heatsink is a pcb with silicon chips attached. In kit #1, all those silicon chips came from the same silicon sheet. In kit #2, all those chips came from a different silicon sheet.

The ram is the same on the outside, but has very different specs on the inside as the levels and composition of impurities in each sheet of silicon will change the Secondary and Tertiary timings.

The only thing they have in common is the Primary timings, model number, speed/size and heatsink color.

Under normal circumstances, that CPU shouldn't have any issues running 4 sticks. Circumstances are not always normal. You may have to turn on XMP, manually set the DRAM voltage to about 1.38V ± 0.02V, and possibly give VCCIO a small bump, to maybe 1.2V, and VCCSA to maybe 1.3V, ± 0.05V.
 

Minaz

Thanks. I am beginning to wonder if it's not the RAM at all and it is a false lead. My reasons for this are:
  • running certain apps such as games which normally use a fair amount of RAM never (almost never?) cause BSOD
  • running memory tests (I've run 4 different ones overnight so far) never detect errors or BSOD
  • I have already reset bios to stock, so overclocking wouldn't be the issue per se (xmp is now off)
  • the one thing that 100% causes the BSOD is copying a large number of files from one hard drive to another. I have a backup hard drive that I am trying to copy to my internal. Doing this causes BSOD 100% of the time.
Or would you still think RAM might be the cause for another reason?
 

Karadjgne

What's the bsod? Any critical errors listed in Event viewer other than sudden loss of power? Do that enough you raise chances of corruption.

Try opening CMD with Admin privileges.
Type : sfc /scannow
Type : dism /online /cleanup-image /restorehealth

See if they find anything.
 

Minaz

I get one of three different BSOD.
Most common is CLOCK_WATCHDOG_TIMEOUT.
Otherwise I might get WHEA_UNCORRECTABLE_ERROR or KERNEL_SECURITY_CHECK_FAILURE.
Let me know if minidumps would help, I have started saving them and uploading them online in case they turn up useful in diagnosing the problem.

As for the sfc and dism checks, they are now reporting normal (I've fixed issues in the past, but currently the BSODs are continuing and there are no problems picked up by those two checks).
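For anyone collecting minidumps to share, here is a small Python sketch that lists the dump files newest first with their timestamps, which makes it easier to match each dump to a specific crash. It assumes the dumps are in the default small-dump folder, C:\Windows\Minidump; if Windows is configured to write them elsewhere, pass that folder instead. The function name is just for illustration.

```python
from datetime import datetime
from pathlib import Path


def list_minidumps(dump_dir: Path) -> list[tuple[str, str]]:
    """Return (file name, modified time) for each .dmp file, newest first."""
    dumps = sorted(
        dump_dir.glob("*.dmp"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return [
        (p.name, datetime.fromtimestamp(p.stat().st_mtime).isoformat(timespec="seconds"))
        for p in dumps
    ]


# Example call (default location assumed; reading it may need an elevated prompt):
# for name, when in list_minidumps(Path(r"C:\Windows\Minidump")):
#     print(when, name)
```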
 
Your ram is NOT identical.

Ram is sold in kits for a reason.
A motherboard must manage all the ram using the same specs of voltage, cas and speed.
The internal workings are designed for the capacity of the kit.
Ram from the same vendor and part number can be made up of differing manufacturing components over time.
Some motherboards, can be very sensitive to this.
This is more difficult when more sticks are involved.
Ram must be matched for proper operation.

Technically, all ddr4 ram faster than 2400 or so is overclocked ram.
Voltage must be raised past the stock 1.2v to get higher speeds.
Usually, this is done by selecting a XMP profile which is embedded in the ram stick.
You may be able to reach 4000 speed, or close to it by specifying the settings yourself and increasing the ram voltage past what the xmp spec says.
Do so in small increments past the likely spec of 1.35v.
You might need to go as high as 1.5v if you are looking for ultra low cas timings.

Your post asks a very good question...
Is it worth it?
Probably not.
Exception might be use of integrated graphics or some memory intensive apps.
Otherwise think low single % difference in most apps.
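To put rough numbers on the speed question, the theoretical peak bandwidth of DDR4 is the transfer rate times 8 bytes per transfer times the number of channels. The Python sketch below works that out for the speeds mentioned in this thread, assuming dual-channel operation (which is what this platform runs, even with 4 sticks). These are paper ceilings only; they say nothing about latency or how much any given app actually gains.

```python
def peak_bandwidth_gbs(transfer_rate_mts: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    return transfer_rate_mts * 1_000_000 * bus_bytes * channels / 1e9


for rate in (2400, 3400, 3600, 4000):
    print(f"DDR4-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s")

# DDR4-2400: 38.4 GB/s
# DDR4-3400: 54.4 GB/s
# DDR4-3600: 57.6 GB/s
# DDR4-4000: 64.0 GB/s
```

The gap between 2400 and 4000 looks large on paper, but most desktop apps are not limited by memory bandwidth, which is why the real-world difference tends to be far smaller than these figures suggest.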

When you think you have the settings you want,

Run memtest86 or memtest86+
They boot from a usb stick and do not use windows.
You can download them here:
If you can run a full pass with NO errors, your ram should be ok.

Running several more passes will sometimes uncover an issue, but it takes more time.
Probably not worth it unless you really suspect a ram issue.
 

Minaz

None of those are ram related. That's all cpu voltage related and data corruption. That looks to be memory controller not getting enough voltage.
Thanks. I actually now suspect same as you suggest, that the RAM was a red herring (caused no doubt by my unease at having "tampered" with it). So if it is a memory controller issue, how would I look into it or resolve it? This sounds like something embedded in the mobo? Would that mean replacing the mobo?
 

Karadjgne

The memory controller is part of the cpu. The 2 voltages that deal with it on Intels are VCCIO and VCCSA.

Run the PC with 2 sticks and see what those 2 voltages are sitting at. Then add 0.1V to each manually when you put in the second set. VCCIO shouldn't go beyond about 1.2V, VCCSA about 1.3V, and DRAM about 1.4V.
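The arithmetic in that suggestion can be sketched in a few lines of Python: take whatever the board reports with 2 sticks, add the 0.1V step, and stop at the ceiling quoted above. The baseline readings in the example are hypothetical placeholders, not values from this system; read your own from the BIOS or a monitoring tool first.

```python
# Ceilings and step size as quoted in the post above (volts).
CEILINGS = {"VCCIO": 1.20, "VCCSA": 1.30}
STEP = 0.10


def target_voltage(rail: str, baseline: float) -> float:
    """Baseline plus the suggested step, capped at that rail's quoted ceiling."""
    return round(min(baseline + STEP, CEILINGS[rail]), 3)


# Hypothetical 2-stick readings, for illustration only.
readings = {"VCCIO": 1.05, "VCCSA": 1.25}
for rail, baseline in readings.items():
    print(f"{rail}: {baseline:.2f}V -> set {target_voltage(rail, baseline):.2f}V")
```

With these placeholder readings, VCCIO would go to 1.15V, while VCCSA would be held at the 1.30V ceiling rather than the full 0.1V bump.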