
Welcome Community to this "How To" on Stability Testing Ryzen!

CPU overclocking is generally pretty straightforward: you increase your CPU multiplier, and when you reach instability, you increase voltages until you hit your desired frequency. This is especially true with Intel CPUs.

AMD Ryzen, on the other hand, is a bit more complex when it comes to overclocking because of how the architecture works. Intel’s chips are easy because overclocking the CPU really doesn’t have a lot to do with the memory or the memory controller. You can usually push either CPU core ratios or memory ratios pretty high without one negatively affecting the other.

But with Ryzen’s “weaker” IMC (integrated memory controller) and Infinity Fabric architecture, overclocking the CPU core can sometimes destabilize the memory controller, so instability can come from multiple areas of the chip.

So, here’s what you want to do to properly stabilize your Ryzen CPU.

  1. You’ll need a suite of stress testing utilities. I recommend OCCT, Prime95 (the latest version is critical, and you’ll want to disable AVX instructions via its txt config file), and ROG RealBench.
  2. Run OCCT with an 8-10 hour stress test. (Large Data Set)
  3. Run Prime95 Blend for 8-10 hours.
  4. Run ROG RealBench for 8 hours.
  5. Run Prime95 with a memory-intensive configuration: start with Blend mode, then edit these fields: 512k small FFTs, “memory in use” = 75% of your total system memory (if this is too much for some reason, you can back down to 50%). Run this for 8-10 hours.
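
As a rough illustration of the 75% / 50% rule in step 5, here is a minimal Python sketch that converts installed RAM into a value for the Prime95 memory field. It assumes the field is entered in megabytes, so check the unit in your Prime95 version. The function name `prime95_memory_mb` is made up for this example and is not part of Prime95.

```python
def prime95_memory_mb(total_ram_gb: float, fraction: float = 0.75) -> int:
    """Return the share of system RAM to give Prime95, in MB.

    Assumption: the Prime95 memory field takes megabytes.
    fraction=0.75 follows the guide; drop to 0.50 if 75% is too much.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_ram_gb * 1024 * fraction)


if __name__ == "__main__":
    for ram_gb in (8, 16, 32):
        print(f"{ram_gb} GB RAM: 75% = {prime95_memory_mb(ram_gb)} MB, "
              f"50% = {prime95_memory_mb(ram_gb, 0.50)} MB")
```
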

Why these Stress Testers?

In all my experience with stress testers, OCCT has been an absolute beast of a tester. It’ll catch errors way faster than Prime95, and it is usually the first program to fail on an unstable overclock. So it’s a great program to start with.

Prime95 Blend is a good option because with stress tests you WANT to run a bunch of them; no single program can stress every part of a CPU. So P95 is there basically as insurance.

Same with ROG RealBench, another good stress tester that I love to run as extra insurance on stability.

Prime95 with the memory-intensive config stresses the memory significantly. This will ensure your memory and memory controller can handle your overclock.

When Stress Tests Fail:

(Note: I AM ABSOLUTELY assuming that you have already stress tested your memory and CPU at STOCK speeds (Stock = XMP profile on RAM), and you know for sure that they are stable.)

If OCCT fails on your CPU overclock, increase SOC and memory voltages to a temporarily high but safe value. This ensures that only the CPU can be the unstable part during your future runs. Then run again; if OCCT still fails, you know for sure that it's only your CPU cores that are unstable.

My "high but safe value" voltages are: SOC = 1.13v, RAM = 1.4v (if your RAM has an XMP profile of 1.4v, then run 1.45v)

If P95 fails, generally it’s due to vcore not being sufficient.

If ROG RealBench fails, but P95 and OCCT pass, that usually means either your GPU is causing issues or your CPU is still unstable.

If P95 with memory intensive config fails, that usually means your memory is unstable and/or your IMC is unstable. Increase voltage on one or both, or decrease memory speed/loosen timings if necessary.
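
The failure rules above can be summarized as a small lookup. This is only a sketch of the reasoning in this post, not a diagnostic tool; the test keys (`occt`, `prime95_blend`, `realbench`, `prime95_memory`) and the function name `triage` are names invented for the example.

```python
from typing import Iterable, List


def triage(failed: Iterable[str]) -> List[str]:
    """Map failed stress tests to the likely cause described in the guide."""
    failed = set(failed)
    hints = []
    if "occt" in failed:
        hints.append("OCCT: raise SOC and RAM voltage to a high but safe value "
                     "and rerun; if it still fails, the CPU cores are unstable")
    if "prime95_blend" in failed:
        hints.append("Prime95 Blend: vcore is probably not sufficient")
    # The RealBench hint only applies when OCCT and Prime95 Blend both pass.
    if "realbench" in failed and not ({"occt", "prime95_blend"} & failed):
        hints.append("RealBench: GPU is causing issues, or the CPU is still unstable")
    if "prime95_memory" in failed:
        hints.append("Prime95 memory config: memory and/or IMC unstable; raise "
                     "voltage on one or both, or lower memory speed / loosen timings")
    return hints


if __name__ == "__main__":
    for hint in triage({"realbench", "prime95_memory"}):
        print(hint)
```
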

Once you’ve stabilized your CPU, you can start lowering your IMC voltages, test for that, then lower your memory voltage accordingly. Remember to stress test each step.

FYI, once you're done with stress testing, go ahead and increase vcore 1-2 notches higher than your validated, error-free stress test voltage. This will ensure you are as stable as possible (in reality, no system is 100% stable). I especially recommend you do this on Ryzen because of how steep the frequency to voltage curve is. With Intel, the frequency to voltage curve is flat enough that you really don’t need to do this.
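
To make the "1-2 notches" margin concrete, here is a small sketch. The size of one notch depends on your motherboard; the 6.25 mV default below is an assumption for illustration, as is the function name `padded_vcore`, so substitute the step your BIOS actually uses.

```python
def padded_vcore(validated_v: float, notches: int = 2,
                 step_v: float = 0.00625) -> float:
    """Return the validated vcore plus a safety margin of 1-2 BIOS steps.

    step_v is an assumed step size (6.25 mV); check your own board.
    """
    if notches not in (1, 2):
        raise ValueError("the guide suggests 1 or 2 notches")
    return round(validated_v + notches * step_v, 5)


if __name__ == "__main__":
    # Hypothetical example: 1.35 V passed every stress test.
    print(padded_vcore(1.35, notches=1))
    print(padded_vcore(1.35, notches=2))
```
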

I hope this has helped you stabilize overclocking on Ryzen. Ryzen's a tricky beast to truly get stabilized. But it'll be worth it!
 

mamasan2000

Distinguished
BANNED
8-10 hours of OCCT seems like a waste of time.
If it doesn't crash in 5 minutes, I'm almost always good.
Temp reaches pretty much max temp in 3-5 minutes, even on a Closed-loop Liquid Cooler (CLC).

LLC always plays a role under load. If voltage drops too low, you crash.
 
Almost every single experienced overclocker will tell you that 5 minute stress tests are a joke.

I've had OCCT fail at the 6 hour mark. Heck, sometimes the stress testers pass, but some PROGRAMS I use still fail due to instability. That's why long stress testing is needed.

It just depends, too; some people are content with a halfway stable system just for gaming.

In reality, you should run Prime95 for around 24 hours, but nobody's got time for that. And the gains from doing that are somewhat small.
 
Yeah, a lot of folk will just OC to get the max bootable speed, so as to 'not leave any performance on the table' kinda approach. Personally I like complete stability with my CPU OC. I run my Prime testing overnight, but as you say not many will have the time to stress for 24hrs.

I'd prefer stability over an extra 200-300 MHz which 'may' be stable enough to game.
 
I'm dead set against needless and aggressive stress testing except for troubleshooting purposes. If it runs all benchmarks and regular programs, a short and sharp test that brings temps to their max is quite enough.
 
Yeah, same here. I would rather optimize my overclocks by going down a few hundred MHz and decreasing vcore significantly, so I can run all stress tests without overheating and know it's basically a rock solid overclock like the factory would do.
 

rigg42

Honorable
Well done, TechyInAZ. I think this is all excellent information. Thanks for taking the time to share. Your stress testing regimen is quite a bit more exhaustive than my own. I've done a lot of overclocking on Zen and Zen+ with a lot of different CPUs and figured I'd share some thoughts.

Typically the first thing I do after the BIOS is updated and Windows, drivers, and software are installed is run a default memtest86 pass. I use stock BIOS settings except XMP. In my experience this step is usually unnecessary on the latest BIOS versions with Zen and Zen+ chips if you are running slower than 3200 CL16, so I often skip it to save time. It is good practice, and I'm not discouraging it, but it is probably a waste of time unless you are running expensive memory with high speed and/or tight timings.

I have to disagree with you that increasing the SOC up around 1.2v is always going to help memory stability. It's actually possible it will cause instability. Sometimes you need to lower SOC voltage to get stability. I ran into this getting a 3600 CL16 B-die kit to run stable at XMP with a 1600. It was completely unstable above 1.15v SOC yet perfectly stable at 1.1v. With a different 1600 CPU and the same memory and motherboard I needed 1.2v to get it stable. This varies from chip to chip, and more isn't always better. With your typical 3000/3200 memory kits running XMP on a recent BIOS you can just leave SOC on auto IMO.

I typically start my OC by running IntelBurnTest on the default test. This helps me quickly find the sweet spot of the voltage and frequency curve for a given processor while getting a quick yet difficult stress test to pass. If it crashes outright, I add more voltage or lower the clock. If it finds instability, I know I'm close. If it passes, I move on to the next stress test. Having overclocked at least 20 different Ryzen CPUs (I lost track of the exact number a while ago), I've found this to be the best initial "proof of concept" for an overclock for my uses.

From there I use P95 26.6 small FFT for at least an hour. If no workers crash and temps stay below 80 C, I call it good. If workers or Windows crash, I add voltage and/or reduce clock until it runs for at least an hour.

From there I move on to 8 hours of RealBench with half of the system RAM. Rinse and repeat the clock/voltage adjustments if instability or crashing occurs.

Once I have passed all 3 of those tests I do a default memtest86 run and adjust SOC/DRAM voltage/timings if needed.

While I've found this process works like a charm to arrive at stable OCs relatively quickly, I'll consider running more extensive long-term testing like you've laid out in your post. I typically overclock CPU SKUs where it really makes sense, like the 1200 or 1600 on the stock coolers, so I'm not really pushing the limits most of the time. At the end of the day, any software can find instability even with stock settings, and it's impossible to test for everything.

As to the LLC question... You want to use it, but you need to BE CAREFUL, especially if you are pushing voltages near the safe limits of the CPU. You should always see some vdroop at full load. If the SVI2 TFN voltage reading rises between idle and load, you are using too much LLC. Small FFT works well for checking whether your LLC is dialed in. I like to see about 10 mV or so of droop, more if really pushing the voltage. The problem with LLC is that it can send voltage spikes to the CPU when transitioning from heavy load to light load or idle. The motherboard pumps more voltage to the CPU to compensate for the voltage sag when the CPU starts drawing a lot of current. It can't stop doing this instantaneously when the load shifts, so you'll get momentary voltage spikes equal to the amount of load compensation above what you'd see at idle. This can kill your CPU if you are not careful.
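
The droop check described above can be sketched as simple arithmetic. This assumes you set a manual voltage and read the loaded voltage from a monitoring tool by hand; the function name `check_llc` and the three labels are invented for this example, and a software reading will not show the fast spikes mentioned above.

```python
from typing import Tuple


def check_llc(set_v: float, load_v: float) -> Tuple[float, str]:
    """Compare the set voltage with the voltage read under full load.

    Returns (droop in mV, label). A positive droop means the voltage sags
    under load, which is what the post wants to see (about 10 mV).
    """
    droop_mv = round((set_v - load_v) * 1000, 1)
    if droop_mv < 0:
        label = "overshoot"  # voltage rises under load: too much LLC
    elif droop_mv == 0:
        label = "flat"       # no visible droop: consider a lower LLC level
    else:
        label = "droop"      # some sag under load, as recommended
    return droop_mv, label


if __name__ == "__main__":
    # Hypothetical readings, in volts.
    print(check_llc(1.35, 1.34))
    print(check_llc(1.35, 1.36))
```
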
 
Interesting rigg42, wow thanks for that info! So here's my situation:

I can max out my LLC to level 5, and I see no vdroop or vboost at all. But I do have offset voltage applied so my CPU does downclock when idle.

So this is bad? Does this mean that HWiNFO can't see the vboost spikes because they're so fast?

And seriously, thanks for the info about the IMC. I had no idea too much voltage on the IMC can cause instability.
 

rigg42

Honorable
HWiNFO can’t poll faster than 100ms. I doubt it would pick up the spike. You'd need to measure it with a meter. I would see what it looks like with manual voltage and set it at an LLC level that still droops a bit. I believe level 4 still droops about 10mv on the C6H at heavy load. You can then set the voltage back to offset and adjust it accordingly to compensate if it's unstable.

View: https://www.youtube.com/watch?v=Z8nFdFpuVBg



I found the SOC voltage thing by accident. I bought a 1600 that was a lottery winning chip. I decided to stick it in my personal living room 4K build and use the 1600 that was in there in a flip. After an evening of tearing my hair out trying to get the memory stable I finally tried lowering the SOC down from the 1.2v it took to get stable on the previous CPU. I had assumed the high SOC voltage would be my best case for stability but the opposite turned out to be true. I think in general higher is probably better but it certainly wasn’t for this particular CPU.
 
I kinda figured HWInfo can't detect stuff fast enough.

Ok so I have a story to share with you,

I thought my Ryzen 7 1700X had died. A few days ago, it literally started crashing in Windows and I couldn't get it to work at all. I even tried to reinstall Windows, and that gave me errors just installing it. Then I tried to get into memtest86, and it wouldn't even load.

So I bought a 3600, which actually comes today, to replace my "dead" 1700X. Well, when you shared your info about the SOC voltage, it hit me: I had left it at 1.175v when troubleshooting. I had thought, hey, if it isn't stable now, why would it be stable with a lower voltage?

So sure enough, I put the SOC voltage back to auto. Bam, everything works. The computer is fully reliable now!! Thanks for saving my 1700X man LOL.

It's just so strange that it was quite stable before, like i didn't really have that many issues. Then all of a sudden, the BSODs came like a landslide.
 

rigg42

Honorable
Awesome! I'm glad that helped. Perhaps the controller degraded a bit over time. It looks like it possibly didn't need that much voltage to begin with.

Still planning on using the 3600 in the C6H? I recommend BIOS 7106 until a new BIOS releases. It has some quirks, but everything seems to work for the most part. I think 7201 has the bad AGESA version on it, so I'd avoid it. The CMOS doesn't like to clear properly. Safe boot really comes in handy if you bork your settings. You really don't need to do much. Just go into advanced PBO, set the auto OC to 200 MHz, and enable it. Beyond that, your memory speed and FCLK are where all the performance is to be had. The two have a weird relationship. At 3200 or below, running FCLK as high as possible seems to be best. Above that, 1 to 1 seems to work better. So if you can get a stable 1900 FCLK and the memory to 3800, that is ideal. This might be hard with the T-Topology memory layout on the C6H though. The 300 series boards have never been the best for memory in general.
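
The memory speed / FCLK rule of thumb above can be written out as a short sketch. It relies on the fact that DDR memory's actual clock is half its rated transfer speed (so 3800 memory runs a 1900 MHz memory clock, which is why 1900 FCLK is 1 to 1). The function name `suggested_fclk_mhz` and the `max_stable_fclk` parameter are hypothetical; the stable FCLK limit is something you have to find by testing.

```python
from typing import Optional


def suggested_fclk_mhz(ddr_rate: int, max_stable_fclk: int) -> Optional[int]:
    """Apply the post's rule of thumb for choosing FCLK.

    ddr_rate: rated memory speed, e.g. 3200 or 3800.
    max_stable_fclk: highest FCLK (MHz) you have found stable (assumed input).
    Returns None when 1:1 is wanted but not reachable; the post gives no
    rule for that case.
    """
    memory_clock = ddr_rate // 2  # DDR: two transfers per clock
    if ddr_rate <= 3200:
        return max_stable_fclk    # at 3200 or below, run FCLK as high as possible
    if memory_clock <= max_stable_fclk:
        return memory_clock       # above 3200, 1:1 seems to work better
    return None


if __name__ == "__main__":
    print(suggested_fclk_mhz(3800, 1900))
    print(suggested_fclk_mhz(3200, 1800))
```
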

One of my three 3600s is super weird with memory and motherboard combinations. It struggles to POST consistently. Just something to be aware of if you have issues with getting it to POST. That particular CPU liked a 3200 CL16 Crucial kit with my C6H WiFi.

There also seem to be some 3600s that run really hot. I haven't seen this first hand, but I've seen people with proper voltage have temps way worse than any of my 3600s with comparable cooling.

https://forums.tomshardware.com/threads/ryzen-5-3600-w-wraith-stealth-cooler-temps.3501705/

That should give you a point of reference for temps.
 
Thanks man!! Yes, I still plan to upgrade. Actually, I can't run my memory overclock anymore on my 1700X, because it needs 1.15v SOC at the bare minimum to run, and even that I believe is unstable. So I'm stuck with the stock 3200 MHz XMP. Which still ain't bad at all; I'm lucky I got a 1700X that can run 3200 MHz at all.

Yes I have the same thought process, I'm on 7106 right now and plan to stay on it until the next good BIOS comes out. 7201 has that weird AGESA code that AMD pulled due to bugs, so I don't plan on going with that BIOS, like you said.

Sounds good on the overclocking. I'll probably just stick to PBO and some minor timing adjustments in the future.
 