[SOLVED] Stable overclock on i7 8700k now having issues

ian.egerton

Reputable
Dec 19, 2017
25
0
4,530
Hi,

I have an i7 8700K running on an ASRock Z370 Extreme4 motherboard.

I have been running a stable OC @ 4.8GHz with an offset of -30 for a few years now without issue. I mainly max it out running ffmpeg or handbrake for video encoding.

In the last few days I've noticed that when running handbrake the PC has started to either lock up or crash.

Any ideas what could be causing this? Should I re-seat the CPU and apply new thermal paste? I'm using an Arctic Freezer i30 CO as a cooler.
 
Solution
There it is again ... 64mv.

When Vcore spikes low at the same moment that a fluctuating workload spikes high, this is when the processor is most vulnerable to BSOD.

There are 2 solutions:

Increase Vcore to compensate for increasingly sloppy voltage regulation.
Replace the motherboard with a make or model of tighter voltage regulation.

Just for some background perspective, on 4th generation processors, Intel used Fully Integrated Voltage Regulators (FIVR) directly in the package, so fluctuations were minimal at only 8mv, with occasional fluctuations to 16mv; 8mv = 1 bit.

Motherboards for other generations of processors before and after 4th generation have had voltage regulation from 16 to 32mv, but as sloppy as...
Hey there,

I've had similar issues with my 1600X (yes, I know, different platform etc). After running an OC for a few years, it suddenly became unstable. You might just pump a little more juice to the CPU, which is what I've done, and then retest for stability. I like to test overnight for approx 8hrs to ensure stability. I use Prime95 Small FFTs to test.

With an OC there will be a degradation over time. The solution here is bumping the vcore a little. You may have to readjust again in a year or two. Normal enough.

Good luck.
 

ian.egerton

Hi,

Thanks for the reply.

I've adjusted the offset to -20 and I'll see how that goes. I think the idle temps are a bit higher than I remember, so I might clean up the heatsink and reapply some paste to be sure also.
 
Yeah, defo worth a try to re-apply paste.

What temps are you getting at idle? Ideally you should be about 30-35c at idle (depending on ambient temps), 45-65c gaming load, and approx 65-80c stress load. Keep in mind these are rough temperatures. Every system is different.
 

ian.egerton

My idle temps are hovering around 40, so that is what's concerning me. Under load it maxes out at about 80, so that seems normal for what I've seen before.

Edit - The system seems to be running a bit more stable since I changed the offset.
 
Hmmmm. 40-ish seems a tad high alright. Give the paste a go, and let us know the results. If the paste is degrading, which it does over time, that could be where the high idle temp is coming from. A good paste alone can reduce temps by 5c. If you haven't re-applied paste before this, it may be that you get a good reduction in idle temps.

If the system seems more stable with the offset, that's good. Maybe test again for stability with Prime/OCCT or something similar.
 
You could also try LLC. It's typically used for higher OC, and can help with 'vdroop'. Normally LLC is set on auto for moderate OC's. If you put it on manual, try at the first two levels, maybe the 3rd. It can help with stability, but higher than level 3 can sometimes lead to overvolting the CPU, so just watch temps etc, and see if the first two levels give you good results.
 

ian.egerton

I'm starting to think I might have an issue.

Prime95 fails fairly quickly at even modest overclocks. Temperatures hit 95+ on 4.8 with the -20 offset and it still returns errors.

I've even dropped to 4.7 and 4.6 but still get errors in prime95.

I can't find my thermal paste anywhere so I've ordered more from amazon.
 

Phaaze88

Titan
Ambassador
Do you know that your current cpu cooler is good for up to 160w of heat, and that your cpu can easily exceed that in Prime 95 - especially if AVX isn't disabled?

I don't see any mention of the Vcore you have running through your chip, by the way...
I see the offset, but that doesn't say how high the Vcore gets. If offset + LLC has been bumping to over 1.4Vcore, I wouldn't be surprised if what you're seeing is the result of voltage degradation.
 

ian.egerton

My cooler is an Arctic Freezer i30 CO. It's been working fine for a long time now and it's rated for 160W. Specs show:


Max. Cooling Performance: 320 Watts
Recommended for TDP up to: 160 Watts

Edit: I'm running prime95 on stock clock and still hitting over 82c with CPU VCore floating around 1.123V.
 

Phaaze88

Titan
Ambassador
It's been working fine because you don't normally push 160w with your 8700K - except if you stress test it.

I don't know where they get 320w max cooling performance from - it must be full system draw, because all the 320w TDP cpu coolers I've seen are way larger than that one.
Plus, if the i30 CO actually were a 320w TDP cooler, and not 160w, you'd not be thermal throttling in Prime 95 right now.
 

ian.egerton

I understand that prime95 is torturing the system, but where I initially noticed the issue was in handbrake, with an overclock I've been using for a couple of years now. First, my machine froze when running handbrake with the overclock at 4.8 with an offset of -30. I modified the offset to -20 and handbrake seemed to run fine.

But my temps seem higher than normal in general, not just in prime95. My idle temps are between 35 and 40 at 4.8GHz.

Also, I don't put the system under stress regularly. Only occasional video encoding etc.
 

Phaaze88

Titan
Ambassador
vcore is around 1.23v...
That... is low, at least according to Silicon Lottery's binning statistics for that cpu:
Coffee Lake   All Core SSE Frequency   All Core AVX2 Frequency   BIOS Vcore   % Capable
8700K         4.80GHz                  4.60GHz                   1.375V       100%
8700K         4.90GHz                  4.70GHz                   1.387V       Top 99%
8700K         5.00GHz                  4.80GHz                   1.400V       Top 83%
8700K         5.10GHz                  4.90GHz                   1.412V       Top 49%
8700K         5.20GHz                  5.00GHz                   1.425V       Top 17%
8700K         5.30GHz                  5.10GHz                   1.437V       Top 4%
 

ian.egerton

Yeah, I think even my idle temps at stock are too high (between 35 and 42), whereas these were never an issue previously (or at least the last time I checked). That's why I think it's the cooler and/or paste.
 

CompuTronix

Intel Master
Moderator
ian.egerton,

There are several variables and possibilities to consider with respect to Core temperatures and stability.

Although Throttle temperature is 100°C, Core temperatures above 85°C are not recommended. Core temperatures below 80°C are ideal.

The Arctic Freezer i30 CO is barely adequate for the i7-8700K at stock, but not when overclocked or when Package Power is as high as 160 Watts. Although transcoding software such as HandBrake runs AVX code, the peak fluctuating workload and power consumption typically shouldn't exceed the steady-state true 100% workload of Prime95 Small FFTs with all AVX test selections disabled.

If Core voltage at 100% workload is only 1.23V then processor degradation is extremely unlikely, since degradation typically doesn't begin to occur on 14nm processors until Vcore exceeds 1.4V, as Phaaze88 has pointed out. However, motherboard VRM degradation is always possible, which can cause unstable power delivery to the CPU power plane resulting in intermittent instabilities and BSODs, especially when lowest Vcore sags occur during highest peak workloads, as keith12 has pointed out with respect to LLC.

You may be experiencing memory instability, which is often difficult to isolate from CPU instability. Record the error message(s) and code(s) when BSODs occur. Use a pink eraser to clean each module's edge connector contacts, then reseat and/or swap modules to different slots. You can increase memory voltage in 10 millivolt (0.010V) increments to see if your rig stabilizes. It would also be prudent to run MemTest86 as well as Windows Memory Diagnostic Tool.

TIM (paste) degradation is most obvious at 100% workload due to high power and thermal dissipation, but is much less noticeable at idle due to low power and thermal dissipation. This also applies to Intel's internal paste between the Die and the Integrated Heat Spreader (IHS), which also degrades over time. In such scenarios, the only solution is delidding.

Complaints about high idle temperatures on various websites and forums are commonly made by users who are highly inconsistent about what the term "idle" actually means. Idle means just 1 or 2% CPU Utilization in Windows Task Manager; not running a light workload and surfing with various browser tabs open while streaming, etc. Unnecessary background Processes and Services account for spikes, fluctuations and high idle temperatures, especially when combined with high ambient temperature.

Most users don't mention ambient temperature which can be a huge variable. Many aren't accustomed to monitoring thermal behavior, so Core temperatures may go unobserved for countless months. Seasonal indoor ambient temperatures might be 10°C near the Arctic Circle, or 40°C without A/C near the Equator, which affects all computer temperatures, whether at idle or under 100% workload.

The International Standard for "normal" room temperature is 22°C or 72°F. So when providing full system specs, it always helps to include ambient temperature so we know whether or not your rig is running in a reasonably normal environment.

CT
 

ian.egerton

Hi,

Thanks for the reply.

My room temperature ranges between 19 and 22 degrees.

Based on what most have been saying here I might look into getting a better cooling solution (like the Noctua NH-D15) for the CPU and see where that takes me.

I have run memtest and everything was normal also.
 

ian.egerton

Update.

So I removed the heatsink, cleaned everything off and re-applied thermal paste.

There has been an improvement of about 10 degrees both at idle and under load.

I can run handbrake at 4.8GHz with a VCore of 1.23V, with temps holding steady between 72 and 74 degrees.

Idle temps are down to 32-34 degrees.

Handbrake only pulls up to 160W when running, so the Arctic Freezer can handle that. When running prime95 it still fails as power draw increases to 200W+. I think I'll go ahead and purchase the Noctua NH-D15 (just need to measure to make sure it fits in my Corsair Carbide 300R, which I believe it does).
 

CompuTronix

Intel Master
Moderator
That's a nice improvement, but 200 Watts means your 8700K is running AVX / AVX2, which Prime95 v29.8 allows you to disable.

Prime95 v29.8

Although transcoding software such as HandBrake runs AVX code, the peak fluctuating workload and power consumption typically shouldn't exceed the steady-state true 100% workload of Prime95 Small FFTs with all AVX test selections disabled.

[Screenshot n7sRDt4.jpg: Prime95 torture test options with the three AVX boxes checked]

Click on the AVX test selections that are not greyed out so that all three AVX boxes are checked, as shown above, then try it again.
 

ian.egerton

Hi,

I disabled AVX and now prime95 is rock solid with temps holding at an average of 77 degrees with a power draw of just under 160W. This is with an OC of 4.8GHz and an offset of -30 producing a VCore of 1.164V.
 

Phaaze88

Titan
Ambassador
Something that does not appear to have been covered yet: you're not done with just Prime 95. That only covers cpu cooler thermal stability.
We have not seen proof that your OC is voltage stable. Prime 95 does not do both effectively; its workload is too steady for the voltage test.

Run Cinebench R20 'Infinite Loop'. To set up the 'infinite loop': File > Preferences > Minimum Test Duration. Enter an impossible number.
Run the test for an hour - the longer, the better though.
 

CompuTronix

Intel Master
Moderator
And there it is. Good job!

AVX in transcoding apps is much less intensive than AVX in P95, which is a brutal 130% workload. So when you disable AVX in P95, you get very similar power consumption to transcoding, which is what we would expect to see.

For P95 with AVX to remain stable, it requires either a significant increase in Vcore, which of course increases power consumption and heat ... or ... a downclock of 200 to 300MHz that adversely affects transcoding time.

You could always use an AVX offset in BIOS to downclock, but since your current Vcore offset and overclock now "appear" to be stable when transcoding, it's unnecessary to downclock just to accommodate AVX in P95, which is one of the reasons why we disable it.
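
For reference, here is a rough sketch of how an AVX offset translates into clock speed. It assumes the usual 100MHz base clock and that each offset step lowers the core multiplier by one; the function name `avx_clock_mhz` is made up for illustration and is not a BIOS setting.

```python
def avx_clock_mhz(core_ratio, avx_offset, bclk_mhz=100):
    """Clock speed under AVX load for a given core ratio and AVX offset.

    Assumes each offset step removes one multiplier bin (one BCLK step).
    """
    if avx_offset < 0 or avx_offset > core_ratio:
        raise ValueError("AVX offset must be between 0 and the core ratio")
    return (core_ratio - avx_offset) * bclk_mhz


# A 48x overclock (4.8GHz) with offsets of 2 and 3, i.e. a 200 to 300MHz downclock.
for offset in (0, 2, 3):
    print(f"AVX offset {offset}: {avx_clock_mhz(48, offset)} MHz")
```

With a 48x ratio, offsets of 2 and 3 give 4600 and 4500 MHz under AVX, which is the 200 to 300MHz downclock mentioned above.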

However, as Phaaze88 has pointed out, we need to know whether your Vcore and power delivery are stable. Download and run Hardware Info (HWiNFO) with P95 Small FFTs no AVX just as before, and observe the minimum and maximum Vcore values during the steady-state workload, which will tell us how much your Vcore is fluctuating.

10 to 15 minutes with an air cooler is ample time for the CPU, cooler, socket, motherboard and VRMs to reach 95% thermal saturation, so there's no point in letting it cook for more than 30 minutes.

Run HWiNFO with "Sensors Only".

Once we know your Vcore behavior in P95, then repeat the same transcoding session in HandBrake that you ran when you initially discovered the instability. Again observe minimum and maximum Vcore, then we'll compare Vcore fluctuations with HandBrake's fluctuating workload against Vcore fluctuations with P95's steady-state workload.

Vcore fluctuations of 16 to 24 millivolts (0.016 to 0.024V) in P95 indicate good voltage regulation, but fluctuations of 48mv or more indicate poor voltage regulation, which points to motherboard VRM instability.
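
As a rough sketch, that rule of thumb could be written down like this. The function name and the "borderline" label for anything between 24 and 48mv are assumptions added for illustration (only the 16 to 24mv and 48mv figures come from the rule itself), and the sample voltages are made-up values, not readings from this rig.

```python
def classify_vcore_swing(vmin, vmax):
    """Return (swing in mV, verdict) from HWiNFO-style min/max Vcore in volts."""
    swing_mv = round((vmax - vmin) * 1000)
    if swing_mv <= 24:
        verdict = "good"        # up to 24 mV: tight regulation
    elif swing_mv >= 48:
        verdict = "poor"        # 48 mV or more: suspect the VRM
    else:
        verdict = "borderline"  # assumption: the rule does not name this range
    return swing_mv, verdict


# Hypothetical min/max pairs, in volts.
for vmin, vmax in [(1.200, 1.216), (1.200, 1.232), (1.200, 1.264)]:
    print(vmin, vmax, classify_vcore_swing(vmin, vmax))
```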
 

ian.egerton

So,

I ran some longer tests on prime95 and had instability (BSOD). So I changed the offsets and repeated until I got to an offset of +10 without BSOD.

The results are as follows:

Prime95
VCore - Min - 1.184V Max - 1.248V Avg - 1.192V
80 degrees
167W

Handbrake
VCore - Min - 1.216V Max - 1.360V Avg - 1.243V
77 degrees
160W

So I'm guessing my voltage regulation is still quite poor. Are there any further steps I can take to address this?
 

CompuTronix

Intel Master
Moderator
Power consumption and Core temperatures are as expected. However, you have 64mv of Vcore fluctuation during P95's steady-state workload, which indicates sloppy voltage regulation.

Just to be certain, please repeat the Prime95 Small FFTs no AVX test.

Make sure that you don't start HWiNFO until after you start P95, then record the Minimum and Maximum Vcore before you stop P95, so that the Vcore values will reflect the fluctuations that occur only during the P95 test, which is the information we need to verify. Average Vcore isn't useful for this test, so don't bother recording it.

Transcoding produces fluctuating workloads, so I would expect to see considerably greater Vcore fluctuations (144mv) than in P95 (64mv). We don't need to repeat HandBrake, so just run P95 once more, if you please.
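
The 64mv and 144mv figures are simply maximum minus minimum Vcore from the readings posted above. A minimal sketch of the arithmetic (the dictionary layout and the helper name `spread_mv` are just for illustration):

```python
# Min/Max Vcore in volts, as reported for the two runs.
readings = {
    "Prime95": (1.184, 1.248),
    "HandBrake": (1.216, 1.360),
}


def spread_mv(vmin, vmax):
    """Vcore fluctuation in millivolts, rounded to the nearest mV."""
    return round((vmax - vmin) * 1000)


for name, (vmin, vmax) in readings.items():
    print(f"{name}: {spread_mv(vmin, vmax)} mV")
```

This prints 64 mV for Prime95 and 144 mV for HandBrake.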