News Is your Intel Core i9-13900K crashing in games? Your motherboard BIOS settings may be to blame — other high-end Intel CPUs also affected

JarredWaltonGPU · Feb 22, 2024

Alvar Miles Udell said:
The 13900K has been out for a year and a half, and if issues are just now cropping up then I'd suspect something faulty in the newest BIOS released by Intel, and it would be interesting to see if it was because the newest, or at least newer, BIOSs set the limits to 4096, while older ones used something remotely more sane.

I've been having issues since 2022 when I first assembled my test system. I thought it was maybe just my particular setup, but over time it got worse and more games started to have issues. And with the recent tweet from Hassan of WCCF, I figured it was a good time for us to write up a full article, particularly as I only recently found a solution that wasn't just a workaround.

(TLDR: I was manually setting affinity for certain games to only one thread on each P-core to get the shader compile to, usually, complete. Then I'd reset affinity to all cores — because if I didn't, most games ran like 💩 .)
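For anyone who wants to script that affinity workaround rather than click through Task Manager, here is a minimal sketch of the mask calculation. It assumes the 13900K's eight hyper-threaded P-cores are enumerated first as sibling pairs (logical CPUs 0/1, 2/3, and so on) with the E-cores after them; that numbering is an assumption and should be checked on the actual system. The helper names are made up for illustration.

```python
def one_thread_per_p_core(p_cores=8, threads_per_core=2):
    """Return logical CPU indices covering one thread on each P-core.

    Assumes P-core threads are numbered first, as sibling pairs
    (0,1), (2,3), ... which depends on how the OS enumerates CPUs.
    """
    return [core * threads_per_core for core in range(p_cores)]


def affinity_mask(cpus):
    """Convert a list of logical CPU indices into a bitmask."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


cpus = one_thread_per_p_core()
mask = affinity_mask(cpus)
print(cpus, hex(mask))
```

On Linux the list could be handed to `os.sched_setaffinity(pid, cpus)`; on Windows the hex mask is the kind of value Task Manager's affinity dialog represents.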

TJ Hooker · Feb 22, 2024

Alvar Miles Udell said:
The 13900K has been out for a year and a half, and if issues are just now cropping up then I'd suspect something faulty in the newest BIOS released by Intel, and it would be interesting to see if it was because the newest, or at least newer, BIOSs set the limits to 4096, while older ones used something remotely more sane.

Setting sky high limits isn't remotely new though. Intel mobo OEMs have been doing it for years as a way to juice benchmark results and differentiate their products, particularly high-end/Z-series. Intel turns a blind eye, presumably because it makes their products look more performant in reviews (where reviewers are usually using high end boards).

Here's an article that's over 5 years old describing this (and I don't think the practice was new even then).

"This lets them set PL2 to 4096W and Tau to something very large, such as 65535, or -1 (infinity, depending on the BIOS setup). This means the CPU will run in its turbo modes all day and all week, just as long as it doesn’t hit thermal limits."
https://www.anandtech.com/show/13544/why-intel-processors-draw-more-power-than-expected-tdp-turbo

In the past, even though current/power/turbo-duration limits were effectively being set to 'infinite', there were presumably still other limits in place keeping the processor operating stably (e.g. frequency, voltage, temp). Evidently we've reached a point where those other limits are no longer sufficient, at least for their flagship chips. And/or changes to Intel turbo behavior (including the addition of (enhanced) thermal velocity boost) finally exposed the issue with this approach.
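The PL2/Tau behavior in that quote can be illustrated with a toy model. This is a minimal sketch, not Intel's actual algorithm: it assumes the turbo budget is tracked as an exponentially weighted moving average of package power with time constant Tau, and that the CPU is held to PL1 once that average reaches PL1. The function name and the 300 W demand are made up, and 125 W / 253 W / 56 s are used only as illustrative stock-style values for a 13900K.

```python
import math


def seconds_until_clamped(pl1, pl2, tau, demand, duration=600.0, dt=1.0):
    """Simulate a simplified turbo budget.

    Returns the first time (in seconds) at which the moving average of
    power reaches PL1, or None if that never happens within `duration`.
    """
    alpha = 1.0 - math.exp(-dt / tau)  # EWMA weight for one time step
    avg = 0.0                          # start from an idle package
    t = 0.0
    while t < duration:
        draw = min(demand, pl2)        # the load cannot exceed PL2
        avg += alpha * (draw - avg)    # update the running average
        t += dt
        if avg >= pl1:
            return t
    return None


# Stock-style limits versus the "4096 W, Tau 65535" setup from the quote.
print(seconds_until_clamped(pl1=125, pl2=253, tau=56, demand=300))
print(seconds_until_clamped(pl1=125, pl2=4096, tau=65535, demand=300))
```

Under this model the stock-style limits pull the chip back to PL1 in well under a minute of sustained load, while PL2 = 4096 W with Tau = 65535 never clamps during the simulated ten minutes, which is the "turbo all day" situation the article describes.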

Edit: Adaptive Boost is the new-ish boost feature I was thinking of here that seems like a possible culprit. As an aside, Intel has way too many different "Boost" feature sets.

https://www.tomshardware.com/news/intel-adaptive-boost-technology-rocket-lake-amd

JarredWaltonGPU · Feb 22, 2024

TJ Hooker said:
Setting sky high limits isn't remotely new though. Intel mobo OEMs have been doing it for years as a way to juice benchmark results and differentiate their products, particularly high-end/Z-series. Intel turns a blind eye, presumably because it makes their products look more performant in reviews (where reviewers are usually using high end boards).

Here's an article that's over 5 years old describing this (and I don't think the practice was new even then).

"This lets them set PL2 to 4096W and Tau to something very large, such as 65535, or -1 (infinity, depending on the BIOS setup). This means the CPU will run in its turbo modes all day and all week, just as long as it doesn’t hit thermal limits."
https://www.anandtech.com/show/13544/why-intel-processors-draw-more-power-than-expected-tdp-turbo

In the past, even though current/power/turbo-duration limits were effectively being set to 'infinite', there were presumably still other limits in place keeping the processor operating stably (e.g. frequency, voltage, temp). Evidently we've reached a point where those other limits are no longer sufficient, at least for their flagship chips. And/or changes to Intel turbo behavior (including the addition of (enhanced) thermal velocity boost) finally exposed the issue with this approach.

Yeah, that's the crux of the issue. I think Intel has pushed the silicon basically as far as it will go with Raptor Lake, and on some of the high-end CPUs if the BIOS limits are too high, it will get into unstable territory. I never had issues with 12900K, but 13900K was a hell of a lot more finicky.

The increased thermal limits of the CPU are also a factor, I think — RPL can break 100C while still technically being "okay." Except in my experience, that's simply not true. Every time I saw a game crash (usually with shader compiling), I'd look over at the BIOS DrLED code that shows temperature and it would hit 99 and sometimes even roll over to 00. And that's with a Corsair H150i cooler, so not exactly a terrible cooling setup.

But with the lower limits I've input into the BIOS, CPU temps are never breaking 90C now.

bit_user · Feb 22, 2024

knyeji said:
managed to fix the issue by disabling some option called MCE on my asus mobo.
Games run fine now.

Multi-Core Enhancement is a technique for overriding the default multi-core frequency curves. It's basically running multi-core workloads at higher frequencies than Intel stated they would run, but without any one core exceeding the maximum turbo frequency.

Intel Core i9 13900K: Impact of MultiCore Enhancement (MCE) and Long Power Duration Limits on Thermals and Content Creation Performance

Intel's 13th Gen processors bring terrific performance across the board, but many modern processors have been criticized for their high power draw and heat output. However, we have found that most CPUs only operate at these high temperatures when the motherboard BIOS is allowed to automatically...

www.pugetsystems.com

I believe using MCE should generally involve exceeding the default PL2, especially if we're talking about most- or all-core workloads.

So, yes - seems very relevant.

CSOCSO · Feb 22, 2024

bit_user said:
How can it not be Intel's fault, if board partners are only doing things Intel considers not to be overclocking or warranty-voiding?

It's Intel's job to place & enforce limits that ensure stable operation of their products. I think we really shouldn't let them off the hook, here.

I had zero issues for 6 months or so. I did update my BIOS recently, so it's probably because of that. I remember seeing the insanely huge max power in the Gigabyte app but never touched it. So the CPUs have no issues. It was an update that caused this.

Namluk · Feb 22, 2024

ALL - How I fixed mine!
I had a 13900K release day paired with the ASUS ROG Strix Z790-E Gaming WIFI motherboard with an RTX 4090. Everything was absolutely stable until about 2 months ago. Games (COD and BF 2042) would have random freezes and sometimes with the out of memory message that others have noted. I did think it was my video card at first. After much troubleshooting I figured out a fix that worked 100% of the time.

In my motherboard BIOS I changed the SVID setting from Auto to "Intel Fail Safe." As soon as I did this stability 100% returned; however, it came at the cost of increased voltage which means heat. Under load my gaming temps were typically 50s-60s. With the new SVID setting the temps went to 88-90 under load.

Although I discovered a "fix," that extra heat was going to prematurely take away from the longevity of the CPU. Intel has a 5 year warranty. If your 13900K is having this issue, I 100% recommend pursuing an RMA.

I contacted Intel and RMA'd the processor and you can do a cross-shipment for a $25 fee (totally worth it to minimize the downtime). New processor is running great and the SVID is back on the Auto setting (default) with temps now actually just in the 50s under load (while gaming).

Personally I believe it is both bad chips (Intel's bad) and the motherboard makers with settings that are wearing them out. I personally leave off the MCE and there is literally no need to overclock a 13900K for any FPS gain. If you want to mess your trash up for a stupid 3D Mark score...go right ahead.

Note: I messed around with a ton of settings, only the SVID worked 100% of the time. It is an indicator that one of the cores is not able to run properly, the Intel Fail Safe volts the ^*(%^&*^ out of it to keep it from crashing.

rluker5 · Feb 22, 2024

MrWallopy said:
This isnt an intel issue. This is the Board partners overreaching. Even older mobos had a set limit when set to auto, but now the solar system adjacent doesnt even limit the power they force through these chips.

I noticed this same crashing more and more on almost all UE5 games due to the cpu load hit they take at times.

For me. 2 settings are all i change.

Asus board:
AI Tweaker > Asus Multicore enhancement - Disabled - Enforce ALL Limits
AI Tweaker > SVID Behavior - ~~Intel's Failsafe~~ Instead do "Typical Usage"
I was pulling a solid 1.652v With failsafe.... Typical kept me at 1.426 and i was still throttling because of heat haha.

These settings force the chip to run at stock intel limits.

I personally don't buy the highest end to keep pushing it to bleeding edge, i buy it because it at stock is already faster than the previous and hits Intels standard.

I don't trust the board partners to replace my cpu when their board kills it with the default settings. The board partners are to blame, not just asus. ALL of them.

Edit: This also happens on my Lenovo Legion Pro 7 series with the 13900. If you change the bios settings to allow the core voltage to auto adjust i crash in UE5 games on there just as i do my Desktop.

I also turned off that MCE. Maybe I'm lucky, but I also run my 13900kf on best case scenario. Those two settings are most of my volt tuning.

I have also seen some hard to explain instability sometimes, but suspected my e-cores bits weren't getting enough power, after the droop, for the frequency they were running at so I added +20mv to the e-core L2 cache and that fixed it with no noticeable increase in temps or power draw. Maybe I just have an e-core or 2 that is too weak for 4.5? I saw it more in my old 12700k with an Asus Prime Z690p, but it went away with any other motherboard.{*}

There are extra parts to watch in hybrid chips. The volts and frequencies are jumping all over the place for p,e-cores and cache. All volts are natively tied together even though frequencies aren't and all can be a cause for issue.

But I recommend trying upping L2 cache voltage a bit if problems are coming up since it is easy to overlook and has little penalty if it fixes the problem and you decide to keep the setting. Just don't give it crazy volts.

{*} Edit: By any other motherboard I mean any other one I tried, which would just be 2 - ROG Strix B660i and Z690i Aorus Ultra Lite. Also my Prime Z690 P motherboard just says I'm adjusting L2 cache voltage offset, but HWinfo64 claims L2 E-core voltage offset.

Papusan · Feb 22, 2024

JarredWaltonGPU said:
Yeah, that's the crux of the issue. I think Intel has pushed the silicon basically as far as it will go with Raptor Lake, and on some of the high-end CPUs if the BIOS limits are too high, it will get into unstable territory. I never had issues with 12900K, but 13900K was a hell of a lot more finicky.

The increased thermal limits of the CPU are also a factor, I think — RPL can break 100C while still technically being "okay." Except in my experience, that's simply not true. Every time I saw a game crash (usually with shader compiling), I'd look over at the BIOS DrLED code that shows temperature and it would hit 99 and sometimes even roll over to 00. And that's with a Corsair H150i cooler, so not exactly a terrible cooling setup.

But with the lower limits I've input into the BIOS, CPU temps are never breaking 90C now.

Suppose you wish to set a maximum operating temperature lower than the CPU’s specified maximum warranted operating temperature. In that case, I suggest primarily relying on TjMax Offset or TCC_Offset, as both rely on the same underlying throttling mechanism. It is the fastest and most precise in detecting overheating and adjusting the CPU frequency to stay below the desired temperature. The Package Temperature Threshold technology also does its job but is slightly slower in adapting to the overheating processor.

https://skatterbencher.com/2023/10/27/raptor-lake-refresh-fast-throttle/#Test_Results

Shonk. · Feb 22, 2024

It's not the power limits; forcing a power limit is just masking the problem.

It's a combination of dumb LLC + AC/DC Load Line + bad Auto settings + untested SA/VDDQ + RAM settings that were not fully stability tested,
and in many cases people setting a static voltage for something they have no understanding of.

Intel SPEC LLC is 1.1 mOhm, which also has to be reported back to the CPU.
Messing with this causes stability issues if you don't know what you're doing (which most people don't).
AC Loadline depends on the Loadline Calibration value and motherboard socket impedance. No one knows what the proper impedance is, and I doubt anyone here has the correct tools to measure it, but it's "probably" between 0.2 and 0.6 milliohms. (So let the vendor set that, as it's board dependent.)

DC Loadline, however, should always be set to the same value in mOhms as the LLC (Loadline Calibration) setting, so that package power and VID report close to accurate values (e.g. 1.1 mOhm if LLC is Standard/Normal/SPEC).
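To put numbers on the load-line point, here is a rough sketch that treats the load line as a simple resistance, so the voltage at the cores is roughly the requested VID minus current times that resistance. The 1.40 V / 200 A operating point and the mismatched 0.5 mOhm DC Loadline are made-up illustration values, and the function name is hypothetical; real VRM behavior is more involved than this.

```python
def voltage_under_load(vid_volts, current_amps, loadline_milliohms):
    """Estimate voltage at the cores: VID minus the load-line droop (I x R)."""
    droop_volts = current_amps * loadline_milliohms / 1000.0
    return vid_volts - droop_volts


vid, amps = 1.40, 200.0  # made-up operating point

# The board actually behaves like a 1.1 mOhm load line...
actual = voltage_under_load(vid, amps, 1.1)
# ...but the DC Loadline reported to the CPU was set to 0.5 mOhm.
reported = voltage_under_load(vid, amps, 0.5)

print(f"actual   {actual:.3f} V -> {actual * amps:.0f} W")
print(f"reported {reported:.3f} V -> {reported * amps:.0f} W")
```

With these example numbers the real voltage is about 1.18 V while the mismatched setting implies about 1.30 V, so the derived package power would be overstated by roughly 10%. That is the kind of reporting error that matching DC Loadline to the LLC value is meant to avoid.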

My 13900K Passes
Karhu + FPU Load 24 hours
Prime 95 Small FFT AVX (440W Load)
y-cruncher
Never Crashes in games
Scores 41300 in Cinebench R23
And has unlimited PL1/PL2
etc etc

and i run a 30mv undervolt on the higher multipliers
I run the cpu
58x (1-3c), 57x (4-5c), 56x (6-7c), 55x (8c)
instead of
58x (1-2c), 55x (8c)
e-cores are also on..

In summary: user/OEM error and a nothing burger

tamalero · Feb 22, 2024

JarredWaltonGPU said:
I've been having issues since 2022 when I first assembled my test system. I thought it was maybe just my particular setup, but over time it got worse and more games started to have issues. And with the recent tweet from Hassan of WCCF, I figured it was a good time for us to write up a full article, particularly as I only recently found a solution that wasn't just a workaround.

(TLDR: I was manually setting affinity for certain games to only one thread on each P-core to get the shader compile to, usually, complete. Then I'd reset affinity to all cores — because if I didn't, most games ran like 💩 .)

Could anything else ( as in programs) cause this kind of error? or is it just games, specially UE?
Is it some sort of data corruption going on?

rluker5 · Feb 22, 2024

Shonk. said:
It's not the power limits; forcing a power limit is just masking the problem.

It's a combination of dumb LLC + AC/DC Load Line + bad Auto settings + untested SA/VDDQ + RAM settings that were not fully stability tested,
and in many cases people setting a static voltage for something they have no understanding of.

Intel SPEC LLC is 1.1 mOhm, which also has to be reported back to the CPU.
Messing with this causes stability issues if you don't know what you're doing (which most people don't).
AC Loadline depends on the Loadline Calibration value and motherboard socket impedance. No one knows what the proper impedance is, and I doubt anyone here has the correct tools to measure it, but it's "probably" between 0.2 and 0.6 milliohms. (So let the vendor set that, as it's board dependent.)

DC Loadline, however, should always be set to the same value in mOhms as the LLC (Loadline Calibration) setting, so that package power and VID report close to accurate values (e.g. 1.1 mOhm if LLC is Standard/Normal/SPEC).

My 13900K Passes
Karhu + FPU Load 24 hours
Prime 95 Small FFT AVX (440W Load)
y-cruncher
Never Crashes in games
Scores 41300 in Cinebench R23
And has unlimited PL1/PL2
etc etc

and i run a 30mv undervolt on the higher multipliers
I run the cpu
58x (1-3c), 57x (4-5c), 56x (6-7c), 55x (8c)
instead of
58x (1-2c), 55x (8c)
e-cores are also on..

In summary: user/OEM error and a nothing burger

I've heard that some have problems with stock motherboard settings. Not Intel spec, but motherboard spec that are set as default. Sure one can change them, but the default should be safe and not a crazy amount of power draw with apprehensively high volt spikes and system instabilities.

That's why the title is putting the blame on the motherboard bios.

As an aside, Have you tried turning HT off? You could likely get 5.8Ghz all p-cores stable. My CPU runs at about 1.35v per HWinfo when running 5.8 under a CPUz all core load. You would almost certainly throttle if you ran those 100% use stress tests on it, but it is fine for gaming.

Shonk. · Feb 23, 2024

rluker5 said:
As an aside, Have you tried turning HT off? You could likely get 5.8Ghz all p-cores stable. My CPU runs at about 1.35v per HWinfo when running 5.8 under a CPUz all core load. You would almost certainly throttle if you ran those 100% use stress tests on it, but it is fine for gaming.

I've set it up for all scenarios, even worst case
5.5 All core i can 100% cool at 440w
458w without the 30mv undervolt is just slightly too much for my cooler

I did play with 5.6 + 5.7 all core when i first got it with the e-cores on
and got them stable just fine
but the resulting power output went through the roof with the stock voltage table

the voltage table jump from 5.5 to 5.8 is quite large

TerryLaze · Feb 23, 2024

rluker5 said:
I've heard that some have problems with stock motherboard settings. Not Intel spec, but motherboard spec that are set as default. Sure one can change them, but the default should be safe and not a crazy amount of power draw with apprehensively high volt spikes and system instabilities.

But people are buying those motherboards because they do do an automatic overclock to the max, which they would have no idea about how to do manually.

People that don't need or don't want the maximum performance often go with a non-z board even with a k or ks CPU, because they like the better binning but don't want a space heater and the other issues that come with overclocking that they have no idea about how to deal with.

rluker5 · Feb 23, 2024

Shonk. said:
I've set it up for all scenarios, even worst case
5.5 All core i can 100% cool at 440w
458w without the 30mv undervolt is just slightly too much for my cooler

I did play with 5.6 + 5.7 all core when i first got it with the e-cores on
and got them stable just fine
but the resulting power output went through the roof with the stock voltage table

the voltage table jump from 5.5 to 5.8 is quite large

Excellent cooling, I'm using a D15. Turning HT off will decrease your p-core temps at the same volts and increase stability while raising your minimum single thread performance, but it will cost a bit of multithread. And you are correct in that the voltage table jump is quite large. Probably too large to be able to run P95 small fft. Just not too large to do everything that is not an all core stress test. It is just an option for slightly better gaming performance or slightly cooler at the same performance.

TerryLaze said:
But people are buying those motherboards because they do do an automatic overclock to the max, which they would have no idea about how to do manually.

People that don't need or don't want the maximum performance often go with a non-z board even with a k or ks CPU, because they like the better binning but don't want a space heater and the other issues that come with overclocking that they have no idea about how to deal with.

Unless you have CPU cooling like Shonk up there those motherboard settings often cost performance because they default to unmanageable power draws in all core stress tests. And apparently sometimes lead to instability because of said auto OC. My Prime Z690 P has an auto OC option and it is trash. All it does is open up the spigot and dump a bunch of power. Conveniently it is off by default or else I may be seeing 100c temps when starting a game. But I do get memory controller voltage that was somewhere between 1.45-1.5 with dual rank DDR5 dimms by default when 1.385 is fine (per HWinfo).

It could be that some CPUs are less stable at higher clocks and Mobo manufacturers are increasing the volts to decrease RMAs. And like Shonk mentioned, the volts needed for stable clocks go up a bunch when you get to the i9k top frequencies. They are a lot lower at the i5k frequencies.

atomicWAR · Feb 23, 2024

This is a bad look for Intel. No one buys a K series cpu only to down clock it by 200mhz. If they wanted that they'd just got a non K version. Hopefully Intel fixes this issue with due haste.

antych · Feb 23, 2024

This sounds like a lot of nonsense. I have those errors in almost every UE5 game. That's the common denominator. If the power limit was an issue, I would encounter those errors elsewhere. My PC would never pass a cinebench run or other stress tests where the load is even higher. Maybe downclocking or capping power masks the real issue, but the problem is more likely to be UE5 shader compilation. Perhaps it spins up too many threads on higher end CPUs causing some resource contention.

JarredWaltonGPU · Feb 23, 2024

antych said:
This sounds like a lot of nonsense. I have those errors in almost every UE5 game. That's the common denominator. If the power limit was an issue, I would encounter those errors elsewhere. My PC would never pass a cinebench run or other stress tests where the load is even higher. Maybe downclocking or capping power masks the real issue, but the problem is more likely to be UE5 shader compilation. Perhaps it spins up too many threads on higher end CPUs causing some resource contention.

As noted in the article text, it's not necessarily the power limit itself, but a combination of power limit, current limit, voltage, and/or frequency — one or more of these allows certain CPUs to reach unstable states. Games that use Oodle for data decompression are frequently affected. Oodle has released a statement, basically echoing everything we said, along with some details for Asus, Gigabyte, and MSI motherboards.

It's 100% not a case of simply spinning up too many threads. It's running a workload that happens to push the CPU hard, which can cause overheating, overcurrent, overpower, overvoltage, or some other related problem. Once you limit the clocks, power, voltage, and/or current to the appropriate levels for your CPU, all issues will go away.

TerryLaze · Feb 23, 2024

atomicWAR said:
This is a bad look for Intel. No one buys a K series cpu only to down clock it by 200mhz. If they wanted that they'd just got a non K version. Hopefully Intel fixes this issue with due haste.

Maybe you got confused somewhere but the solution is not clocking down from the normal clocks but from the max overclock they are getting.

antych said:
This sounds like a lot of nonsense. I have those errors in almost every UE5 game. That's the common denominator. If the power limit was an issue, I would encounter those errors elsewhere. My PC would never pass a cinebench run or other stress tests where the load is even higher. Maybe downclocking or capping power masks the real issue, but the problem is more likely to be UE5 shader compilation. Perhaps it spins up too many threads on higher end CPUs causing some resource contention.

Different things run different instructions, a CPU can be stable at everything else and have one single instruction that crashes the system.

antych · Feb 23, 2024

Sure, but this isn't even instability. We have a situation where shader compilation in UE5 crashes in a fairly deterministic way, while all stress tests and other applications work fine. If only one workload predictably crashes on a stock CPU that otherwise works fine, the root cause is unlikely to be related to CPU/BIOS settings. I'm not saying this isn't Intel's fault; what if it's a bug in Thread Director? But I think the conclusion in this article is completely off the mark and no one should be underclocking their CPU.

JarredWaltonGPU · Feb 23, 2024

antych said:
Sure, but this isn't even instability. We have a situation where shader compilation in UE5 crashes in a fairly deterministic way, while all stress tests and other applications work fine. If only one workload predictably crashes on a stock CPU that otherwise works fine, the root cause is unlikely to be related to CPU/BIOS settings. I'm not saying this isn't Intel's fault; what if it's a bug in Thread Director? But I think the conclusion in this article is completely off the mark and no one should be underclocking their CPU.

On the "stock" settings for my motherboard, a lot of CPU stress tests and heavy workloads will crash. It's not just Oodle and UE5 stuff.

MyXPC · Feb 23, 2024

From Linux, CoreFreq shows the Power Limits and lets you change them

See this i9-13900K where PL1 and PL2 are indeed left at 4095
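Besides CoreFreq, the limits can usually be read on Linux through the kernel's powercap (RAPL) sysfs interface. The following is a minimal sketch, assuming the package zone lives at `/sys/class/powercap/intel-rapl:0` and exposes `constraint_N_name` / `constraint_N_power_limit_uw` files; the zone path may differ per system, and the function name is made up for illustration.

```python
from pathlib import Path


def read_rapl_limits(zone="/sys/class/powercap/intel-rapl:0"):
    """Return {constraint_name: watts} for one RAPL powercap zone.

    Returns an empty dict if the zone is missing or unreadable.
    """
    zone = Path(zone)
    limits = {}
    for name_file in sorted(zone.glob("constraint_*_name")):
        prefix = name_file.name[: -len("name")]  # e.g. "constraint_0_"
        limit_file = zone / f"{prefix}power_limit_uw"
        try:
            name = name_file.read_text().strip()
            microwatts = int(limit_file.read_text().strip())
        except (OSError, ValueError):
            continue  # skip constraints that cannot be read or parsed
        limits[name] = microwatts / 1_000_000  # microwatts to watts
    return limits


if __name__ == "__main__":
    print(read_rapl_limits())
```

On a board that leaves the limits wide open, the long-term and short-term constraints would be expected to show values near 4095 W rather than Intel's stock figures.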

antych · Feb 23, 2024

JarredWaltonGPU said:
On the "stock" settings for my motherboard, a lot of CPU stress tests and heavy workloads will crash. It's not just Oodle and UE5 stuff.

That doesn't seem to be the case on my system which is otherwise stable. The statement from Oodle is interesting... This is very easy to reproduce on my PC, so I'll have to play around with some settings next time I run into it.

JarredWaltonGPU · Feb 23, 2024

antych said:
That doesn't seem to be the case on my system which is otherwise stable. The statement from Oodle is interesting... This is very easy to reproduce on my PC, so I'll have to play around with some settings next time I run into it.

Yeah, to be clear, I bet if I got ten different 13900K CPUs, some would work fine in my mobo at its 'stock' settings, and some would not. I bet every CPU would run properly at "Intel real stock settings," which is not what enthusiast mobos generally use. But there's clearly a subset of CPUs where something is going off the rails.

HaninTH · Feb 23, 2024

A little late to the party, but does this affect the i5-13600k?

JarredWaltonGPU · Feb 23, 2024

HaninTH said:
A little late to the party, but does this affect the i5-13600k?

Not that I've heard. It seems to be the CPUs where Intel has pushed clocks the highest, and as a Core i5 there's usually more wiggle room. I won't say it's impossible some i5 CPUs could be affected, though.
