News Is your Intel Core i9-13900K crashing in games? Your motherboard BIOS settings may be to blame — other high-end Intel CPUs also affected

Sometimes I can't even load in, but I do get the "run out of VRAM" message most of the time when it crashes. It also happens during shader compilation. When I showed German Intel support this article, they told me their team had never seen this problem and blamed it on my GPU/driver. I even swapped my MSI card for an Asus card. Sometimes I'm able to play for about half an hour before the next crash. Then the game randomly closes itself, and when I restart it and try to load a saved game, it gives me the error message. After restarting the PC I'm able to go for half an hour again.
So, the first thing I would do is to update your motherboard BIOS, clean out all GPU drivers (use Display Driver Uninstaller), and reinstall the latest drivers. The driver "clean install" does not do quite a full clean, FYI.

Then I'd go into the BIOS and find the settings for PL1/PL2 power and amperage. Sometimes these are called Short Duration Power Limit and Long Duration Power Limit. On my MSI board, these default to 4096W and 512 Amps. Set them to 255W and 325 Amps. Save and reboot. See if that helps with your instability.

Alternatively, you could try to underclock the CPU by 200 MHz to see if that helps. But that's really just a different way of accomplishing the power limit change.
 
OK, so I just did all these steps and it still crashes. I played for about 30 minutes before a random crash. After restarting the game and trying to load a saved game, I get the "out of VRAM" message. Should I RMA it, or is it likely that the replacement will have the same problem?
 
The fact that you can play at all to me suggests it's not specifically the CPU causing the crash. It might be something else, and that's a lot harder to nail down. You're welcome to try doing a CPU RMA, but I do worry you'll just end up wasting time and be right back to the same spot you're in right now. It's your call, though.

Based on you being able to play for a while, it's more likely a GPU issue. You said you tried Asus and MSI cards, though you didn't say which specific models. I'd try poking around at the GPU side a bit to see if you can determine for sure what's causing issues.

I mentioned above that logging temps and such with HWiNFO64 while gaming would be useful for diagnosing problems on a laptop. The same would apply here. Check thermals and loads and see if anything looks questionable (meaning, 90C or above is definitely a concern).
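
If you'd rather not eyeball a long log, here's a minimal Python sketch of that check. It assumes the logging tool can export readings as CSV and that temperature columns carry "°C" in their headers; the column names and sample rows below are invented for illustration, so adjust them to whatever your own log contains.

```python
import csv
import io

THRESHOLD_C = 90.0  # the "definitely a concern" level mentioned above


def hot_sensors(csv_text: str, threshold: float = THRESHOLD_C) -> dict[str, float]:
    """Return {column: peak} for temperature columns whose peak is >= threshold."""
    peaks: dict[str, float] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, value in row.items():
            # Only look at columns that appear to be temperatures.
            if name is None or "°C" not in name:
                continue
            try:
                reading = float(value)
            except (TypeError, ValueError):
                continue  # skip blanks and non-numeric cells
            peaks[name] = max(peaks.get(name, reading), reading)
    return {name: peak for name, peak in peaks.items() if peak >= threshold}


# Made-up sample data; a real log will have different columns and many more rows.
sample_log = """Time,CPU Package [°C],GPU Temperature [°C],GPU Core Load [%]
20:00:01,71,64,97
20:00:02,93,66,98
20:00:03,88,67,99
"""

flagged = hot_sensors(sample_log)
print(flagged)
```

Anything that shows up in the result peaked at or above the threshold at some point during the session, which is where I'd start looking.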
 
I switched my MSI GeForce RTX 4070 Ti Gaming X Trio for an Aorus 4070 Ti (sorry, I got that wrong in the beginning) but still got the same crashes. Is it maybe a GPU driver that isn't optimized, or something like that? I'm really clueless. I checked my temperatures during these crashes and nothing was too high. But if you say the problem being discussed here only shows up when trying to launch HFW and not being able to load in at all, then the CPU may in fact not be the problem. What put me off is that I had the same problems with shader compilation crashes back when I played Detroit: Become Human, where it also crashed often while loading or in-game, with the same error message.
 
Try reducing your CPU clock speed, as mentioned in the article, and see if it fixes at least some of your crashes.

Alternatively, you could reduce power & current limits, as Jarred mentioned. Whichever approach you take, set the limits lower than you think they need to be. If that improves stability, then start increasing them to the point where the instability returns.
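
To make the "start low, then step up" approach concrete, here's a rough Python sketch of the search. Nothing here talks to a BIOS: `is_stable` is a hypothetical stand-in for "set the limit, reboot, and run your crashing game for a while", and the wattages and the 275W threshold are invented for the example.

```python
from typing import Callable, Optional


def find_highest_stable_limit(
    start_watts: int,
    max_watts: int,
    step_watts: int,
    is_stable: Callable[[int], bool],
) -> Optional[int]:
    """Raise the limit step by step; return the last value that passed, or None."""
    last_good = None
    limit = start_watts
    while limit <= max_watts:
        if not is_stable(limit):
            break  # instability returned, so stop raising the limit
        last_good = limit
        limit += step_watts
    return last_good


def pretend_stress_test(limit_watts: int) -> bool:
    # Hypothetical: in reality this is a manual test run, not a function call.
    return limit_watts <= 275  # invented threshold for illustration


result = find_highest_stable_limit(200, 350, 25, pretend_stress_test)
print(result)
```

If even the starting value fails, the function returns `None`, which is the hint that you either need to start lower or that the power limit isn't the problem.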
 
I had a similar issue compiling my code using the GCC compiler with my 13900K. Changed the power limits and now all is good. The strangest thing is I never used to have the problem then one day things changed. I spent a week trying to identify what changed and gave up. Changing the power limits fixed the compiler issues and crashes I was getting in Call of Duty with the latest update.
 
Update for anyone following still. After conferring with Paul, our CPU expert, he talked to some people he knows and suggested tweaking loadline calibration. On the MSI board, it's a bit hidden in the BIOS, but this also appears to have 'fixed' my issues, in a slightly different manner. I've added this paragraph to the article:
-------------
As another alternative, tweaking the Load Line Calibration (LLC) to increase voltages can help. Loadline calibration is related to the amount of voltage supplied under lighter loads, and the idea here is that the motherboard vendors may be undershooting on voltage in some cases (i.e. to keep temperatures down), leading to instability. Again, with our own 'problem' CPU sample on an MSI board, going into the advanced CPU settings and then changing CPU Lite Load Control from 'Normal' to 'Advanced' exposes two additional settings: CPU AC Loadline and CPU DC Loadline. The default Auto setting used a value of 50 for AC on our system and 80 for DC. We bumped the AC up to 70 (without the decreased power and current limits) and it also seems to be stable.
-------------
YMMV, but I'm pretty convinced these aren't "faulty" CPUs but rather a combination of silicon lottery plus too aggressive default BIOS settings. Because let me tell you, I have been poking at and building PCs for over 30 years and I know a lot more than the average Joe. If I'm having some issues tracking down the problem and fixing it in the BIOS, most users wouldn't even know where to begin. Your typical PC users often view the BIOS as a scary black box with mystical settings and arcane labels. (They're not wrong!) I can certainly imagine some people will just try the RMA route rather than trying to understand what might actually be going on.
 
IMHO, a user who doesn't know how to configure the BIOS should not buy a beast like a 13900/14900.
 
Then why is Intel accepting them for replacement under Warranty, instead of telling people to update their BIOS? If it were truly the motherboards being too aggressive, I don't see why Intel would shoulder that financial burden, when it could advise motherboards on how they need to adjust their BIOS limits and just redirect customers to the board vendors.
 
When someone asks for an RMA, especially if it's in relatively small numbers, Intel would have to accept the initial request based purely on whatever the user says. "Hey, my system is unstable and crashing because my CPU is faulty." Without further information gathering, Intel doesn't really have a recourse, and default denying would be far worse for publicity IMO. I'm sure some small number of chips get sold that are bad, so there would always be a trickle of RMAs, but it's probably like 0.1% or less.

How high are the numbers now? Probably still quite small, and given Intel sells 2 billion CPUs per year, that's 5.5 million CPUs per day. Lots of those are not Core i9 obviously, and there are embedded, etc. CPUs. But even a thousand "bad" chips per day would be a drop in the bucket.
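
For anyone who wants to check the back-of-the-envelope math, here's a quick Python version. It takes the 2 billion per year figure and the thousand-per-day "what if" straight from this post as assumptions; neither is a verified number.

```python
CPUS_PER_YEAR = 2_000_000_000  # figure quoted in the post, not verified
BAD_PER_DAY = 1_000            # the post's hypothetical number of "bad" chips

cpus_per_day = CPUS_PER_YEAR / 365
bad_share = BAD_PER_DAY / cpus_per_day

print(f"{cpus_per_day / 1e6:.1f} million CPUs per day")
print(f"{bad_share:.3%} of daily volume")
```

Under those assumptions, a thousand chips a day works out to roughly 0.018% of volume, which is well under the 0.1% baseline guessed at above.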

The other aspect is what happens with an RMA. If Intel receives a "good" CPU back for RMA, and it sends back a different "good" CPU that was previously RMA'ed, that's not a problem. I don't know for sure what Intel is doing, but presumably it tests "bad" CPUs to determine if they're actually bad or not. Or perhaps it's just easier and less costly to send out a replacement chip and not worry about testing.
 
The "good" CPUs they're getting can't just be sent out to someone else who's experiencing the same problem.

Does anyone know if Intel ships refurbished CPUs as warranty replacements? I know HDD companies do this, but it's not necessarily true of Intel. If they're not refurbs, then the cost is non-trivial for them to fulfill warranty replacements.
 
Realistically, the cost of the chips would be somewhere south of $100 each. For Core i9 parts that sell for $600, sending a replacement is basically like the Apple "you already paid for two" fix. Whether it's worth testing a returned CPU or not is of course a separate question. Maybe they give RMA'ed chips to their employees? 😀

But RMA is really such a messy subject. Whether it's mobos, graphics cards, CPUs, HDDs, SSDs, or whatever other part, generally the cost of dealing with RMAs is factored into the original sale price. So yeah, Intel might just dump the "potentially bad" CPUs in the rubbish for all we know. Having someone take the time to swap CPUs and test them to the point where Intel could confidently say, "This CPU is fine" is probably not worth the effort.
 
I'm aware of this, but if they didn't use an underwriter or reinsurance, then it means every warranty replacement comes straight out of their profit margins. They're definitely not "free", even if they were budgeted for. And when they budget, they surely base the estimates on prior defect rates. If they have a particularly problematic product or generation, it will exceed their budget and that's extra bad!
 
As I told you a couple of pages back, LLC is the most likely culprit, especially on some motherboards starting with A, which usually set the SVID behavior to "best case" instead of the Intel defaults.
 
Lots of companies don't even test a product before they go through with an RMA. A company that makes mice, keyboards, and the like just asked me to send them a picture of my keyboard being hammered or with its cord cut, and then they proceeded with the RMA and sent me a new one.
 
Yeah, but as usual, it wasn't clear exactly which settings to change, nor what value to use. Even once you find the LLC settings in the MSI BIOS (which took a bit of digging), you then end up with Auto or settings of 10–90 in increments of 10. Paul finally answered my queries and told me that I should go higher, starting with 60 and moving up.

[Incidentally, this is what I mean about BIOS settings being a scary black box. Like, I know what a lot of the settings are supposed to do, but then you get things like LLC with nebulous settings. There are also some related settings that are like "Level 1" through "Level 9" and no indication of what the auto setting is using. So, having previously found a workaround, I wasn't keen on tweaking BIOS settings to try potentially ten different options.]

So, I changed the power and current limits back to higher settings (didn't go whole-hog, though, as I don't like the CPU running at up to 100C, so leaving a 350W limit in place seemed more sensible). Then I tried bumping the LLC to 60 and running The Last of Us. I got an almost immediate crash during the shader compiling, so I rebooted and set LLC to 70. Tried again and TLOU got through the initial launch just fine.

Now I need to see about sticking an Nvidia card back into the system and determining whether the driver installation still fails or not. Which will probably happen tomorrow, as I'm currently (re)testing an AMD GPU...
 
Yes, the Auto setting not showing what value it actually uses is an MSI thing; it happens on my Z690 Unify-X as well.

For the most part, DC LL is "useless", since it's only used for power draw calculations (with a proper DC LL value your VID will match Vcore, and therefore the power draw reported by something like HWiNFO will be accurate). AC LL is the determining factor for stability.
 
Recommended solution: get a full refund from Intel and move to AMD as fast as you can.

In my case, I've had six RMAs (i9-13900K, i9-13900KF, i9-13900KS, i9-13900KS, i9-13900KS, i9-14900K), so trust me, the RMAs don't end. That's even with PL1 and PL2 at 253W. (Which is a WTF for me: I bought an unlocked CPU only to limit it to the same spec as a non-K CPU, which doesn't make sense.) The alternative is Intel Fail Safe, which applies high voltage and destroys the CPU quickly.

You can contact Intel to get a full refund, or get a replacement CPU and sell it.
 
Cool story bro. And I'm batman
 
I've loaded the new BIOS (2202) for the Asus Z790 Hero, which includes the "Intel Baseline Profile", and it does seem to solve the problem, based on a quick test in one game that used to crash on startup. The 3DMark CPU Profile test shows that 32-thread performance dropped 16% and 16-thread performance dropped 10%, while 8 threads and fewer are within 1%, so games aren't really affected. I also ran the CP2077 benchmark, which tends to be heavy on the CPU and likes many cores, and the score was within 1% of what it was before.
 
I noticed that ICCMax in this profile was really low and the CPU never reached PL1/PL2. I've set it to 400A; it still works fine, and the performance drop is now 5% at 32 threads and 2% at 16 threads.
 
This literally started happening today on a six-month-old i9-13900KF build when launching Fortnite. I had updated the GPU drivers beforehand and thought that must be why, but then I checked the game logs:

Well, this form won't let me paste in the exact error (it comes across as spam or something), but the gist of it is: 'The CPU (13th Gen Intel(R) Core i9-13900KF) may be unstable'

I'm still investigating this, but looks like I'm in for a long night 😵 I'm expecting Epic Games to say this is a hardware issue.
 