News Intel finally announces a solution for CPU crashing errors — claims elevated voltages are the root cause; fix coming by mid-August

bit_user

Titan
Ambassador
Intel said:
Intel is currently targeting mid-August for patch release to partners
So, they think they've found the cause and a solution, but they're going to just let this problem fester for another month before a solution can realistically get into the hands of end users??

Wow, I'm sure glad I don't have a Raptor Lake that's continuing to degrade, in the meantime. You'd think they could at least post some tips for enthusiasts to follow, in order to minimize damage until then. I guess the community is left to follow the advice people have discovered on their own.

I'm guessing they're trying to ride a knife's edge of undervolting, to avoid sacrificing either too much performance or risking instability due to too little voltage. I'll bet the voltage window is really narrow, at those higher frequencies. That's the main reason I can see why it'd take them so long to perfect their solution, before letting anyone else even have a beta version.
 
This would certainly explain why it has taken so long to narrow down and predominantly affects high clocking chips. The T series may be low power, but they have higher boost clocks than i5s.
Wow, I'm sure glad I don't have a Raptor Lake that's continuing to degrade, in the meantime. You'd think they could at least post some tips for enthusiasts to follow, in order to minimize damage until then. I guess the community is left to follow the advice people have discovered on their own.
So much this as I understand why it's taking so long to fix, but to give no advice as to what to do in the meantime or mitigation techniques is bad. At this point I think Intel needs to add two years to the warranty period for boxed processors.
 

DS426

Proper
May 15, 2024
91
63
110
RMA for what? Get another defective same model or a step down to an i7 or i5? Get money back and then swap out the whole platform for AMD? (I know I would if I was one affected by this problem)

I just want to know how Intel is going to make it right for the customers. While that's great that they finally came out and said something, they're leaving too many questions on the table this far into the debacle.
 
Honestly, I don't buy it as the real reason.

There are claims of motherboards running tame settings that have the issue (some server-grade ones), and voltage on them was likely never near being too much, so even if it was a voltage issue, it likely wouldn't have been too high for those ones, yet they still had the issue.

This just sounds like a "best excuse w/o getting sued" statement.

Also, if it IS true it's hilariously comical, as Intel has always been a joke, only getting better by pushing more voltage every generation, and it finally biting them in the butt would be the icing on the cake.
 
Yes, all that automatic overclocking every mobo vendor has been doing is having severe consequences for chips. Guaranteed Intel doesn't have a test scenario where they run a chip 20~50% above voltage in high-usage scenarios (games) and see what happens.
 

vinay2070

Distinguished
Nov 27, 2011
280
65
18,870
So, they think they've found the cause and a solution, but they're going to just let this problem fester for another month before a solution can realistically get into the hands of end users??
That is probably one of the reasons, or there may be more that they are not telling end users. Hopefully they find the right root cause and fix as much as possible via BIOS updates by then, if at all possible.
 

PCWarrior

Distinguished
May 20, 2013
215
96
18,670
Video on Intel's announcement from Robeytech (who sort of serves as Intel's mouthpiece):
View: https://www.youtube.com/watch?v=wkrOYfmXhIc


Wendell & GN pointed to oxidation issues with the vias, with problems at the foundry level.
That is just speculation for the sake of speculation and verging on scaremongering. Intel officially stated what the root cause of the problem is, and they really cannot lie about it. Also, the reason they stated makes perfect sense given the experience of people who dealt with the issue and solved it, like in this video from Framechasers.
View: https://www.youtube.com/watch?v=afN6SaT21cQ
 

vinay2070

Distinguished
Nov 27, 2011
280
65
18,870
So much this as I understand why it's taking so long to fix, but to give no advice as to what to do in the meantime or mitigation techniques is bad. At this point I think Intel needs to add two years to the warranty period for boxed processors.
I have been switching between Intel and AMD CPUs for a long time now, and if I have to trust Intel again, they need to release a BIOS fix that WILL fix the issue or, like you said, provide an extended warranty. Otherwise I am not buying Intel ever again. Currently on Ryzen and thankfully not affected by this.
 

NinoPino

Respectable
May 26, 2022
332
200
2,060
So, they think they've found the cause and a solution, but they're going to just let this problem fester for another month before a solution can realistically get into the hands of end users??
I agree, it is a lot of time, and with continuous RMAs it will cost a lot of money.
So I think there must be a good reason to wait another month. Maybe they are searching for a way to not hurt performance, or the problem is not as simple as described.

... You'd think they could at least post some tips for enthusiasts to follow, in order to minimize damage until then. ...
They already did: "The company had previously advised its customers to stick with the basic power guidelines for its processors, rather than running them at fully unlocked settings, as it worked through the issues."
 

NinoPino

Respectable
May 26, 2022
332
200
2,060
Honestly, I don't buy it as the real reason.

There are claims of motherboards running tame settings that have the issue (some server-grade ones), and voltage on them was likely never near being too much, so even if it was a voltage issue, it likely wouldn't have been too high for those ones, yet they still had the issue.
If the problem is in the microcode, the motherboard type does not matter.
 
They already did: "The company had previously advised its customers to stick with the basic power guidelines for its processors, rather than running them at fully unlocked settings, as it worked through the issues."
I think the point here is that if the problems happen on boards locked to Intel specifications (W680), then that alone isn't good enough to prevent this from happening or getting worse. Given that they're effectively saying it's something with the VF curve, that would imply the voltages needed to reach the high clock speeds are driving this issue. That's why you'd see the problem more often with, say, a 13900T than a 13600K (the former boosts to 5.5GHz stock while the latter is 5.1GHz).
 

setx

Distinguished
Dec 10, 2014
231
161
18,760
Sounds oddly familiar with the issue AMD just fixed last year
It seems familiar only at first glance. There, the overvoltage was caused by motherboard vendors and didn't affect people running full stock settings. Experienced overclockers were also not affected, as it's obvious that you don't want to leave voltages on Auto, because motherboards (at least Asus ones) have been setting them to crazy levels for a long time now.

Here it's some mysterious microcode that is encrypted, and no one external can really tell whether Intel is telling the truth or just managed to use it to work around some hardware defect.
 
May 21, 2024
4
7
15
A microcode patch is an easy fix that would not have taken half a year to investigate and a few more months to roll out. A bug in the microcode for voltage control may be a sufficient condition to cause the symptoms, but that doesn't yet make it "the" root cause. I believe there are more necessary conditions behind the root cause, and it's most likely a physical cause rather than a software/algorithmic one.

One example I have had in mind is the chips' rapid switching between ~20x and ~50x ratio multipliers many times per second; the corresponding voltage jump is close to 1V, from ~0.6V to 1.56V. With each of the 8 P-cores switching independently, the L3 cache and ring bus have to deal with all the independent voltages thousands of times per second across all the cores. I suspect the ring bus and L3 cache simply cannot bear the huge voltage jumps and differences all the time, and thus degrade. Some say disabling E-cores helped a bit; I suspect there is typically a large voltage gap at the ring bus position between the P- and E-cores, which most likely gets hit hard, and disabling them keeps processes from crossing the defective site on the ring.

The new microcode may help reduce the voltage differences across core nodes on the ring, but it will be more complicated to actually fix the frequency dynamics to reduce voltage-change stress on the ring while still providing the advertised 5.x GHz, which takes high voltage spikes to achieve.
 

tamalero

Distinguished
Oct 25, 2006
1,159
165
19,470
RMA for what? Get another defective same model or a step down to an i7 or i5? Get money back and then swap out the whole platform for AMD? (I know I would if I was one affected by this problem)

I just want to know how Intel is going to make it right for the customers. While that's great that they finally came out and said something, they're leaving too many questions on the table this far into the debacle.
The new ones could have the new microcode and would not degrade.
But then... it makes you wonder if there is a performance hit.