News Intel's CPU instability and crashing issues also impact mainstream 65W and higher 'non-K' models — damage is irreversible, no planned recall

Status
Not open for further replies.
Mar 10, 2020
421
387
5,070
Do you acknowledge that there is some degree of degradation in the 13/14 gen 65W parts? (Hint, Intel does)

The extent of the deg in individual parts cannot be assessed by the users, there are no tools.

The majority of users won’t confidently update bios. (Windows update as a delivery method?)

So a brand new unit has 100% of its life left before terminal deg. Even in normal everyday use, not abused and not overclocked, the chips are being stressed by the high voltages the microcode bug causes.
Every day the chip moves closer to 1% remaining life.

Here I ask you a question. If you knew the amount of deg, at what limit would you NOT buy a 13/14 gen chip from eBay?

That number is not knowable by us. Intel sold devices which people bought expecting a good solid lifespan. (I note from your signature that you have a 2016 cpu)
5 to 10 years is not an unreasonable expectation for the average cpu. These devices are being degraded in normal use, that deg is accelerated compared to normal. They are failing in numbers which are higher than normal. (If they weren’t, Intel wouldn’t have to get a fix out.)

Yes it would cost money but they sold a FAULTY product. One that cannot be repaired.
 
Jul 29, 2024
1
4
15
This is a real shambles from Intel - something like eleven SKUs of 13th/14th gen CPUs are affected (a bit surprised this article failed to list the exact models!), including my 13600K I bought back in Dec 2022 (luckily still working OK).

Don't forget the oxidation issue in 2023 too (I dodged a bullet there) - it's laughable that Tom's Hardware repurposed an old "which is best - Intel or AMD CPUs?" article from 2020 just after posting this article: they slapped a 2024 label on it, mentioned the microcode issue (but failed to mention the oxidation issue) and then gave Intel the win!
 
May 21, 2024
15
27
40
What I worry about most is not that my CPU is degrading and that I've lost advertised performance to keep it working. I fear this disaster will set a new low standard for future CPUs, where manufacturers keep boosting limits at the cost of longevity and say "you should expect to buy a new CPU every year".
 

pixelpusher220

Distinguished
Jun 4, 2008
224
109
18,760
Tell that to Ford, which has at least 150,000 cars on the road right now with fuel leak and fuel injector issues from the last couple of years that they will not fix unless one catches fire, or the millions of Kia and Hyundai cars which will not be fixed but will only have a fuse put in to lower the chance of a fire.
Or Kia, with something so bad that the advice was "Don't park it indoors or it may burn your house down", but hey, keep driving it for 10 months before we even issue a fix.
 
Do you acknowledge that there is some degree of degradation in the 13/14 gen 65W parts? (Hint, Intel does)
There may be abnormal degradation, but that doesn't mean it is guaranteed.
The extent of the deg in individual parts cannot be assessed by the users, there are no tools.
Intel absolutely needs to do something with regards to this. While there won't be a guaranteed list of which parts are most likely affected they should be able to write software that pulls off of VID information to tell people whether or not they're likely to have had issues.
Here I ask you a question. If you knew the amount of deg, at what limit would you NOT buy a 13/14 gen chip from eBay?
I would never buy a used CPU in the first place unless it was for a really old platform. There's no way to know what people have done to them, and with a situation like this one it's now doubly dangerous.
That number is not knowable by us. Intel sold devices which people bought expecting a good solid lifespan. (I note from your signature that you have a 2016 cpu)
5 to 10 years is not an unreasonable expectation for the average cpu. These devices are being degraded in normal use, that deg is accelerated compared to normal.
Unfortunately the warranties don't reflect this reality which companies use as a scapegoat. I've said in other posts here that I absolutely think Intel needs to extend the warranty on these parts. There's no way for an end user to determine the likelihood of damage unless it's already experiencing signs.
Yes it would cost money but they sold a FAULTY product. One that cannot be repaired.
If someone's CPU fails due to this it's something Intel absolutely should cover (and as I said above I think they should extend the warranty). That doesn't mean Intel should be throwing away billions of dollars to rectify the situation when the majority don't appear to be negatively affected.
 

pixelpusher220

Distinguished
Jun 4, 2008
224
109
18,760
So we go back to this same thing yet again: why should Intel spend billions of dollars on a recall that doesn't seem to warrant it?
Because they would like people to trust them in the future? This would be a mild single-year hit to *profit*, not even a loss. Take the pain, show your customers you go 'above and beyond' for YOUR mistakes.

Or try to coast on the "Nobody ever got fired choosing [insert current behemoth]" strategy from the 70s. There are other options...every time a leader takes the cheap road, they alienate more of the market.
 

bit_user

Titan
Ambassador
It is mostly marketing material, but in the end Intel lists only the E and TE versions, which are specific to embedded, as IoT devices.
You didn't click where it says "View All".

The entire list of Gen 13 CPU models Intel is recommending for Industrial & Embedded computing is:

Tier | 65W Mainstream | 35W Mainstream | Embedded | Low-power Embedded
i9 | i9-13900 | - | i9-13900E | i9-13900TE
i7 | i7-13700 | i7-13700T | i7-13700E | i7-13700TE
i5 | i5-13400, i5-13500 | i5-13500T | i5-13400E, i5-13500E | i5-13500TE
i3 | i3-13100 | i3-13100T | i3-13100E | i3-13100TE

Again, this is taken from the page I previously cited: https://www.intel.com/content/www/us/en/products/details/embedded-processors/core/13thgen.html

Perhaps @thestryker also failed to click "View All".
 
Mar 10, 2020
421
387
5,070
There may be abnormal degradation, but that doesn't mean it is guaranteed.

Intel absolutely needs to do something with regards to this. While there won't be a guaranteed list of which parts are most likely affected they should be able to write software that pulls off of VID information to tell people whether or not they're likely to have had issues.

I would never buy a used CPU in the first place unless it was for a really old platform. There's no way to know what people have done to them, and with a situation like this one it's now doubly dangerous.

Unfortunately the warranties don't reflect this reality which companies use as a scapegoat. I've said in other posts here that I absolutely think Intel needs to extend the warranty on these parts. There's no way for an end user to determine the likelihood of damage unless it's already experiencing signs.

If someone's CPU fails due to this it's something Intel absolutely should cover (and as I said above I think they should extend the warranty). That doesn't mean Intel should be throwing away billions of dollars to rectify the situation when the majority don't appear to be negatively affected.
Put simply, people put their faith in intel.

They believe that the merchandise Intel produces will be reliable. There will always be failures but these are generally in the low (very low) percent of units produced.

Intel are not respecting their customers. Currently no extension to warranty, no mention of a recall, no mention of less restrictive RMAs.

To deny there is a problem is futile, do your research, the evidence is easy to find. Intel have acknowledged that there is a problem … they are ultimately responsible, they have deflected and the blame has returned to them. Their actions are disingenuous at best.
 

bit_user

Titan
Ambassador
if part of the issue is the complexity of the differing needs between E and P cores then Bartlett may be the ONLY viable fix... it doesn't have E cores so won't have the issues with the down stepping/upstepping etc.
FWIW, disabling the E-cores has been tried with relatively little success.

Bartlett Lake CPUs aren't due until 2025 and only the i9 and i7 tier models are P-core only.
 

bit_user

Titan
Ambassador
So we go back to this same thing yet again: why should Intel spend billions of dollars on a recall that doesn't seem to warrant it?
Because they would like people to trust them in the future? This would be a mild single-year hit to *profit*, not even a loss. Take the pain, show your customers you go 'above and beyond' for YOUR mistakes.
Another way to do this (at lower cost to Intel) is to extend the warranty period and relax the terms for claiming it. It has been suggested that retail boxed CPUs go from a 3-year warranty to 5, which I think is reasonable.

They can separately arrange a deal with OEMs, but they will have added costs and many will probably drag their feet on passing along the extended warranty coverage to their customers. This is where most of the reputational damage could occur.
 

Taslios

Proper
Jul 11, 2024
54
76
110
FWIW, disabling the E-cores has been tried with relatively little success.

Bartlett Lake CPUs aren't due until 2025 and only the i9 and i7 tier models are P-core only.
Yes, but those attempts were on chips that had already been damaged. The Bartlett chips will never have E cores, so if the issue is that the differences between P and E cores cause problems on the bus, then those problems cannot happen without both types of cores.

if the issue is JUST the bus, then removing the E cores will have no effect and these cpus may be just as problematic as Raptor Lake.
 

slightnitpick

Upstanding
Nov 2, 2023
237
156
260
Your description of the auto industry is not accurate. If a major issue shows up, they will/must recall all the vehicles potentially affected for repair or replacement.
Yes, and here the upcoming microcode update is the recall. Recalls don't happen immediately upon problem discovery, even in things as serious as automotive recalls: https://onlinelibrary.wiley.com/doi/10.1111/jscm.12160
Results from event history analysis reveal that discovery-to-recall is longer for: (1) recalls that are triggered by external initial reports, rather than internal initial reports; (2) recalls that are attributed to suppliers, rather than automakers; (3) recalls that are associated with design flaws, as opposed to manufacturing flaws; and (4) recalls with more models involved.
This Intel issue has three out of four.
And they certainly won't sell new vehicles with those problems. Doing anything would lead to legal liability, major lawsuits and financial catastrophe.
Only for new car sales, and that due to a federal law (US). Used cars can be, and are, sold when subject to recall. https://www.consumerreports.org/car...ll-guide-your-questions-answered-a1115780728/

I personally bought a used car a couple of years ago and then received an airbag recall notice in the mail a few months later - this recall had occurred about 10 years previously. The recall authorities kindly check for change of ownership to inform new owners of old recalls.
 

bit_user

Titan
Ambassador
The Bartlett chips will never have E cores
I said only the i9 and i7 will lack E-cores, but it appears a version of the i5 will also use this die:

[Image: nvlbTypw4UhhVv3Z.jpg]


Source: https://www.techpowerup.com/324571/...y-bartlett-lga1700-processor-for-2025#g324571

if the issue is JUST the bus, then removing the E cores will have no effect and these cpus may be just as problematic as Raptor Lake.
I'm not sure about that, since disabling the E-cores is said to have some benefit. On Alder Lake, disabling them enabled the ring bus to clock higher. On both Alder Lake and Raptor Lake, disabling them also seems to disable the associated slice of L3 cache.

Anyway, I'm not convinced the issue is really as simple as the E-core clusters. I will await better evidence to support that theory. I mainly just wanted to clear up what we know about the Bartlett Lake model lineup.


P.S. the Bartlett Lake 12P dies will have the same number of ring bus stops as Raptor Lake 8P + 16E! That's because each 4 E cores share a single ring bus stop.
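
To make the arithmetic behind that P.S. explicit, here is a minimal sketch. It assumes only what is stated above (one ring bus stop per P-core, one stop per cluster of 4 E-cores) and ignores any other agents on the ring. The function name `core_ring_stops` is made up for illustration.

```python
# Assumption from the post: each P-core has its own ring bus stop,
# and every cluster of 4 E-cores shares a single stop.
E_CORES_PER_STOP = 4


def core_ring_stops(p_cores: int, e_cores: int) -> int:
    """Count ring bus stops used by CPU cores only (other ring agents ignored)."""
    if e_cores % E_CORES_PER_STOP != 0:
        raise ValueError("E-cores are assumed to come in clusters of 4")
    return p_cores + e_cores // E_CORES_PER_STOP


raptor_lake = core_ring_stops(p_cores=8, e_cores=16)    # 8 + 16/4
bartlett_lake = core_ring_stops(p_cores=12, e_cores=0)  # 12 + 0
print(raptor_lake, bartlett_lake)  # prints "12 12"
```

So under that counting, removing the E-core clusters does not shorten the ring on a 12P die.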
 

CmdrShepard

Prominent
BANNED
Dec 18, 2023
531
428
760
So what you're actually saying is that you think Intel is lying about GC being affected.
No, that's not what I said -- all I said is that if the flaw is architectural (which frankly wouldn't surprise me because I don't expect that this kind of stuff is controlled in microcode by default) SPR could be affected too.
This would be why in the wild the CPUs predominantly affected are the i9 variety.
You are disregarding the factors that skew the statistics -- i9 owners are more likely to try to push the chip harder, after all they paid for the fastest chip. They are also more likely to complain than i5 and i7 owners. And there are non-K users who probably bought some prebuilt system and just shrug and reboot when it crashes instead of reporting.
You're absolutely ignoring binning which controls the programmed VID. While the bug may exist in every CPU that doesn't mean the negative effects will be. It's not like there's an arbitrary voltage number being pumped through every CPU with the faulty algorithm.
If only there was some article where all this was already laid out for you to read:

FTA1:
erroneous microcode instructing the CPU to ask for more voltage than was safe

FYI, that's the bug which is supposedly going to be fixed in August and which has nothing to do with eTVB and boosting whatsoever. So yes, every CPU has a faulty voltage selection algorithm.

FTA2:
The crashing issues could impact any Raptor Lake or Raptor Lake Refresh chip drawing 65W or more power. Furthermore, the bug also affects the mainstream non-K models and their K/KF/KS counterparts
You don't seem to grasp that this isn't some sort of architectural silicon bug. If it was then the T series parts would be listed as being affected.
You can't know that it isn't architectural unless Intel decides to tell us. They didn't explicitly spell out T series parts but they did say mainstream non-K models, and T series qualifies for that.
It's about the way the CPUs boost and the voltage required to get there.
Again, no. RTFA.
 

rluker5

Distinguished
Jun 23, 2014
906
586
19,760
VID is how much the CPU requests, vcore is how much it actually gets from the mobo. The "goal" should be to make these 2 match, for starters, because that's the only way to get actual power usage from hwinfo. If vid and vcore don't match your power readings will be off. In order to make these 2 match you need to play around with AC / DC LL on advanced lite load settings. Setting up the correct values is based on motherboard's vrms mostly.

As you can see from my SS, min / avg / max vcore and VID are as close as possible. Maybe could tune a bit further to close that 0.003 difference but im too bored to do that

[Screenshot: image-2024-07-29-081155957.png]
Thanks for the heads up. In an ideal world those two readings would match. But on my Asus Prime Z690 P (w 13900kf) and my Aorus Z690i Ultra Lite (13600k), the externally measured vcore * VR measured current matches the reported CPU power within a few watts, under load, across the frequency spectrum, unless I really push the sliders somewhere weird. That holds even when the CPU requested volts are a couple to a few dozen mV less than the measured volts. In that case the CPU requested volts * amps supplied comes out less than the watts reported. But that is just on the two lower end Z boards I have. I can't speak for other motherboards.

It would be nice to have the two matching, but I don't want to start typing out AC / DC LL values when I'm used to just picking a number, and then I'd have to keep them in a file in case I ever put in a new bios. But it was worth checking. Motherboard settings seem pretty suspect nowadays.
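
For anyone following the VID vs vcore point, a rough sketch of why a mismatch skews a power figure. It uses nothing more than P = V * I. The readings below are hypothetical round numbers, not measurements from either board mentioned here, and the sketch does not claim to show how hwinfo or any motherboard actually derives its reported power.

```python
def power_watts(volts: float, amps: float) -> float:
    """Electrical power, P = V * I."""
    return volts * amps


# Hypothetical readings, chosen only to show the effect of a 30 mV gap.
vid_volts = 1.250    # voltage the CPU requests (VID)
vcore_volts = 1.280  # voltage actually measured at the CPU (vcore)
vr_amps = 100.0      # current reported by the voltage regulator

from_vid = power_watts(vid_volts, vr_amps)
from_vcore = power_watts(vcore_volts, vr_amps)
gap = from_vcore - from_vid

print(f"VID-based: {from_vid:.1f} W, vcore-based: {from_vcore:.1f} W, gap: {gap:.1f} W")
```

With the same current, a few dozen mV of difference already moves the estimate by a few watts, which is the size of discrepancy being discussed.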
 
No, that's not what I said -- all I said is that if the flaw is architectural (which frankly wouldn't surprise me because I don't expect that this kind of stuff is controlled in microcode by default) SPR could be affected too.
So Intel isn't lying, but maybe Intel is lying? You can't have it both ways like you're trying to here.
You are disregarding the factors that skew the statistics -- i9 owners are more likely to try to push the chip harder, after all they paid for the fastest chip. They are also more likely to complain than i5 and i7 owners. And there are non-K users who probably bought some prebuilt system and just shrug and reboot when it crashes instead of reporting.
It is beyond absurd to suggest that somehow this is all because of purchaser psychology.
If only there was some article where all this was already laid out for you to read:

FTA1:
erroneous microcode instructing the CPU to ask for more voltage than was safe

FYI, that's the bug which is supposedly going to be fixed in August and which has nothing to do with eTVB and boosting whatsoever. So yes, every CPU has a faulty voltage selection algorithm.
The voltage selection algorithm isn't just randomly applying voltages at random times. It's based on the voltage requests which only go up under load and boost. There's no debate over whether or not every CPU has this flaw, but rather whether or not it damages them all.
FTA2:
The crashing issues could impact any Raptor Lake or Raptor Lake Refresh chip drawing 65W or more power. Furthermore, the bug also affects the mainstream non-K models and their K/KF/KS counterparts

You can't know that it isn't architectural unless Intel decides to tell us. They didn't explicitly spell out T series parts but they did say mainstream non-K models, and T series qualifies for that.
Here let me copy the actual Intel quote for you since you're too ridiculous to hunt it down yourself:
Intel Core 13th and 14th Generation desktop processors with 65W or higher base power – including K/KF/KS and 65W non-K variants – could be affected by the elevated voltages issue. However, this does not mean that all processors listed are (or will be) impacted by the elevated voltages issue.
Base power on T series parts is 35W, so...

Also woah now did Intel literally say that not all CPUs are impacted by the elevated voltage issue?!
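
As a minimal sketch of how that base-power wording reads as a rule: the 65W threshold comes from the Intel statement quoted above, and the two base-power figures are the ones listed elsewhere in this thread. The function name `could_be_affected` is hypothetical, and a `True` result only means a part falls inside the group Intel describes, not that a given chip is damaged.

```python
BASE_POWER_THRESHOLD_W = 65  # threshold from the Intel statement quoted above

# Base power figures as listed elsewhere in this thread.
base_power_w = {
    "i9-13900": 65,   # 65W mainstream part
    "i7-13700T": 35,  # 35W T series part
}


def could_be_affected(base_power: int) -> bool:
    """True if the part is in the 'could be affected' group; not a damage test."""
    return base_power >= BASE_POWER_THRESHOLD_W


for model, watts in base_power_w.items():
    print(model, could_be_affected(watts))
```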
 

NinoPino

Respectable
May 26, 2022
489
305
2,060
You didn't click where it says "View All".

The entire list of Gen 13 CPU models Intel is recommending for Industrial & Embedded computing is:
Tier | 65W Mainstream | 35W Mainstream | Embedded | Low-power Embedded
i9 | i9-13900 | - | i9-13900E | i9-13900TE
i7 | i7-13700 | i7-13700T | i7-13700E | i7-13700TE
i5 | i5-13400, i5-13500 | i5-13500T | i5-13400E, i5-13500E | i5-13500TE
i3 | i3-13100 | i3-13100T | i3-13100E | i3-13100TE


Again, this is taken from the page I previously cited: https://www.intel.com/content/www/us/en/products/details/embedded-processors/core/13thgen.html

Perhaps @thestryker also failed to click "View All".
I clicked, but only E and TE models are marked as IoT in the column "CPU Category".

E suffix = NEX/IoT corp/mainstream embedded road map SKU.
TE suffix = NEX/IoT low-power embedded road map SKU.

CPU Part Number | CPU Category | Processor Threads | Intel® Smart Cache (L3) | Processor Base Power | Single P-Core Turbo Freq | Single E-Core Turbo Freq | P-Core Base Freq | E-Core Base Freq | Graphics Execution Units (EUs)
Intel® Core™ i9-13900E | IoT | 32 | 36 MB | 65W | Up to 5.2 GHz | Up to 4.0 GHz | 1.8 GHz | 1.3 GHz | 32 EU
Intel® Core™ i9-13900TE | IoT | 32 | 36 MB | 35W | Up to 5.0 GHz | Up to 3.9 GHz | 1.0 GHz | 0.8 GHz | 32 EU
Intel® Core™ i9-13900 | Mainstream | 32 | 36 MB | 65W | Up to 5.6 GHz | Up to 4.2 GHz | 2.0 GHz | 1.5 GHz | 32 EU
 

NinoPino

Respectable
May 26, 2022
489
305
2,060
...
Here let me copy the actual Intel quote for you since you're too ridiculous to hunt it down yourself:

Base power on T series parts is 35W, so...

Also woah now did Intel literally say that not all CPUs are impacted by the elevated voltage issue?!
This is one of the problems with the situation: Intel should publish the full list of affected models instead of giving generic guidance that is open to interpretation and creates ambiguity.
 

ThomasKinsley

Notable
Oct 4, 2023
385
384
1,060
Also woah now did Intel literally say that not all CPUs are impacted by the elevated voltage issue?!
I am completely out of my league in this conversation, but Intel's statement is consistent with what greymaterial suggested when he said C0 dies are likely not affected.
so far all the cases known involve the B0 die, but none from the C0 die. B0 die has 8 raptor cove p-cores (w/ 2MB L2$) and 4 clusters of gracemont e-cores, i7 configuration has some cores disabled; C0 die is from ADL with 1.25MB L2$, and only 2 clusters of e-cores. Some 13th gen i5 use C0 instead of B0 dies, and those are not affected.
 