News Puget says its Intel chips failures are lower than Ryzen failures — retailer releases failure rate data, cites conservative power settings

Status
Not open for further replies.
Here ya go... It's pretty far down the page.

https://www.pugetsystems.com/blog/2...-perspective-on-intel-cpu-instability-issues/

Here's a copy/paste of the relevant part I was quoting from the release:

"The most concerning part of all of this to us here at Puget Systems is the rise in the number of failures in the field, which we haven’t seen this high since 11th Gen. We’re seeing ALL of these failures happen after 6 months, which means we do expect elevated failure rates to continue for the foreseeable future and possibly even after Intel issues the microcode patch.

Based on this information, we are definitely experiencing CPU failures higher than our historical average, especially with 14th Gen. We have enough data to know that we don’t have an acute problem on the horizon with 13th Gen — it is more of a slow burn. We do expect an elevated failure rate on 14th Gen while Intel finishes finding a root cause and issuing a microcode update."
Indeed, it was there from the beginning, but that's also what caught my eye: Puget is being the more responsible SI. They said, more or less neutrally, that it isn't an "acute" issue for 13th gen but a slow death, so they expect the failure rate to rise within their warranty period. It won't be an avalanche of something like a hundred PCs rushing back to them, so they can take care of them one by one.

This basically says there is an issue even with their more conservative profile; their lower power and clocks just slow it down, much like the early days when Intel asked affected users to lower the multiplier to run stable.

Also, quite a few of the errors reported by Wendell came up under very strict error testing, where even a single non-fatal error is counted, so a chip might fly past Puget's test suite without that specific error being caught. Time will tell the full story, but IF chips still degrade like this with Puget's very strict power profile, good luck expecting that enough i9s in the wild won't need replacement within the warranty period, or that they won't just find excuses to decline your RMA again.
 
Either way, it seems Puget is doing a better job of handling their customers than Intel. I won't be buying Intel any time soon. I don't want to gamble. And after today's initial reviews of the 9000 SKUs, it looks like I would do better with the 7000 SKUs considering price/performance. Hoping Micro Center will offer some tasty bundles soon!
 
Notice how they focus on field failure rates. They are basically saying they are experiencing unusually high field failure rates. Not unusually high failure rates, period.

It doesn't say that they are expecting a much higher failure rate like you suggested in your previous post. Instead, they are saying that the elevated failure rates they are experiencing will continue.

Obviously Puget would prefer if 13th and 14th gen were so horrible that they died much quicker than they are currently dying, so they don't even make it to the field. It's kind of a weird situation where the best fix for Puget would be to make 13th and 14th gen as unreliable as Zen 4, making them so bad that they basically fail before they even make it out of the door.
 
1) Field failures are CPUs degrading after use. They can't just "make them fail more in the shop" when the chips have design flaws and Puget's settings can't stop them from degrading. That's why they expect the microcode update to help but not eliminate the issue, since the systems are assumed to have been under heavy use since purchase.

2) If you look at the number failing per month, not the %, you can see the failure counts are on par with and surpassing the 11th gen era.
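
To put rough numbers on point 2, here is a minimal sketch, with made-up sales volumes and rates (not Puget's figures), of how a generation with a lower failure % can still produce more failed units if many more of them were sold. The function name and all inputs are hypothetical.

```python
def failure_count(units_sold: int, failure_rate: float) -> int:
    """Absolute number of failed units for a given sales volume and failure rate."""
    return round(units_sold * failure_rate)


# Hypothetical inputs for illustration only, NOT Puget's data.
older_gen = failure_count(units_sold=1_000, failure_rate=0.05)  # higher %, fewer units sold
newer_gen = failure_count(units_sold=5_000, failure_rate=0.02)  # lower %, more units sold

print(f"older gen: {older_gen} failed units at 5%")
print(f"newer gen: {newer_gen} failed units at 2%")
```

With these invented inputs the generation with the lower percentage still sends back twice as many chips, which is why the raw count and the % can tell different stories.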
 
I've tried and tried to explain all this to the Herald. They either
1: will not admit the problem is bigger than it should be and that the CPUs are going to continue to fail at an elevated rate (and it's growing), and they are a fanboy posting misleading information just "because Intel" and trying to sway people to gamble their money on a known bad product.

2: just don't understand and are completely oblivious to the facts being presented by places like Puget Systems and even OEMs the world over, and ignore anything that goes against their views on Intel. (Because, Intel)

I'm pretty much done with them. But what upsets me is people posting misleading and/or misinterpreted data which is obviously aimed at swaying people into believing there is not a problem... Almost all the big OEMs have extended their warranties because of the issues. There was an article on The Verge detailing all this with the OEMs today. They contacted all the major players to get some good information about these CPUs and warranties. Because facts are facts and the CPUs are failing.

Sorry for the ramble. Lol


\m/
 
Oh hey!!! Nice to see you again! I see you're still at it... Just not worth my time on old news. Guess what.... Field failures are what happens after the PC is in customer hands... Hmmm almost as if it were good when it left their facility. Then the CPUs are starting to fail at a higher than normal rate...



Wonder why this is? Maybe they explained it, or what about Intel? Do they have an explanation?



This is my last reply to you on this post. I know you won't admit the fact that CPUs are degrading faster than normal and that there are CPUs in the wild that are experiencing oxidation. This is all factual knowledge. I've had fun... But this gets tiring. Hope all goes well for you.
 
Field failures are CPUs that passed Puget's tests, were sent to the customer and then failed after a certain amount of time. Shop failures are CPUs that failed before even passing Puget's tests.
 
Of course I won't admit that Intel CPUs are degrading faster than normal, unless you are comparing them with 12th gen. If you compare them with Zen 4, they degrade way slower. Zen 4 chips start throwing errors (at double the % rate, btw) before they even make it out of the door, according to the data.

Again, I'm basing this on Puget's data.

1% of 13th and 14th gen start showing errors within a couple of hours. Another 1% start throwing errors after months or years of usage.

4% of Zen 4 CPUs start throwing errors within a couple of hours.

Please explain to me how 13th and 14th gen are worse? I'd like to hear your opinion.
 
Field failures are CPUs that passed Puget's tests, were sent to the customer and then failed after a certain amount of time. Shop failures are CPUs that failed before even passing Puget's tests.
We have known that all along. So the CPUs are perfectly fine when shipped out, but they degrade within a year and come back to Puget for RMA, which means the Puget settings didn't save them from degradation. A DOA can be anything from RAM incompatibility to BIOS issues or a manufacturing defect, but that is not degradation, it is a chip slipping past QC. And given the sheer numbers in the Puget chart, their % means almost nothing: they simply sold more Intel 13th/14th gen than they did 11th gen, so the defective vs working % is skewed, and since RPL is relatively new and runs at very low power settings, it has not had as many returns yet. AMD? Who knows if they have sold nearly as many. Because Puget gets much more Intel sales, the % of failures is partly randomized by luck.

Fun fact: even Intel admits they are having serious trouble, yet you keep insisting they are better.
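
The "randomized by luck" argument can be sketched with a standard Wilson score interval for a proportion. The unit counts below are invented for illustration and are not Puget's numbers; the point is only that a percentage drawn from a small sample comes with a wide margin of error.

```python
import math


def wilson_interval(failures: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a failure proportion (z=1.96 is roughly 95% confidence)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = failures / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical samples, NOT Puget's data: 4% of a small batch vs 2% of a large one.
small_sample = wilson_interval(failures=4, total=100)
large_sample = wilson_interval(failures=40, total=2000)

print(f"4/100   -> {small_sample[0]:.3f} to {small_sample[1]:.3f}")
print(f"40/2000 -> {large_sample[0]:.3f} to {large_sample[1]:.3f}")
```

With these made-up counts the two intervals overlap, so "4% vs 2%" on its own would not settle which population is worse. Whether that applies to Puget's figures depends on sales volumes that are not given in this thread.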
 
Does it matter if it's "degradation" or the chip already ships not working? The end result is the same: 4% of the chips are unusable day one, vs 2% of the chips being unusable after months or years.

You understand Puget is running both Intel and AMD at default settings? Do you understand that's what I'm talking about? That when both are run at those low, safe settings, AMD has a much bigger problem. Intel admitting to problems is completely irrelevant, because they are not talking about "safe" settings like the ones Puget is using, are they?
 
Of course it matters, a ton, if it is degradation.

In the normal chip lottery, a certain % of chips will be defective and pass initial screening because it doesn't include stress testing. For any consumer, SI or DIYer, a defective chip that crashes in a quick stress test is the best kind of defective chip.

As a normal, sane hobbyist, you put in your chip, RAM and maybe GPU, connect the power, install a basic Windows and stress test the CPU for 15-30 min before doing any of the case assembly and cable management. You notice the issue immediately, stop before the complex installation of everything else and before putting your work on it, then go back to the shop within the 7-day return period for a new one, slap it in, run the test fine and you are good to go for years.

With degradation, you do all the careful tuning, run stress tests to make sure it is stable and get on with your work, and then it bites you in the butt within months and crashes your work and games randomly. You have to work out which part is at fault and dismantle everything, and with the immediate 1-for-1 exchange period expired, you are stuck with a chipless PC as decoration and your work on hold until the RMA comes back, with a chip that slowly degrades again and repeats the process. Everyone using or selling a PC would rather have a chip that is dead at the factory, so they can screen it out, swap it and move on with no future trouble.

And as I said, don't look only at the stupid %. 14th gen is already failing in greater numbers than the 11th gen they sold, yet the % is low; that just means the base number of RPL sold is higher and the random failures get diluted. And with a normally working PC, if it isn't crashing to the point of being completely unusable, say it just crashes once every few days, most users won't even file for a return or repair. They just reset some settings and keep using it until it BSODs left and right before they contact the SI. So any field failure rate is a big issue, much bigger than failures in the shop.
 
Of course I won't admit that Intel CPUs are degrading faster than normal, unless you are comparing them with 12th gen. If you compare them with Zen 4, they degrade way slower. Zen 4 chips start throwing errors (at double the % rate, btw) before they even make it out of the door, according to the data.

Again, I'm basing this on Puget's data.
You're not, because you don't know the uniformity of that data across models or times. In other words, you're generalizing. For all we know, Puget just got one or two bad batches of AMD CPUs. Maybe they were even X3D models.
 
Maybe they got one or two bad batches of Intel chips too.

You are just making up theories to explain the high failure rate on AMD chips, while I'm solely reading what the data shows.
 
As a normal, sane hobbyist, you put in your chip, RAM and maybe GPU, connect the power, install a basic Windows and stress test the CPU for 15-30 min before doing any of the case assembly and cable management. You notice the issue immediately, stop before the complex installation of everything else and before putting your work on it, then go back to the shop within the 7-day return period for a new one, slap it in, run the test fine and you are good to go for years.
Really? Who does that? I've never done that in my life.

But let's say that every single person on the planet does this. You are proving exactly my point. I even wrote this a few posts ago.

See, let me quote myself

It's kind of a weird situation where the best fix for Puget would be to make 13th and 14th gen as unreliable as Zen 4, making them so bad that they basically fail before they even make it out of the door.

This is what you are also suggesting: that if you make them so horrible that they fail within the first couple of stress tests like their competitors, it would be a net positive. Well, maybe Intel is working on it, let's see what the patch does.
 
Maybe they got one or two bad batches of Intel chips too.
Actually, no. They provided timelines of their Intel CPU failures. We have no such data for AMD.

You are just making up theories to explain the high failure rate on AMD chips, while I'm solely reading what the data shows.
No, my point is that you lack the information to interpret the AMD data the way you're trying to. You're trying to claim it's a general problem with Ryzen 7000 CPUs, but their data is insufficient to support that conclusion. No matter how many times you reply, this won't change.
 
Really? Every tech-savvy person or computer shop trainee in my region does that: barebones testing before all the trouble, to make sure the basic components (CPU, MB and RAM) are stable before putting everything in, just to avoid having to take it all out again after the first stability test. Since the days of the Pentium 3.

No, you cannot trade failing on the street for failing in the shop; that is not how degradation works. For RPL, the design flaw, the too-aggressive binning or the overlooked microcode will kill it slowly, over months, not within an hour of stress testing. Stress testing is for overclocking or for out-of-factory duds caused by manufacturing issues. Degradation is like wear and tear or human aging; when it is a generational problem you cannot make it show up in the shop.

I never said Puget wouldn't prefer Intel chips to die in the shop rather than in the field. I have always said dying in the field is the real issue. Dying in the shop, when the rate is below 10% in significant quantities, is not a huge issue: you get 100 chips, find a test bed, run them all through the 30 min-1 hr suite as final QC, then put them in the real system for another quick test run, and bam, you are ready to go. But failing in the field, especially when the failure-rate curve isn't steady or slowly declining, is a growing risk and a big one. That's why a random dud at purchase isn't a big issue, unless some really unlucky individual gets several in a row, while degradation on RPL causes massive outrage.
 
Well, I never do that because the likelihood of getting a dud CPU is so tiny that it's not really worth it. But maybe that's because I buy Intel. I guess if you are buying AMD, with such huge DOA numbers, you need to go through that process.
 
Please explain to me how 13th and 14th gen are worse? I'd like to hear your opinion.
For Puget (or any other SI) it should be obvious why it's worse.

For a regular consumer while both are equally bad I think it's fair to say that having something that kicks back errors or fails right away is still better than randomly having issues x amount of time down the road.

The overall percentage may lean towards AMD being slightly worse but without timelines it's just not possible to say definitively they're worse. Whereas we have the timeline data to show increasing failures over relatively short time compared to the CPU lifespan on the Intel side of things.
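
As a rough illustration of why the timeline matters, the sketch below builds a cumulative failure curve from the month in which each unit failed (month 0 meaning it failed in the shop). The data and the function name are hypothetical, not Puget's; both made-up populations end at the same overall 4%, but one fails entirely up front while the other keeps climbing.

```python
from collections import Counter


def cumulative_failure_rate(failure_months: list[int], units_built: int, horizon: int) -> list[float]:
    """Fraction of all units that have failed by the end of each month 0..horizon."""
    per_month = Counter(failure_months)
    failed_so_far = 0
    curve = []
    for month in range(horizon + 1):
        failed_so_far += per_month.get(month, 0)
        curve.append(failed_so_far / units_built)
    return curve


# Hypothetical data, NOT Puget's: 4 failures out of 100 units in both cases.
fails_in_shop = cumulative_failure_rate([0, 0, 0, 0], units_built=100, horizon=12)
fails_in_field = cumulative_failure_rate([7, 9, 11, 12], units_built=100, horizon=12)

print("month  shop   field")
for month, (shop, field) in enumerate(zip(fails_in_shop, fails_in_field)):
    print(f"{month:>5}  {shop:.2f}   {field:.2f}")
```

The first curve is flat after month 0, so anything that passes testing stays good. The second is still rising at the end of the window, which is the pattern that makes it hard to say where the final number will land.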
 
Well, I never do that because the likelihood of getting a dud CPU is so tiny that it's not really worth it. But maybe that's because I buy Intel. I guess if you are buying AMD, with such huge DOA numbers, you need to go through that process.
I've bought Intel literally since the Core Duo came out, and am only now getting hit by the RPL headache. People do this because back in the day you needed to test RAM compatibility first, and mobos were duds more often, so you didn't want to plug everything in before testing. 99% of the time it's fine, but it's a practice that makes testing a lot easier, especially when building an SFF PC.

[edit]
Mind you, I chose the 12700KF and then got a 14900K precisely because HISTORICALLY Intel has been rock stable and better optimized for Adobe apps, so I chose it for my video encoding and Photoshop needs rather than going AM4/AM5 back then, and this time I've been let down by Intel big time.
They used to be the gold standard of CS and reliability, but this degradation is a complete mess which could bring them down big. How annoyed are Intel's customers? See the major OEM interviews by The Verge.
 
For Puget (or any other SI) it should be obvious why it's worse.

For a regular consumer while both are equally bad I think it's fair to say that having something that kicks back errors or fails right away is still better than randomly having issues x amount of time down the road.

The overall percentage may lean towards AMD being slightly worse but without timelines it's just not possible to say definitively they're worse. Whereas we have the timeline data to show increasing failures over relatively short time compared to the CPU lifespan on the Intel side of things.
I'm not disputing that, that's exactly what I'm saying. That if Intel dropped the ball even more to match Zen 4, their CPUs would fail in the shop and the issue would fix itself.

I've bought Intel literally since the Core Duo came out, and am only now getting hit by the RPL headache. People do this because back in the day you needed to test RAM compatibility first, and mobos were duds more often, so you didn't want to plug everything in before testing. 99% of the time it's fine, but it's a practice that makes testing a lot easier, especially when building an SFF PC.

I've never done this. I don't think I've ever even booted with stock settings. First thing I do is get into the BIOS, tune the RAM, undervolt and go from there.
 
Lol, if you tune the RAM, undervolt and start from there, testing BEFORE putting everything in is even more important, as you are lowering the manufacturer's stock settings, which have literally ALWAYS been safe except for this RPL issue. Since the days of the Athlon XP and Pentium 3, I have never heard of factory stock settings putting so much power into a CPU that you need to undervolt. Historically they were so darn conservative and safe that you only needed to go in and tweak the performance up, not down, and the chip would still live for 10+ years.

My Athlon 64 was OC'd at stock voltage from day one and survived gaming and daily use from 2003 to 2011. Then I got the i7 2600K Sandy Bridge, again with a cheapo cooler and a day-one OC, and it survived until 2022, when I got Alder Lake and my current Gigabyte mobo, which stabilized after 6 months of BIOS updates (I blamed Gigabyte). Then I upgraded to a preordered 14900K, and for the first time ever I needed to undervolt on day one just to keep it from being a bloody heater, followed by this whole degradation drama.

https://www.pugetsystems.com/labs/a...n-review/#Motion_Graphics_Adobe_After_Effects

And a bit off track: in Puget's review, Zen 5 is much more competitive against the 14th gen i5 and i7, even though the evil GN and Jayz reviewed them unfavourably, and this quote says a lot about Puget's confidence level in the current situation:

Second, as many of our more tech-focused readers are likely aware, Intel has recently been in the news for a number of issues causing high failure rates and permanent damage to their 13th- and 14th-generation CPUs. Although we haven’t experienced nearly as much of that issue as others in the industry (see our President’s blog post for our experience and views so far), it is still a factor to bear in mind when comparing Intel and AMD processors. We are confident in our workstation configurations, but we encourage caution when purchasing an Intel CPU right now until we see whether Intel’s promised microcode update addresses the issue.
 
Actually, no. They provided timelines of their Intel CPU failures. We have no such data for AMD.
They didn't provide timelines of when they bought the Intel CPUs. I presume they have enough installations that they buy them a tray, or more, at a time, and then install those CPUs over the course of a couple of months. It's probably unlikely that they are buying more than a couple of months' worth at a time, but it is also unlikely that they're buying them in a just-in-time manner.

Still, we know from Intel's own publication that this issue affects a huge swath of chips, not any particular batch of chips. If it was any particular batch of chips I'm sure they would let everyone know, as that would be better spin than "all of our Raptor Lake chips".

It would be nice to know about the AMD chips as well.
 
It would be nice to see the AMD failures over time, but we know that there were lots of problems with memory compatibility when Ryzen 7000 launched.
We are seeing the same thing today with the 9000 launch.

If I remember correctly, those issues were often CPU specific - the same RAM might work fine in another system with the same model CPU and MB.
If I were building a system, I'd call that a shop failure.

I wish that AMD would wait a month or two and get a more compatible AGESA version before it releases new CPUs.
 