News Ryzen Burnout? AMD Board Power Cheats May Shorten CPU Lifespan

Perhaps, just because it would be better for good will, but there's no real way to know without trying. Anyone here actually have a dead Zen or later CPU they've tried to get replaced? It seems like it would be relatively easy for AMD to have some small indicator in a chip that would 'blow' to indicate over-current, over-voltage, or over-power operation.

My own experience with RMAs has been checkered -- not for CPUs, but for other parts. I have had multiple GPUs fail on me over the years. One, an R9 290X, apparently burned itself out. All I knew was that it stopped working, so I sent it in. Three weeks later I got a response back saying my RMA was denied "due to physical damage" with a photo showing a circled resistor or something. Can't say it made me happy, since it was my own card and not a review sample.

I saw a tech reviewer (can't recall which one, BitWit, maybe) try out the return process with Intel (Intel specifically) after "overclocking". He tried four or five times to see the variance between the reps he talked to. The reps never asked specifically about CPU OCing, but the first question they asked was "what speed was your RAM running at?" If you answer higher than 2666, they say that is running beyond their spec and the CPU may have been damaged as a result, so it's your fault.

But after playing a bit dumb and trying really hard I think he was able to get the return on 3 out of the 5 (sorry memory is sketchy, but that's ballpark). So it seems like the individual rep you talk to has a say in whether they try to deny the return or not. There was no physical inspection of the chip at all, the decision was made over the phone based on how well you could convince the rep that they should give you a replacement.
 
I watched that. Gamers Nexus.
View: https://www.youtube.com/watch?v=I2gQ_bOnDx8
 
Eh, I'm really not a fan of this 'hey, we're just asking questions!' defense. Even if TH had innocent intentions, this approach is something you see all the time in news and politics as a way to insinuate something without actually having any evidence. The publication can then point to the question mark if anyone accuses them of dishonesty (or of libel, depending on the situation).

So let's say someone publishes an article with the headline "Does TJ Hooker kick puppies?". Sure, the article itself could go on to clarify that no, there's no reason to think that TJ Hooker does that, but we all know (including the publisher) that many people will only see the headline. So we've now planted the idea in people's heads that whether TJ Hooker kicks puppies is up for question, which implies a non-zero chance that he does. And that might end up being repeated and twisted in other articles to become something like "Some people are saying that TJ Hooker kicks puppies" ("some people say" is another sneaky way to make poorly supported accusations).

I'm being a little dramatic here, but the 'just asking questions' phenomenon, particularly in headlines, is a bit of a pet peeve for me because the practice is so rife with cases of intellectual dishonesty. Frankly, I don't think it would have been hard to predict that people might read too much into the title as it's written, nor would it have been difficult to avoid this (if desired) with different wording.
Here's the thing: there's at least one motherboard maker (Paul will say who in a follow-up article) that is really pushing the limits as to what you should do with a Zen 2 processor. There are others that are 'cheating' but only by 7-10% on most boards. If I want to overclock, let me assume the responsibility -- don't do it with absolutely zero indication to the user what's happening.

It's bad enough that were I using a board with this 'feature' and saw how much power the CPU was actually pulling at 'stock' I would be alarmed. And I have had this same reaction to some of the Skylake-X parts back in the day on certain X299 motherboards.

This was not intended to cast AMD in a bad light. The text of the article makes all of this very clear. It's the motherboard makers cheating the system, and potentially causing a lot of headaches for users. The headline is more eye grabbing, sure, but this is a PSA for anyone running a Zen 2 chip.

This is a major controversy for the involved parties -- not AMD, but specifically at least one motherboard maker. Having a 95W chip pull down over 200W at stock is not acceptable behavior. If you have a Zen 2 chip, it's easy to test the behavior of your motherboard. For a lot of people, it's probably going to be no big deal, but with some boards people will realize, "Holy crap, THAT'S why my system seemed to run hot!"

I know in testing Ryzen 3000 last year, I had a couple of boards that at the time behaved weird. One of them performed quite a bit better at "stock" than with PBO turned on. Power draw (at the system level) for a few boards was a lot higher than on the main board I used for the reviews at PC Gamer. One board, I saw about 200W of power at 'stock' during Cinebench, but 150W when I turned on PBO. All I could think at the time was, "WTF, PBO must be broken on this board," when in fact PBO was probably working as intended but stock performance was artificially high.

This 'cheating' finally explains why that was happening, and frankly it pisses me off. I wasted hours trying to get a test system to run 'correctly' and couldn't figure out what was going wrong. In hindsight, I'm sure at least one of the boards was just pushing things way too far.
 
Steve at Gamers Nexus addressed it too, if y'all would rather see that...
View: https://www.youtube.com/watch?v=10b8CS7wQcM


After checking that out, my view is now:
Nothing new to see here. It's just the usual crap from motherboard vendors trying to one-up each other...
Although it does kinda paint Asus in a better light compared to the other vendors, as they seem to be sticking the closest to Intel and AMD's guidelines for their boards out of the box.
This was exactly my hypothesis in a post on the first page of the comments. Go figure that it was true.

@JarredWaltonGPU I have been doing a lot of testing on my 3900X system (ASUS X570-F, BIOS 1407, AGESA 1.0.0.4). The BIOS on my motherboard has an option called Performance Bias, a drop-down that lists 3 benchmark programs (including R20) plus an Auto setting. When the R20 option is used, it changes the deviation reported by HWiNFO by about 3% on average. I wonder if this is relevant. Anyway, I am looking forward to the next article on the topic by TH.
 
OK, now I would say this news is nothing more than clickbait... The headline is totally wrong....

I believe many have read the article posted at AnandTech. In case you didn't:

https://www.anandtech.com/show/15839/electromigration-amd-ryzen-current-boosting-wont-kill-your-cpu

This is not even PBO (which is supposed to void your warranty). This is NOT what overclockers normally do to their CPUs. So I won't even call it the same as what overclocking does....

1. It cannot go above the preset clocks and voltage. Your 3600 will not become a 3600X.

2. It's similar to a GPU power limit. We increase the limit to reduce throttling under load. But this slider does NOT allow you to increase your clock speed and voltage beyond the preset values.

3. Both AMD and Intel are already doing it. One good example is their U series: the default is 15W, but it can be changed to anywhere from 10W to 25W. Of course, better cooling is needed at a 25W TDP.
 
Good to know that AMD has chimed in with a statement reassuring that this won't cause imminent (within a couple of years' time) harm to their CPUs.
I agree.

If people ever do see an issue with any CPU / motherboard combo, I hope they are at least as detailed as the thread-starter over at the CPU vendor forums who declared his board, voltages, etc., and posted a follow-up Dec. 24th on his results with under-volting (which is what I was researching as a favor for someone with an 1800X who wants some tests run on my son's 3800X):
Normal All Core Voltage
Anecdotal, certainly (as could be related to motherboard, power supply, cooler, etc.). But at least he provided some real information that could prove useful when combined with (only) hundreds of other similar reports detailing MB, CPU and voltage settings. Ideally, the Auto Overclocking built into some BIOS would automatically provide both floor and ceiling data that gets stored as a reference... I'm asking too much.
 
I know in testing Ryzen 3000 last year, I had a couple of boards that at the time behaved weird. One of them performed quite a bit better at "stock" than with PBO turned on. Power draw (at the system level) for a few boards was a lot higher than on the main board I used for the reviews at PC Gamer. One board, I saw about 200W of power at 'stock' during Cinebench, but 150W when I turned on PBO. All I could think at the time was, "WTF, PBO must be broken on this board," when in fact PBO was probably working as intended but stock performance was artificially high.
Yeah, you are confusing something here. This is from the article this topic is about.
The BIOSes reduced the power draw that they were reporting to the CPU (and to monitoring software). They did not increase the power draw numbers that you could see; they decreased them, which has two results. First, the CPU will always run as fast as possible and won't throttle, which could potentially cause harm because it could run above TDP forever, although the chances of harm are very slim. Second, of course, the reported power draw is way below what it actually is, which is why we have so many reviews praising Ryzen for low power draw... yeah, due to mobos under-reporting certain power telemetry data.
We are aware of the reports claiming that select motherboards may be under-reporting certain power telemetry data that could alter the performance and/or behavior of AMD Ryzen processors under certain conditions. We are looking into the accuracy of these reports.
 

It appears you're suggesting the Ryzen's low power vs current Intel has nothing to do with the nodes they are on, but rather, they only appear low because of this situation ... we can easily evaluate that ...

When your measurement for power draw is full system draw at the wall (which it sometimes is), you are measuring actual. Whatever is being reported anywhere within the system is irrelevant to this number.

So, nice theory, but no ....

power-consumption-645x499.png
 
Yeah, you are confusing something here. This is from the article this topic is about.
The BIOSes reduced the power draw that they were reporting to the CPU (and to monitoring software). They did not increase the power draw numbers that you could see; they decreased them, which has two results. First, the CPU will always run as fast as possible and won't throttle, which could potentially cause harm because it could run above TDP forever, although the chances of harm are very slim. Second, of course, the reported power draw is way below what it actually is, which is why we have so many reviews praising Ryzen for low power draw... yeah, due to mobos under-reporting certain power telemetry data.
Lol is it another one of your jokes?
Remember the joke you made that the 3950X loses to the i9-9900K in multi-threaded apps?
And the joke that no one uses AVX and there is no need to measure it?
And the other joke, when you told us about the 10900K's average watts, 0.1% watts, and 1% watts, and said it is like your gaming fps and we should ignore peak watts like we ignore peak fps in gaming?
Peak watts = all cores load, why should we ignore it? Is it causing micro stuttering and you should sync it with your g-sync monitor?

Is your brain most of the time in sleep mode?
 
Yeah, you are confusing something here. This is from the article this topic is about.
The BIOSes reduced the power draw that they were reporting to the CPU (and to monitoring software). They did not increase the power draw numbers that you could see; they decreased them, which has two results. First, the CPU will always run as fast as possible and won't throttle, which could potentially cause harm because it could run above TDP forever, although the chances of harm are very slim. Second, of course, the reported power draw is way below what it actually is, which is why we have so many reviews praising Ryzen for low power draw... yeah, due to mobos under-reporting certain power telemetry data.
You're not understanding, at all, and making some incorrect assumptions because of that. I had a power meter plugged into the wall -- I do not trust software power readings because the drivers and firmware can lie. So, looking at total system power, turning on PBO on one board dropped the peak power use while running Cinebench from 200W to around 150W. I think it was a Gigabyte board, but it might have been Asus. It's been a year now, sorry if that's too fuzzy, and those numbers are +/- 10W. All I know is things were hectic at the time of the Zen 2 + Navi 10 launch, and some of the first X570 boards sent to reviewers felt raw. I ended up settling on an MSI X570 Godlike for additional testing after a BIOS update.
 
Then your issue probably has nothing to do with this. This is about review bioses reporting less power draw and thus causing the CPU to use more power.
If your BIOS used 200W and also reported 200W, then it's a different thing. If it used 200W and reported 150W or even less, which is a thing you did not check, then it would be connected to this issue.
 
Do you even try to comprehend what people write? Your response doesn't even make sense -- "this is about review bioses reporting less power draw..." No, it's about non-review BIOSes and stock behavior reporting less power draw, causing the CPU to effectively exceed its normal TDP and also run faster. Can I guarantee it was this specific 'cheat' that was being used? No, but if it wasn't this, it was something that resulted in the same sort of behavior. And it was definitely annoying and frustrating.

To be as explicit as possible, I had multiple boards at the Zen 2 launch. One of the boards for sure consumed more power at full CPU load running "stock" than it did when I enabled PBO, and also performed better at 'stock' -- my initial impression was that PBO was broken on the board. In reality, it was that PBO probably turned back on the correct reporting of power or whatever, which dropped performance. And a 50W discrepancy under load is freaking huge, especially when going from 'stock' to 'overclocked'!

What's being shown now is that certain motherboard firmware lies to the CPU about amperage / power, so that the CPU thinks it's using less power than it actually is and ends up running slightly overclocked and thus consuming more power. It raises the power limit, without actually letting users know that's what it's doing. However a board does this, it's improper behavior and should at least have an option to turn it off.
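
To put rough numbers on the paragraph above, here is a minimal sketch of the mechanism as described, not of AMD's actual boost algorithm. It assumes the CPU keeps boosting until the power it is told about reaches its package limit, and that the board reports a fixed fraction of the true draw. The function name and the `telemetry_scale` parameter are hypothetical, and the 142W figure is an assumed stock PPT for a 105W TDP part. Clock, voltage, and thermal limits still apply in practice, so treat the result as an upper bound.

```python
def actual_power_at_limit(ppt_limit_w: float, telemetry_scale: float) -> float:
    """Real package power at the moment the *reported* power reaches the limit.

    telemetry_scale is a hypothetical fraction of the true draw that the
    board reports to the CPU (1.0 = honest, 0.7 = reports only 70%).
    """
    if not 0 < telemetry_scale <= 1:
        raise ValueError("telemetry_scale must be in (0, 1]")
    # The CPU stops when reported == limit, and reported == actual * scale,
    # so actual == limit / scale.
    return ppt_limit_w / telemetry_scale


PPT_W = 142.0  # assumed stock package power limit for a 105 W TDP part

for scale in (1.0, 0.9, 0.7):
    real_w = actual_power_at_limit(PPT_W, scale)
    print(f"board reports {scale:.0%} of true draw -> limit reached at ~{real_w:.0f} W real")
```

Under these assumptions, a board that reports only 70% of the true draw lets the chip pull roughly 200W before the limit is seen as reached, which is in the same ballpark as the wall readings described in this thread.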

At the time, I swapped the CPU to a different board, tested again, and guess what: Power use was significantly lower than the first board (around 50-70W lower at the outlet IIRC). Performance was also a few percent lower at stock, but enabling PBO on the new board improved performance slightly, basically matching the first board. Power use also increased a bit, but not to the ~200W level of the first board. My assessment at the time was that the second board had better firmware and so I used it instead.

This was all power measurements at the outlet, so there was no software or BIOS misinformation skewing what I was seeing. It was a CPU at 'stock' running faster, probably because of this exact issue or something very similar. Of course at launch the vendors want the motherboards and CPU to look as fast as possible, but it more often than not causes problems and frustrations rather than impressing with the 1-3% performance boost. Did I check the software power reporting? No, because it wasn't necessary to see that something was off. The CPU (3900X) also ran hotter in the first board by 5-10C under load. Maybe it was something else, but it was bad enough that I didn't use that particular board for the launch review: https://www.pcgamer.com/amd-ryzen-9-3900x-review/

Lots of high-end enthusiast motherboards 'cheat' in similar ways to appear a bit faster, so in some ways this is all just typical behavior. But the good boards run true 'stock' when you tell them to, and then let you enable "performance enhancement" or whatever they want to call it to boost performance. Undisclosed overclock, overvolting, or exceeding power limits may or may not cause problems, but it's not correct default behavior.

The motherboard vendors have been cheating in various ways going back at least to the Pentium 4 era, and almost certainly before. FSB overclocks of 2-5% used to be common and sometimes still pop up. Removing power limits to boost performance is often the default behavior on enthusiast boards if you don't explicitly turn it off. And yet, the default behavior isn't to engage XMP profiles, which would almost certainly affect performance as much as these other shenanigans ... except that can also cause instability and compatibility problems.

And for what it's worth, nothing I've seen with power use on Zen 2 even remotely compares to some of the shenanigans of the X299 launch. I had a Gigabyte board that peaked at 450W power use during a Cinebench run, with an i9-7900X, before the system shut off -- at "stock." The CPU was running at 4.5GHz on all 10 cores, and radically exceeding safe power and temperature limits. I wrote about that as well: https://www.pcgamer.com/the-ongoing-testing-of-intels-x299-and-i9-7900x/
 

Ran Cinebench R20; it reports 30-31%. 3800X, X570 ASRock Taichi, BIOS 2.10, 1.0.0.3 ABBA. Performance drops if I enable PBO scalar x1. Only scalar x10 is faster than stock.
 
AMD's official response seems in line with the reasoning laid out in an elaborate article by Ian Cutress (AnandTech) on why he thinks that this misreported data by motherboards may not significantly degrade the CPU.

However, there are two elements of AMD's response that strike me:

First, it's the phrase "These safeguards enforce the safety and reliability of the processor during stock operation."

If a mainboard vendor has been tinkering with and altering the sensor data reported to the CPU, then I guess this is already outside of "stock operation", which allows one to speculate that AMD has possibly/probably not yet tested these "safeguards" under such artificial conditions created by motherboard vendors.

Secondly, this is why AMD is extremely careful when expressing their "belief": "Based on our initial assessment, we do not believe that altering external telemetry [...] would have a material impact on the longevity [...] of a user's processor."

I love AMD, but what some people tend to forget is that feedback is essential for improvement. Therefore, the discussion generated by HWiNFO's revelations is actually very good feedback for AMD that will hopefully nudge them to actually test these particular motherboard "hacks" and evaluate how much they may - or may not - impact the longevity / degradation of their CPUs.
 

Ran Cinebench R20; it reports 30-31%. 3800X, X570 ASRock Taichi, BIOS 2.10, 1.0.0.3 ABBA. Performance drops if I enable PBO scalar x1. Only scalar x10 is faster than stock.
I think you may have found the worst offender -- ASRock. That seems to be the general consensus going around right now. Asus has the least deviation, ASRock the most, MSI and GB are somewhere in between, usually closer to the 5-10% range.

Do you have a power meter at the wall where you can see how much power the system actually draws? It would be very interesting to know the stock / PBO numbers you're getting. I also wonder how HWiNFO64 determines the deviation -- is it based off some internal table for the program, or is there something else it can measure to know what the "real" power is? Because if it can figure out what the CPU is really using, shouldn't AMD's software / hardware be able to do that as well?
 

With HWinfo, once you know your deviation, you can make a multiplier for the Package power readout to display closer to the actual full load power draw. That doesn't correct the misreporting, however.
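
As a worked example of that multiplier, here is a small sketch. It assumes the deviation figure means "reported power as a percentage of real power, measured under full load", so 100% would be an honest board. The function names and the 80% / 100W values are purely illustrative and are not taken from anyone's system in this thread.

```python
def correction_multiplier(deviation_pct: float) -> float:
    """Multiplier to apply to the reported Package Power readout.

    Assumes deviation_pct = reported / actual * 100 at full load.
    """
    if deviation_pct <= 0:
        raise ValueError("deviation_pct must be positive")
    return 100.0 / deviation_pct


def corrected_package_power(reported_w: float, deviation_pct: float) -> float:
    """Estimate of the real package power from the reported value."""
    return reported_w * correction_multiplier(deviation_pct)


# Hypothetical readings: the board reports 100 W with an 80% deviation.
print(correction_multiplier(80.0))
print(corrected_package_power(100.0, 80.0))
```

As the post says, this only changes what the readout displays. The CPU still sees the skewed value.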
 
Perhaps, just because it would be better for good will, but there's no real way to know without trying. Anyone here actually have a dead Zen or later CPU they've tried to get replaced? It seems like it would be relatively easy for AMD to have some small indicator in a chip that would 'blow' to indicate over-current, over-voltage, or over-power operation.
I have no proof that AMD does this, but in the case of the Raspberry Pi, running above a certain frequency sets a bit within the Broadcom SoC itself. It's likely that AMD has a similar mechanism in place, but the question is whether that is ever checked when a CPU has failed. There's a good chance that you will get warranty coverage if you start the RMA process via the retailer.
 
Has anyone checked to see if this applies only to X570, or is it possibly also on other chipsets?

Just curious really, as I have a B450 board and wonder if I need to look into this at all. I have PBO enabled anyway, and that seems to disable this exploit either way. But just curious.
 
If no one wanted to be a bad cop to make more money this strategy wouldn't be widely used in media today.
Well, based on how this article was reported, it's clear that Tom's was covering it as an evolving story.

On Anandtech, Ian chose to address more the underlying concern, which is also appreciated. However, it should be noted that Ian didn't cite any sources or provide any data to back up his claims.

So, Tom's is playing a more journalistic role, while Ian is playing the role of an Op/Ed by an expert. Yes, differences in coverage, but I don't think the "good cop/bad cop" analogy fits or is fair to either.

If you go to Anand's comment board
No, let's not.

It's all just perspective ... humans don't see reality, they see their perceptions that are created by their biases and programming. Racism is a purely learned trait, for example - no human is born racist.
People are inherently tribal, and suspicious of outsiders. However, the brain is flexible in how it draws those lines. That's the part which is learned.
 
This still doesn't make sense to me. Even if the motherboard is lying, I'd still expect it to report 100% under full load (i.e. CB R20), assuming you're operating at power limits rather than voltage or thermal limits. It's just that what it's calling 100% might be 130W, rather than 95W as it's supposed to be.
I'll admit that it sounds a little weird. According to Ian's piece, there are 3 ways the CPU can be limited, and one of those is by the VRM.

If we ask ourselves what the point of a VRM-based limit is, the most obvious rationale is that the CPU is trying not to overtax the board's power-delivery capacity. In that case, a board with a beefier VRM should report lower relative load, for the same Watts being delivered to the CPU.

However, that's not what seems to be happening. Instead, it's like AMD is trying to use the VRM to help it lock the CPU into a certain power envelope. In this case, the motherboard not only needs to have calibration data on its VRM, but it should also scale that according to the power envelope of the specific CPU model that's installed. This is the only way the sensor is going to deliver a value of 255 - corresponding to 100% load at the VRM, regardless of the CPU model that's installed.

If the CPU was still being limited to its official power limits, there wouldn't be any difference in performance and there'd be no point in the mobo manufacturer skewing the parameter in the first place.
You're oversimplifying it. There are 3 limiting factors (PPT, TDC, and EDC), and the motherboard is fudging one of them. Yes, the CPU will eventually be limited by the other two, but maybe not as quickly.

And if it's not abiding by the official limits, then you have no reference against which to compare the potentially bogus values that the motherboard is reporting.

So again, I don't see how they could know the deviation without first knowing the "true" value.
Again, the testing methodology is to apply a workload which should max the Package Power. If it doesn't then you know the motherboard is lying.
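
A rough sketch of that check, under loud assumptions: the workload really is power-limited (not held back by TDC, EDC, voltage, or temperature first), you know the package power limit for your CPU, and a few percent of shortfall is just noise. The function name, the 5% tolerance, and the example wattages are all hypothetical. A shortfall is a hint that the telemetry is skewed, not proof.

```python
def looks_underreported(reported_package_w: float,
                        ppt_limit_w: float,
                        tolerance_pct: float = 5.0) -> bool:
    """True if reported power under an all-core load falls short of the
    package limit by more than tolerance_pct.

    Only meaningful if the workload should be power-limited; a chip that
    hits a current or thermal limit first will also fall short.
    """
    if ppt_limit_w <= 0:
        raise ValueError("ppt_limit_w must be positive")
    shortfall_pct = 100.0 * (1.0 - reported_package_w / ppt_limit_w)
    return shortfall_pct > tolerance_pct


# Hypothetical full-load readings against an assumed 142 W limit.
print(looks_underreported(140.0, 142.0))  # within tolerance
print(looks_underreported(100.0, 142.0))  # well short of the limit
```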