News AMD's Inception Fix Causes Up to 54% Performance Drop

Status
Not open for further replies.
Also note that MariaDB 11 with 4096 clients on IBPB is the big outlier: using safe RET with firmware is a smaller hit, and other DB benches showed far less of an impact. More than likely, a piece of MariaDB code does something that triggers a huge number of branches, causing a lot of overhead, and that may be fixed in a future release. I wouldn't be surprised if Intel's latest mitigation caused similar performance losses.
 

rluker5

Distinguished
Jun 23, 2014
The penalty seems very workload-dependent and averages a lot less than 54%.
It seems to hurt I/O performance a bit like the early Spectre fix did, which was bad for some games on my large-cache 5775C. I have no way of knowing if something similar will happen with the X3D, but I'd watch for it in games that favor that chip over its regular-cache sibling, if and when gaming benchmarks for these mitigations come out.

Personally, I turned off that Spectre mitigation for gaming. Not that these current mitigations should be off for servers, but if you are the only one you are knowingly putting at risk, it is your chip, your choice.
 
Aug 16, 2023
Interesting, but how would this affect a server user or an average Zen 3 user? Is the vulnerability worth patching, given the performance loss?
 

abufrejoval

Reputable
Jun 19, 2020
I guess the biggest takeaway is that clouds will get partitioned into a more expensive tier and a riskier one, driving up overall cost, as these mitigations most likely cannot be enabled at the VM level.

If you don't share your hardware with any workload you cannot control, it may well be worth considering disabling the mitigations.
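On Linux, that choice is made on the kernel command line. Here is a minimal sketch, assuming the kernel's documented `mitigations=` and `spec_rstack_overflow=` boot parameters. The function name `requested_mitigations` and the sample string are hypothetical; on a real system the text would come from `/proc/cmdline`.

```python
def requested_mitigations(cmdline: str) -> dict:
    """Return the mitigation-related options found on a kernel command line.

    Only two parameters are inspected: the global `mitigations=` switch and
    the Inception-specific `spec_rstack_overflow=` option. If a parameter
    appears more than once, the last one wins here (an assumption of this
    sketch, not a statement about how the kernel resolves duplicates).
    """
    wanted = ("mitigations", "spec_rstack_overflow")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            found[key] = value
    return found


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet spec_rstack_overflow=safe-ret"
print(requested_mitigations(sample))
```

As far as I know from the kernel documentation, `spec_rstack_overflow=` accepts `off`, `microcode`, `safe-ret`, `ibpb` and `ibpb-vmexit`, while `mitigations=off` disables the optional CPU mitigations as a group. An empty result simply means the kernel defaults are in effect.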

I'd also love to see those researchers have a go at the hyperscaler ARM chips to see if they do better.
 

salgado18

Distinguished
Feb 12, 2007
Also note that MariaDB 11 with 4096 clients on IBPB is the big outlier: using safe RET with firmware is a smaller hit, and other DB benches showed far less of an impact. More than likely, a piece of MariaDB code does something that triggers a huge number of branches, causing a lot of overhead, and that may be fixed in a future release. I wouldn't be surprised if Intel's latest mitigation caused similar performance losses.
I would like to see them benchmark this same software six months from now, with all updates applied. It's possible that a workaround can be done at the software level, and some of that performance loss is recovered. I mean, MariaDB developers won't just sit still when a firmware update cuts its performance in half, right?
 

kjfatl

Reputable
Apr 15, 2020
If there was ever a reason to upgrade a socketed CPU, this may be it. It could be a win-win for both AMD and the customer. The same applies to Intel. (No, it won't be free or be covered by a class action suit, if there is any sanity in this world.) A new CPU would be inherently faster and would have all known hardware issues fixed.
 

abufrejoval

Reputable
Jun 19, 2020
If there was ever a reason to upgrade a socketed CPU, this may be it. It could be a win-win for both AMD and the customer. The same applies to Intel. (No, it won't be free or be covered by a class action suit, if there is any sanity in this world.) A new CPU would be inherently faster and would have all known hardware issues fixed.
Sorry, but that's naive and wishful thinking.

The CPUs are exploitable because they cut corners in search of speed. Not cutting corners will cost speed, even if a little less when done properly in hardware.

Unlike in the FDIV case, nobody ever guaranteed that CPUs cannot be exploited via side-channel attacks. All code executes as described in the instruction set manuals. So I don't see any chance for a class action suit to succeed, especially if most people won't be personally affected.

And as you said: new CPUs can only ever fix the known issues, because protecting against out-of-order side channels requires an in-order architecture, and few people want to go back to that speed.

There might be an opportunity for vendors to support in-order "S-cores" for such security-critical code, but given the track record of all current hardware security support blocks, I'm not optimistic and would probably prefer a PCIe card (or even a USB stick) for that, which can be replaced much more easily.

Such a design, which might even be formally verified at the hardware description language level to contain no potential for speculative exploitation, might get the vendor attestation you'd need for a class action suit if it fails anyway. Until then it's much like demanding that your car shan't kill you or anyone else even if you're drunk or just not paying attention.
 

wbfox

Distinguished
Jul 27, 2013
Interesting, but how would this affect a server user or an average Zen 3 user? Is the vulnerability worth patching, given the performance loss?
Do you use anything on something called the world wide internets? If so, you use things that all of these vulnerabilities can exploit, and the servers serving you are what these exploits, and more, put at risk.
Is it worth patching? It all depends on what you have to lose and how much you need performance. No money, no credentials or identity worth stealing, no activities of interest to other parties, and only the need for max FPS in the lame shooter of the day? Then no, you have nothing to worry about. Otherwise, well, again, it all depends.
 

wbfox

Distinguished
Jul 27, 2013
If there was ever a reason to upgrade a socketed CPU, this may be it. It could be a win-win for both AMD and the customer. The same applies to Intel. (No, it won't be free or be covered by a class action suit, if there is any sanity in this world.) A new CPU would be inherently faster and would have all known hardware issues fixed.
How much were you paid to post this?
 

wbfox

Distinguished
Jul 27, 2013
I would like to see them benchmark this same software six months from now, with all updates applied. It's possible that a workaround can be done at the software level, and some of that performance loss is recovered. I mean, MariaDB developers won't just sit still when a firmware update cuts its performance in half, right?
Then bookmark the Phoronix website, where Tom's got this story and the stats, as Michael, the guy doing the benchmarks of all of the vulnerability fixes, does so regularly to see how bad it gets, how much they are able to streamline those fixes (if at all) down the road, etc.
 

bit_user

Polypheme
Ambassador
Definitely one of those articles that leaves a lot to be desired.
  1. The article quotes the performance of the slowest mitigation method, IBPB, not the default mitigation, which it only mentions at the very end! The Phoronix article also measured the default mitigation, called "safe RET", which is not quite as secure but also has less of a performance impact. Then, because the fixed microcode updates haven't finished rolling out, they also tested "safe RET" with the prior microcode version.
  2. The Phoronix article tested two different systems: EPYC 7763 and Ryzen 9 7950X! In the latter case, I think the "safe RET" mitigation is the only supported/necessary mitigation(?) and microcode has yet to land. So, "safe RET without microcode" was the only mitigation tested on it.
  3. The article included 47 different benchmarks of the EPYC 7763, which are themselves a subset of the thousands of tests in the Phoronix Test Suite, and this article opted to highlight just 9 of them. So, that's a cherry-pick of a cherry-pick.
  4. There were a further 37 benchmarks of the 7950X, some of which I see charts from, in the article. I don't know if the table includes data from any of those, but they shouldn't be mixed because they're different generations and are tested with different mitigations.
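To make the cherry-picking point concrete: a headline built on the single worst result says little about the typical case. Below is a minimal sketch with made-up relative scores (placeholders, not Phoronix's numbers) that compares the worst-case drop with the geometric-mean drop across a set of tests. The function name `summarize` is hypothetical.

```python
import math


def summarize(relative_scores):
    """Summarize a list of (mitigated / baseline) performance ratios.

    Returns a tuple (worst_drop_percent, geomean_drop_percent).
    """
    worst = min(relative_scores)
    # Geometric mean via logs: the usual way to average performance ratios.
    geomean = math.exp(sum(math.log(s) for s in relative_scores) / len(relative_scores))
    return (1 - worst) * 100, (1 - geomean) * 100


# Placeholder values for illustration only; not measured data.
scores = [0.5, 0.9, 0.95, 1.0, 1.0]
worst_drop, typical_drop = summarize(scores)
print(f"worst case: {worst_drop:.1f}% drop, geometric mean: {typical_drop:.1f}% drop")
```

With any mix like this, the "up to" figure and the average tell very different stories, which is why results from different CPUs and different mitigation modes should be summarized separately.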

I think Tom's really needs to do a bit better. This smacks of either laziness or sensationalism. Perhaps a bit of both.

Better yet, Tom's (just like anyone else!) can freely download the Phoronix Test Suite and run it themselves. You can browse results from a wide diversity of systems on OpenBenchmarking.org.
 

bit_user

Polypheme
Ambassador
More than likely, a piece of MariaDB code does something that triggers a huge number of branches, causing a lot of overhead, and that may be fixed in a future release
I wouldn't bet on it. In some cases, there's not a lot you can do to make the code faster. If it's inherently branchy, then that's just how it is. Unless a significant number of those branches can easily be converted to a couple of conditional moves, you're just out of luck.
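To illustrate the kind of rewrite meant here: a conditional move replaces a branch with plain data flow, so there is nothing for the predictor to guess. The sketch below shows the idea in Python purely as a demonstration of the transformation. Python itself gains nothing from it, and the function names are hypothetical. In C or C++, a compiler may emit a `cmov` for the equivalent code.

```python
def smaller_branchy(a: int, b: int) -> int:
    # Control flow depends on the data: on real hardware this becomes a
    # conditional jump that the branch predictor has to guess.
    if a < b:
        return a
    return b


def smaller_branchless(a: int, b: int) -> int:
    # Turn the comparison into an all-ones (-1) or all-zeros (0) mask,
    # then select with bitwise operations instead of jumping.
    mask = -int(a < b)
    return (a & mask) | (b & ~mask)


for pair in [(1, 2), (2, 1), (-5, 3), (7, 7)]:
    assert smaller_branchy(*pair) == smaller_branchless(*pair)
```

This only works when both sides of the choice are cheap to compute and free of side effects, which is why a lot of branchy code cannot be converted this way.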

I wouldn't be surprised if Intel's latest mitigation caused similar performance losses.
Are you talking about Downfall? That has a very different performance impact, mostly slowing down heavy users of AVX2 & AVX-512 gather operations.

If there was ever a reason to upgrade a socketed CPU, this may be it. ... New CPU would be inherently faster and would have all known hardware issues fixed.
No, because Zen 4 is also affected by the Inception vulnerability! If you look at the Phoronix article, you can see the test results for "safe RET without microcode" on Ryzen 9 7950X starting about two-thirds of the way through the article.
 

bit_user

Polypheme
Ambassador
The CPUs are exploitable because they cut corners in search of speed. Not cutting corners will cost speed, even if a little less when done properly in hardware.
It's not a fair characterization to say they "cut corners". That was true of prior generations, but I believe current CPUs were designed mostly during/after Spectre & Meltdown, so they knew they had to try and cover their bases. It's just that the eye-watering complexity of modern CPUs makes that a non-trivial exercise.

protecting against out-of-order side channels requires an in-order architecture, and few people want to go back to that speed.
This isn't strictly accurate. Some of the problems we've seen in the past were due to things like branch predictor state being shared between SMT threads or not cleared upon context switches. Those fall in the category of cut corners, and I think that's no longer being done.

There's some truth to what you're saying, though. If they were simpler and less ambitious, they'd definitely be easier to get right.
 

Hotrod2go

Proper
Jun 12, 2023
Reckon it won't be long now before performance hits with AM5 start showing up due to some kind of undiscovered malware bug... this circle goes on & on & on...
 

bit_user

Polypheme
Ambassador
Has Phoronix taken over Tom's Hardware? Watching your newsfeed closely, one could come to that conclusion.
It's not only them. I've seen a couple of other news sites report on Phoronix' benchmarks. It's a lot less work than running the tests yourself.

As long as Phoronix is properly credited and linked, I suppose it's okay. I know he's aware of it, and I think only takes issue if he's not properly credited.
 

bit_user

Polypheme
Ambassador
Question.

I own a Ryzen 9 7950X3D chip.

1. Does this affect me?

2. How do I implement the fix? Do I need a BIOS update or what?
  1. As far as I know, all Zen 3 & 4 cores are affected. Definitely double-check that it includes you, but assume it does until you confirm otherwise.
  2. There's an OS-level mitigation as well as a microcode update. You get microcode either from your motherboard BIOS or it gets loaded at startup, by the OS.

In the case of IBPB, there might be no further benefit provided by a microcode update. That's what Phoronix seems to imply, but I'm not 100% sure. I also don't know if IBPB even applies to Zen 4, because he didn't test that mitigation on it.
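On Linux, one way to check both questions is to read what the kernel itself reports. This is a minimal sketch, assuming a kernel recent enough to expose the `spec_rstack_overflow` entry (the kernel's name for Inception) under `/sys/devices/system/cpu/vulnerabilities/`. The function name is hypothetical, and the exact status strings vary by kernel version.

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerability_status(directory=VULN_DIR) -> dict:
    """Map each file in the given directory to its one-line status text.

    Returns an empty dict if the directory does not exist (for example on
    non-Linux systems or very old kernels).
    """
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


if __name__ == "__main__":
    report = read_vulnerability_status()
    # A missing entry usually means the kernel predates the Inception
    # patches, or that this is not a Linux system.
    print(report.get("spec_rstack_overflow", "entry not present on this system"))
```

The status line should say whether the CPU is considered affected and which mitigation is active. The loaded microcode revision is listed separately in the `microcode` field of `/proc/cpuinfo`.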
 