News AMD Set to Substantially Increase Microcode Size of Future CPUs

bit_user

Titan
Ambassador
Plenty of room for future malware to be installed?
If a program or hacker managed to update the microcode in your CPU, it's game over. Even with the current size limits.

Microcode loading normally happens rather early in the boot sequence and would require root/admin privileges, at minimum. I could believe you can't even do it past a certain point.
 

rluker5

Distinguished
Jun 23, 2014
911
594
19,760
How exactly?
Maybe if somebody found a way to alter the C:\Windows\System32\mcupdate_AuthenticAMD.dll file that rewrites the microcode your system is using at Windows boot?

I've removed the Intel equivalent to get around microcode security updates in the past by changing the owner of the file and moving it. But it has been around since a little after Spectre at least and has been secure all that time AFAIK.

Edit: One way you can check whether Windows is changing your microcode is to compare the revision your BIOS reports with the one HWiNFO64 reports. If they differ, Windows is changing it. When I moved said file, HWiNFO64 agreed with my BIOS and I got my pre-mitigation performance back, at the expense of security ofc. If I wanted the security back, I just moved the file back to its original place. This might be handy for bugs? I personally stopped moving the file when I moved on from my old 5775C, so it has been a while.
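
For anyone wanting to do the same comparison on Linux rather than Windows, here is a minimal sketch. It assumes an x86 Linux system, where the kernel lists the currently loaded revision in the `microcode` field of `/proc/cpuinfo`. The function names and the sample text are made up for illustration, and the BIOS revision still has to be read off the firmware setup screen by hand.

```python
import re

def loaded_microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo-style text."""
    # On x86 Linux each logical CPU has a line such as "microcode\t: 0xf0".
    matches = re.findall(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)\s*$",
                         cpuinfo_text, re.MULTILINE)
    return {int(m, 16) for m in matches}

def os_changed_microcode(cpuinfo_text, bios_revision):
    """True if any core runs a revision other than the one the BIOS reported."""
    return any(rev != bios_revision
               for rev in loaded_microcode_revisions(cpuinfo_text))

# Made-up sample standing in for open("/proc/cpuinfo").read()
SAMPLE = "processor\t: 0\nmicrocode\t: 0x22\nprocessor\t: 1\nmicrocode\t: 0x22\n"
```

If `os_changed_microcode(SAMPLE, 0x17)` is true, the OS (or an early-boot loader) has applied something newer or older than what the firmware shipped with.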
 

bit_user

Maybe if somebody found a way to alter the C:\Windows\System32\mcupdate_AuthenticAMD.dll file that rewrites the microcode your system is using at Windows boot?
Yeah, but if a hacker gets root, you're hosed.

AFAIK, microcode updates aren't persisted in the CPU across boots, so it's not a place where someone could hide a root kit. Therefore, it's no more special than tons of other exploits someone could do with admin privileges.
 

rluker5

You are right that the updates aren't persistent across reboots and don't stick with the CPU in any way beyond the current Windows session.

There may be a way to mess with encryption, or disable it, by altering the microcode, but that is just my conjecture. I don't know anything about how microcode alterations could affect security processes.

You definitely need admin privileges to change the owner from TrustedInstaller, change the permissions and move the file, and there are probably a lot of easier ways to snoop or harvest as an admin.

I mostly brought it up as a possibility, not a viable option.
That, and I did have improved performance with an older microcode on the 5775C vs. the ones Windows inserted. I also tested a variety with a 4770K and saw no noticeable improvement with any of them. Perhaps, if sometime in the future a microcode update came out with a performance regression for some, then reverting the microcode to the BIOS version would be useful again. It might even be useful for those with early Alder Lake chips and older BIOSes to keep their AVX-512 going if an update snatches that away.
 
  • Like
Reactions: bit_user
The thing that would strike me as really bad is if the OS blindly trusts any and all microcode updates. That is, if there's no digital signature on the microcode update itself that either the OS or CPU could check, then I'm going to have to facepalm really hard.
 

bit_user

I assume the loader does an integrity check on the payload, but that's probably more to protect against unintended corruption.

They could encrypt it with a key that only the CPU manufacturer knows, and then have the CPU decrypt the microcode as it's loaded, but then if the manufacturer ever went out of business (or just flat out refused to issue microcode fixes for old CPUs, as I think Intel did with some of the side-channel vulnerabilities), that would prevent anyone else from offering their own microcode fixes.
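
To illustrate why a plain integrity check only guards against accidental corruption, here is a minimal sketch. The 4-byte CRC-32 header is a made-up container format, not the layout of any real microcode file, and `wrap` / `integrity_ok` are hypothetical names.

```python
import zlib

def wrap(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 (hypothetical 4-byte header)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def integrity_ok(blob: bytes) -> bool:
    """Recompute the CRC-32 and compare it with the stored header."""
    stored, payload = int.from_bytes(blob[:4], "big"), blob[4:]
    return zlib.crc32(payload) == stored

good = wrap(b"original patch bytes")

# Accidental corruption: one payload bit flips, so the stored checksum no longer matches.
corrupted = good[:-1] + bytes([good[-1] ^ 0x01])

# Deliberate tampering: the attacker simply recomputes the checksum.
forged = wrap(b"malicious patch bytes")
```

Here `integrity_ok(corrupted)` fails, but `integrity_ok(forged)` passes, because anyone can recompute a checksum. Catching the forged case takes a secret the attacker doesn't have.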
 

bit_user

that zen4c look interesting... when ?
Yeah, the embargo lifted on AMD's Bergamo (128-core server CPU), Wednesday morning. Other sites already have benchmarks up.

Not only that, but Genoa + 3D V-Cache benchmarks are also out. It's interesting to see which workloads perform better with fewer cores + more L3 cache or more cores with less cache. Definitely some surprises, for me.
 

TJ Hooker

Titan
Ambassador
I assume the loader does an integrity check on the payload, but that's probably more to protect against unintended corruption.

They could encrypt it with a key that only the CPU manufacturer knows, and then have the CPU decrypt the microcode as it's loaded, but then if the manufacturer ever went out of business (or just flat out refused to issue microcode fixes for old CPUs, as I think Intel did with some of the side-channel vulnerabilities), that would prevent anyone else from offering their own microcode fixes.
I don't think you'd need to encrypt it; it just needs to be signed. Then the motherboard firmware or OS (depending on where the microcode is being loaded from) can verify the signature. Could be signed by the CPU maker, mobo maker, OS maker, or some combination thereof.
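
A toy sketch of the sign-then-verify idea, to show that the payload can stay readable while only the key holder can produce a valid signature. This is textbook RSA with no padding and a deliberately tiny key built from two known Mersenne primes, so it is an illustration only and not how any vendor actually signs microcode. All names are hypothetical, and `pow(E, -1, m)` needs Python 3.8 or newer.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, held by the signer only

def digest(payload: bytes) -> int:
    """SHA-256 of the payload, reduced to fit under the toy modulus."""
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N

def sign(payload: bytes) -> int:
    """Signer side: needs the private exponent D."""
    return pow(digest(payload), D, N)

def verify(payload: bytes, signature: int) -> bool:
    """Loader side: needs only the public values N and E."""
    return pow(signature, E, N) == digest(payload)

patch = b"readable, unencrypted patch bytes"
sig = sign(patch)
```

The verifier (firmware, OS or the CPU itself) never sees `D`, so it can check `verify(patch, sig)` without being able to forge a signature for a modified patch.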
 

rluker5

I used to use UEFI BIOS Updater (UBU) to swap out microcodes on my fancier motherboards. https://www.majorgeeks.com/files/details/uefi_bios_updater.html Cheap motherboards didn't have BIOS flashback buttons and, unlike EVGA boards that took anything, would only accept unmodified, signed BIOSes. But a lot of boards would take altered ones. To me, UBU acted like it had a whitelist for the microcodes, but others were able to get the newest, non-matching ones in there somehow.

That method of switching them has ceased due to lack of interest, but it kind of shows the security isn't absolute. All sorts of amateur novices were messing around for giggles, as shown here: https://www.overclock.net/threads/i...ocode-through-software.1643053/#post-26474193

Somebody who really knew what they were doing and had admin privileges may be able to access the current one used and swap it out for an altered version. One plausible scenario: if some site like TechPowerUp, MajorGeeks or Guru3D hosted mcupdate files with a microcode that gave better performance if you swapped them in, some people would. And if some knock-off download site popped up in the search results, it could offer malware-modified microcodes for people to put in themselves. Or the bad actors could just bundle an installer with a different one, like how Google Chrome always used to get on people's PCs.

I don't think microcode hacking is impossible, it just might not be feasible.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
1,452
996
20,060
Yeah, the embargo lifted on AMD's Bergamo (128-core server CPU), Wednesday morning. Other sites already have benchmarks up.

Not only that, but Genoa + 3D V-Cache benchmarks are also out. It's interesting to see which workloads perform better with fewer cores + more L3 cache or more cores with less cache. Definitely some surprises, for me.

Wendell from Level1Techs brings up a good point: what if AMD decided to swap one of those CCDs on normal desktop Ryzen for a C-cores CCD, to counter Intel's "E-cores" strategy?

Then most of the power budget can go to the "Non-C cores CCD" to boost frequency.

It would offer an interesting combo.

For Dual CCD Asymmetric Pairs you can create interesting combos:
1) X3D (Cache Optimized) + Regular CCD (Frequency Optimized) <- Optimized for Gaming & Dev Work
2) X3D (Cache Optimized) + C-cores CCD (Parallel Work Optimized) <- Optimized for Heavy Multi-Threading or Multi-Threading that takes up a lot of Cache
3) Regular CCD (Frequency Optimized) + C-cores CCD (Parallel Work Optimized) <- Optimized for Heavy Parallel Work Loads or Parallel Work Loads that can benefit from High Frequency but limited work time.

Or you could create a 4x CCD Ryzen-based platform (let's call it Ryzen FX) that's targeted at the Prosumer/Light WorkStation market.

That way you can create a unique Asymmetric setup for Developers who need to test different CCD configs:
1) Regular CCD (Frequency Optimized)
2) X3D CCD (Cache Optimized)
3) C-cores CCD (Parallel Work Optimized)
4a) L4$ SRAM CCD (Just dump in a large CCD that is pure SRAM and designed as a victim cache for L3$); this would help feed the other 3x CCDs.
4b) Put in a large FPGA CCD so developers can test if their code-path would work better with an FPGA
 

bit_user


Probably unrelated to the article, but one configuration you forgot to enumerate was an iGPU tile. I was pretty sure AMD would go down that route, especially when they "solved" the bandwidth problem via Infinity Cache.

Anyway, AMD's VP of Client Computing says AMD isn't planning on doing hybrid-core CPUs for the desktop. I guess he doesn't consider the half X3D CPUs hybrids.

 

Kamen Rider Blade

Probably unrelated to the article, but one configuration you forgot to enumerate was an iGPU tile. I was pretty sure AMD would go down that route, especially when they "solved" the bandwidth problem via Infinity Cache.
So far, AMD wants to leave the iGPU in the cIOD; they don't want to put it into the CCD.
They probably have their reasons for doing so, but that's the route they seem to be going.

Anyway, AMD's VP of Client Computing says AMD isn't planning on doing hybrid-core CPUs for the desktop. I guess he doesn't consider the half X3D CPUs hybrids.
If you don't consider X3D CPUs a "Hybrid", then Zen #C cores aren't a "Hybrid" either; it's just a different L3$ configuration.

It's not a truly different architecture like what Intel is doing with P/E cores.
 

bit_user

Well, although I believe the throughput and latency of instructions is the same number of clock cycles between Zen 4 and 4C, I'm pretty sure the density optimizations they did in the 4C mean its peak clock speed is even lower than that of the CCDs with the 3D V-Cache.

So, in that sense, there's a slightly stronger case to call them "hybrid", because they neither clock as high nor have extra L3 cache. In every way (other than efficiency), they're worse than the regular Zen 4 cores. They truly are "efficiency" cores, in that the main reason they'd be mixed with regular Zen 4 cores would be to improve efficiency.

Even so, the performance difference is clearly less than what we see between Intel's P and E cores, or else 128-core Bergamo wouldn't be wiping the floor with 96-core Genoa on so many benchmarks.
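
A back-of-the-envelope way to put a number on that, assuming throughput scales linearly with core count (which ignores clocks, memory bandwidth and power limits): for a 128-core part merely to tie a 96-core part, each of its cores needs 96/128 of the other's per-core throughput. The function name is made up for illustration.

```python
from fractions import Fraction

def break_even_per_core_ratio(fewer_cores: int, more_cores: int) -> Fraction:
    """Per-core throughput the higher-core-count chip needs, relative to the
    other chip's cores, for both chips to tie (linear scaling assumed)."""
    return Fraction(fewer_cores, more_cores)

ratio = break_even_per_core_ratio(96, 128)
print(f"{float(ratio):.0%}")  # prints 75%
```

So under that simplification, any benchmark where Bergamo wins implies each 4C core is delivering more than about three quarters of a regular Zen 4 core's throughput in that workload.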
 

Kamen Rider Blade

But Intel defines their "Hybrid-ness" by having actually different architectures with different Instruction set support. Whether that is Good/Bad is a question that is out of the scope of this thread we're on.

But the Zen # cores and Zen #C cores are so similar to each other that they're something closer than what a "Hybrid" would be.

What word would you use besides "Hybrid"?

I'd argue that they're "Heterogeneous" but not a true "Hybrid" like Intel's setup.

I know: tomato, to-mah-to!
 

bit_user

What is the architecture that runs this microcode? x86? arm? MIPS? :LOL:
No, it's not published. The micro-ops in modern Intel CPUs are said to be RISC-like, but I've seen people claim that some of them are far too complex to qualify as true RISC. They certainly wouldn't use a licensed ISA, partly due to IP reasons, but also because the mapping from x86 would be inefficient in some respects. Plus, we know that high-performance ARM cores use micro-ops, as well. So, even that "RISC" ISA is apparently non-optimal for a CPU to natively execute.

Here's a presentation from 2017 on attempts to reverse-engineer it:

Just skimming through the slides, I see that the microcode files weren't signed, at least at that time @TJ Hooker .

Here's a USENIX paper by mostly the same authors:

I don't know how much more is known about Intel or AMD microcode, but if you're interested in the subject more generally, the Wikipedia page seems like a good place to start:
 

bit_user

Microcode is micro-architecture specific. It can't even equate to an ISA because the software doesn't know it exists. So microcode from say an i7-13700K won't work for an i7-12700K or vice versa. Or if it does, horrible things will likely happen.
That probably wasn't a great example, given how little changed between those generations. Plus, you have to be more specific, now that those CPUs contain different core types.

I'd be much more confident asserting that Golden Cove microcode is completely incompatible with Skylake's. Or probably Golden Cove vs. Gracemont.
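
One concrete way the "microcode is specific to the part" idea shows up: a loader picks the update by the CPU's family/model/stepping signature. Below is a minimal sketch that decodes a CPUID leaf 1 EAX value and builds the family-model-stepping name that Linux conventionally uses for Intel microcode files. The function names are made up, and the two signature values in the usage note are illustrative examples that were not checked against those exact retail parts.

```python
def decode_signature(eax: int):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # The extended model bits only count for base family 6 or 15.
    if family in (0x6, 0xF):
        model |= ext_model << 4
    # The extended family bits only count for base family 15.
    if family == 0xF:
        family += ext_family
    return family, model, stepping

def intel_ucode_filename(eax: int) -> str:
    """Hex family-model-stepping name, e.g. '06-9e-0a'."""
    return "%02x-%02x-%02x" % decode_signature(eax)
```

For example, `intel_ucode_filename(0x90672)` gives `06-97-02` and `intel_ucode_filename(0xB0671)` gives `06-b7-01`. Two different signatures map to two different files, so a loader following this convention would never offer one chip the other's update in the first place.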