News Oracle Ports Database to Arm-Based Ampere CPUs: Might Ditch Intel and AMD

So, they are now supposed to use Oracle Recovery Manager (RMAN) to back up and transfer databases without altering or harming them in any way.

That being said, x86 still rules the roost. As per figures from research company Omdia, released last year, Arm chips have a 7% share of the data center processor market. However, this is up from less than 1% in 2019.

Oracle's licensing constructs are more enticing for Ampere though, since Ampere CPUs are presumably rated at 0.25 licenses per core, which Oracle already offers for its own SPARC processors. According to data, recent AMD and Intel CPUs are rated as requiring 0.5 of a license for each physical core.
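Just to illustrate the arithmetic (the 0.25 and 0.5 core factors are the figures mentioned above, not an official reading of Oracle's core-factor table, and the core counts are made-up examples):

```python
# Rough sketch of the per-core licensing math described above. The 0.25 (Ampere,
# like SPARC) and 0.5 (recent x86) core factors are the figures quoted in this
# thread, not an authoritative reading of Oracle's core-factor table, and the
# core counts below are made-up examples.

def licenses_required(physical_cores: int, core_factor: float) -> float:
    """Oracle-style processor licenses = physical cores x core factor."""
    return physical_cores * core_factor

if __name__ == "__main__":
    configs = [
        ("128-core Ampere @ 0.25", 128, 0.25),
        (" 64-core EPYC   @ 0.50", 64, 0.50),
        (" 56-core Xeon   @ 0.50", 56, 0.50),
    ]
    for label, cores, factor in configs:
        print(f"{label}: {licenses_required(cores, factor):g} licenses")
```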

Also, because Arm is an open architecture, the hardware for each Arm design is unique and can be heavily customized. But there are challenges in getting major enterprise applications like Oracle Database onto Arm.

Historically, rebuilding an app for ARM has meant recompiling the entire application: an "all-or-nothing" move, since all the binaries within a process need to be rebuilt before any benefit can be seen. FWIW, this has also resulted in some buggy performance.

As you may already know, Arm and x86 are entirely different. In x86, components like graphics cards, storage, and the CPU are independent of each other. ARM processors, on the other hand, do not have a separate CPU; rather, the processing unit is on the same physical substrate as the other hardware controllers, as an integrated circuit (IC).

So based on this, I think ARM can catch up pretty fast.
 
FWIW, according to a statement Microsoft gave last year, Azure VMs running on Ampere chips offered 50% better price-performance than x86-based VMs for scale-out workloads.

So we can expect the use cases to span everything from web servers, application servers, and open-source databases to cloud-native and rich .NET applications, Java apps, gaming servers, and so on.
 
I'm sure they're only talking about their cloud-hosted instances. For customers running Oracle on their own hardware, Oracle would be shooting itself in both feet to drop support for x86-64, any time in the foreseeable future.
 
I'm sure they're only talking about their cloud-hosted instances. For customers running Oracle on their own hardware, Oracle would be shooting itself in both feet to drop support for x86-64, any time in the foreseeable future.
Aren't all Zen arch chips still vulnerable to SQUIP? Maybe Oracle is leaving AMD for security reasons in cloud hosted instances. They could also disable SMT but that wouldn't be good either.
 
My take on this is that, with Intel not innovating much for a long time now, Oracle invested in Ampere because Oracle needed something efficient and it didn't look like Intel was going to deliver. Oracle had no choice. The hyperscalers did a similar thing; they developed ARM-based chips in-house. None of them thought (a few years ago) that AMD would become the solution to x86 stagnation, but it did. Still, Oracle recently released an EPYC-based system, so maybe Oracle was hedging its bets by investing in Ampere too, just in case.
 
Aren't all Zen arch chips still vulnerable to SQUIP? Maybe Oracle is leaving AMD for security reasons in cloud hosted instances. They could also disable SMT but that wouldn't be good either.
SQUIP is a year-old vulnerability that affects Zen 1, 2, and 3, not Zen 4. Being that old, the mitigations are well known.
Almost all of today's SMT architectures suffer from some sort of side-channel vulnerability. It does not depend on the ISA but on the architecture. I suppose that Ampere's have similar problems.
 
My take on this is that, with Intel not innovating much for a long time now, Oracle invested in Ampere because Oracle needed something efficient and it didn't look like Intel was going to deliver. Oracle had no choice. The hyperscalers did a similar thing; they developed ARM-based chips in-house. None of them thought (a few years ago) that AMD would become the solution to x86 stagnation, but it did. Still, Oracle recently released an EPYC-based system, so maybe Oracle was hedging its bets by investing in Ampere too, just in case.

It's more that while the x86 uArch is very good at high single-threaded performance, it gets complicated to expand horizontally. ARM, on the other hand, is kind of the exact opposite, and while they've been getting better, they aren't going to compete with Intel/AMD's branch predictor and loop unroller to enable ridiculous IPS. Yet it's very easy to just keep adding ARM processing units next to each other; it's no more difficult to have 64 units than it is to have 16 units. An RDBMS's workload profile is very amenable to wide processing; they can just keep using new threads for each transaction / request to each shard.

Basically it comes down to something like (just an example) 16 cores at 5 GHz vs. 64 cores at 2.5 GHz.
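A naive way to see why the wide option can still come out ahead on throughput (purely illustrative; it ignores IPC, memory, and scaling overhead entirely):

```python
# Naive back-of-the-envelope comparison of the two hypothetical configs above.
# It deliberately ignores IPC, memory bandwidth, and scaling overhead; the point
# is only that "more, slower cores" can win on aggregate throughput when the
# workload is one-thread-per-transaction, as described for RDBMS shards.

def aggregate_ghz(cores: int, clock_ghz: float) -> float:
    return cores * clock_ghz

fat  = aggregate_ghz(16, 5.0)   # 80.0 GHz of aggregate cycles
wide = aggregate_ghz(64, 2.5)   # 160.0 GHz of aggregate cycles

print(f"16 x 5.0 GHz -> {fat:g} GHz aggregate")
print(f"64 x 2.5 GHz -> {wide:g} GHz aggregate")
# Per-thread latency still favors the 5 GHz parts; that's the trade-off.
```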
 
As you may already know, Arm and x86 are entirely different. In x86, components like graphics cards, storage, and the CPU are independent of each other. ARM processors, on the other hand, do not have a separate CPU; rather, the processing unit is on the same physical substrate as the other hardware controllers, as an integrated circuit (IC).
I don't understand what you're trying to say here, can you rephrase or elaborate?
 
I don't understand what you're trying to say here, can you rephrase or elaborate?
He just thinks all ARM CPUs are SoCs. Given how many actually are, it's somewhat understandable (but obviously wrong).

@Metal Messiah. here are the CPUs they're talking about:

In other words, it's just a standard server CPU. No graphics or other specialized IP blocks.

Bonus: here's Amazon's Graviton 3:

 
It's more that while the x86 uArch is very good at high single-threaded performance, it gets complicated to expand horizontally. ARM, on the other hand, is kind of the exact opposite, and while they've been getting better, they aren't going to compete with Intel/AMD's branch predictor and loop unroller to enable ridiculous IPS.
They're making progress, though. The newly-announced X4 has 10-way dispatch and a 384-entry reorder buffer.

That's still about half the size of Apple's RoB (600+), and a fair bit smaller than Golden Cove's (512).
 
They're making progress, though. The newly-announced X4 has 10-way dispatch and a 384-entry reorder buffer.

That's still about half the size of Apple's RoB (600+), and a fair bit smaller than Golden Cove's (512).

Yes they are making progress, just remember Intel and AMD have massive amounts of experience here and these ST optimizations are what really determine performance. Like it's stupid simple to do transistors that do binary math, much more complex to interpret and predict binary code.
 
Yes they are making progress, just remember Intel and AMD have massive amounts of experience here and these ST optimizations are what really determine performance. Like it's stupid simple to do transistors that do binary math, much more complex to interpret and predict binary code.
I'm not sure it's a matter of experience, so much as that the optimization target for these X-series cores is different. They're mobile-first cores, so I think they keep more of an eye on energy-efficiency*.

To make the server cores, ARM traditionally has either started with an A7x core and built it into an N-series core, or taken an X-series and built it into a V-series.

That focus on efficiency is probably one of the reasons so many cloud operators are turning to ARM. As I'm sure I've mentioned before, Amazon's Graviton 3 is a 64x V1 core CPU that burns a meager 100 W @ 2.5 GHz. Including 8-channel DDR5, PCIe 5.0, and 2x 256-bit SVE vector units. Such examples are why Intel and AMD had to respond with Sierra Forest and Bergamo.

* Apple's cores are also mobile-first, but Apple's vertical integration and the high selling price of its products means they can afford to make larger cores.
 
I'm not sure it's a matter of experience, so much as that the optimization target for these X-series cores is different. They're mobile-first cores, so I think they keep more of an eye on energy-efficiency*.

Yes it's experience, or rather Intellectual Property / Trade Secrets developed over decades of Research and Development. Neither Intel nor AMD actually makes x86 CPUs anymore; instead, they each make a CPU with its own proprietary language that has an abstraction layer built on top of it that emulates x86. That is how they are able to use RISC design philosophy internally while still executing x86 binary code. Those decoders, schedulers and predictors act together to tear apart x86 operations and turn them into several smaller RISC-like micro-ops that can be dispatched and executed on several resources simultaneously. Using this IP they can then execute many instructions before they've happened, essentially predicting the future with high accuracy.

That .. is not easy, it's something no one else can do very well. Instead, everyone else is relying on the simplicity of RISC Load / Store architectures to make all that scheduling and prediction easier. While it's nowhere near as fast / powerful, it also takes up only a fraction of the silicon space and power, leaving more room for expanding horizontally. When you look at a die shot for anything Intel / AMD make, the actual execution units (ALU/AGU/MMU/FPU/etc.) take up a minute space; it's all the supporting components that dominate the die.
 
Yes it's experience, or rather Intellectual Property / Trade Secrets developed over decades of Research and Development.
ARM isn't a noob at the CPU game, either.

Neither Intel nor AMD actually makes x86 CPUs anymore; instead, they each make a CPU with its own proprietary language that has an abstraction layer built on top of it that emulates x86.
Calling it "emulation" is going way too far. Their CPUs are carefully designed to efficiently implement x86. I guarantee you that the x86 ISA has influences on design decisions made throughout the entire backend, including things like int/fp register partitioning and memory-ordering.

For Apple to efficiently perform x86 "emulation" (it's actually JIT-translation) atop AArch64, they actually had to take a couple liberties with the ARM ISA.


That's a good example of the sort of "impedance mismatch" that happens when executing one ISA atop an independent one.

In strict terms, that put Apple in violation of the ARM ISA, which I'm sure runs in contravention of Apple's architectural license. However, ARM is in no position to antagonize its marquee customer.


Those decoders, schedulers and predictors act together to tear apart x86 operations and turn them into several smaller RISC-like micro-ops
Source?

Looking at Golden Cove (and every other x86 core I recall seeing block diagrams for), the decoder is cleanly separated from scheduling.

Here's Zen 4:

Using this IP they can then execute many instructions before they've happened, essentially predicting the future with high accuracy.

That .. is not easy, it's something no one else can do very well.
Apple sets the gold standard, here. That's one of the ways they beat everyone else at IPC.

When you look at a die shot for anything Intel / AMD make, the actual execution units (ALU/AGU/MMU/FPU/etc.) take up a minute space; it's all the supporting components that dominate the die.
Got any recent comparison? I'm not having any luck finding annotated die shots of recent Apple or ARM cores, themselves (i.e. not entire SoCs). The only annotated core photos I'm able to find are Intel and AMD.

Anyway, as I mentioned, the Firestorm cores in Apple's M1 have a RoB that's even larger than Golden Cove and Zen. So, that shows x86 aren't the only ones who've gone big in this area.
 
SQUIP is a year-old vulnerability that affects Zen 1, 2, and 3, not Zen 4. Being that old, the mitigations are well known.
Almost all of today's SMT architectures suffer from some sort of side-channel vulnerability. It does not depend on the ISA but on the architecture. I suppose that Ampere's have similar problems.
No one has ever said SQUIP does not affect Zen 4 and no one has ever said that AMD has changed their architecture from a single scheduler per core to a single scheduler for all cores. This would be a huge change that would have significant effects. Zen 4 is clearly still vulnerable to SQUIP and Zen 5 will in all likelihood be as well. And being a well-known vulnerability is a bad thing, not a good one, especially if not every single software package used on a system has mitigations. Remember Flash? That was just one vulnerable program; I'm sure many more haven't been rewritten to avoid SQUIP, as the last media mention was AMD telling its customers to fix this problem themselves. Here is some more information:
 
No one has ever said SQUIP does not affect Zen 4 and no one has ever said that AMD has changed their architecture from a single scheduler per core to a single scheduler for all cores.
I think you've got it backwards. The link you posted said the issue is that they have multiple schedulers per core.

This would be a huge change that would have significant effects. Zen 4 is clearly still vulnerable to SQUIP and Zen 5 will in all likelihood be as well.
According to that link, the vulnerability depends on 4 things:
  • The CPU's execution units must be connected to multiple schedulers
  • those execution units have different capabilities
  • the co-located processors (processes?) compete for free slots in the scheduler queues
  • the flow control of the RSA implementation is secret-dependent

I assume the point about RSA was relevant only because:

"This was demonstrated by extracting an intact RSA-4096 key from a co-located VM."

So, to generalize, you'd say that whatever thread they're trying to spy on should have flow-control dependent on the data you want to extract. I'd further guess it should be relatively simple flow-control (i.e. I doubt it generalizes to more rich and complex data).

They also say that AMD's Secure Encrypted Virtualization (SEV) won't help, presumably because the attack is happening in the CPU core, where threads are dealing in unencrypted data.

The key point is this one:

"other processors with multiple schedulers like Apple’s M1 and M2 weren’t found to be vulnerable to the SQUIP attack because they lacked SMT."

So, the workaround is really very simple. Either:
  1. Disable SMT.
  2. Use a hypervisor configured not to pair different VMs on the same core and use an OS kernel configured not to pair threads from different processes on a core.

Either of those will solve it and all other SMT-related side-channel attacks, conclusively.
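For option 1 on Linux there's already a runtime switch in the kernel; here's a minimal sketch, assuming a kernel new enough to expose /sys/devices/system/cpu/smt (and root):

```python
# Minimal sketch of option 1 on Linux: flipping SMT off at runtime through the
# kernel's sysfs control file (present on mainline kernels since ~4.19, needs
# root). This only illustrates the knob; it is not a complete hardening recipe.

SMT_CONTROL = "/sys/devices/system/cpu/smt/control"

def smt_state() -> str:
    """Returns 'on', 'off', 'forceoff', 'notsupported', or 'notimplemented'."""
    with open(SMT_CONTROL) as f:
        return f.read().strip()

def disable_smt() -> None:
    """Ask the kernel to offline all SMT sibling threads."""
    with open(SMT_CONTROL, "w") as f:
        f.write("off")

if __name__ == "__main__":
    print("SMT is currently:", smt_state())
    # disable_smt()  # uncomment to actually turn it off (requires root)
```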
 
I think you've got it backwards. The link you posted said the issue is that they have multiple schedulers per core.


According to that link, the vulnerability depends on 4 things:
  • The CPU's execution units must be connected to multiple schedulers
  • those execution units have different capabilities
  • the co-located processors (processes?) compete for free slots in the scheduler queues
  • the flow control of the RSA implementation is secret-dependent

I assume the point about RSA was relevant only because:
"This was demonstrated by extracting an intact RSA-4096 key from a co-located VM."​

So, to generalize, you'd say that whatever thread they're trying to spy on should have flow-control dependent on the data you want to extract. I'd further guess it should be relatively simple flow-control (i.e. I doubt it generalizes to more rich and complex data).

They also say that AMD's Secure Encrypted Virtualization (SEV) won't help, presumably because the attack is happening in the CPU core, where threads are dealing in unencrypted data.

The key point is this one:
"other processors with multiple schedulers like Apple’s M1 and M2 weren’t found to be vulnerable to the SQUIP attack because they lacked SMT."​

So, the workaround is really very simple. Either:
  1. Disable SMT.
  2. Use a hypervisor configured not to pair different VMs on the same core and use an OS kernel configured not to pair threads from different processes on a core.

Either of those will solve it and all other SMT-related side-channel attacks, conclusively.
That is all true.

But with your observed workarounds: with the first, it is unlikely that it would still be worth buying AMD if you had to disable SMT; the second part of the second probably can't be done on Windows-based instances and is unknown to most users if it is done on a Linux-based VM. If you are running an online store, how would you find out?

The vulnerability should be dealt with, can be dealt with, but will it? Anybody who buys Zen 4 and later can just claim that they weren't notified of the vulnerability and pass potential legal liability back to AMD. A company just out for profits doesn't stand to gain from implementing these mitigations.

Meanwhile those of us who shared credit card info with dozens of sites, have bank accounts linked to maybe a dozen more for autopay and have their gov using services like AWS to store their biometric data and who knows how much else: https://nypost.com/2020/05/07/homeland-security-to-move-biometric-database-to-amazon-cloud/ have to shoulder extra risks produced by this irresponsible company.

(As a side note, have you tried e-checks? I have for metals and stocks and it is creepy how it is easier to direct transfer money right out of your checking account than it is to use a credit card. But for big purchases it is instant and fee free. But it feels like malware could drain an account in an instant.)

I'm not worried if somebody is having fun on their own computer with an AMD chip. If I had one for any but my office PC I wouldn't disable SMT either because individually I am a small target. But I have critical data shared with big targets and I stand to lose from having them use an insecure infrastructure.

Edit: All true except the multiple schedulers per core. They said multiple schedulers per CPU for AMD and Apple and a single for Intel.
 
it is unlikely that it would still be worth buying AMD if you had to disable SMT,
The latest data I can find is for Milan and Ice Lake SP. In those two cases, the former gains only 18.3% from SMT and the latter gains only 8.1% (or 9.2% in single-socket config) on SPECint2017:

Those are merely averages, however. It'll help more on some workloads and less on others. Since they didn't provide the subscores for those configs, this is all we've got. However, I think disabling SMT isn't enough to destroy the value proposition of these server CPUs.

the second part of the second probably can't be done on Windows-based instances and is unknown to most users if it is done on a Linux-based VM.
Given how dominant Linux is in the cloud, I think it's probably not a deal-breaker if Windows Server doesn't support it? But, it seems easy enough that maybe they do?

As for Linux, Google was the one to contribute the patch for limiting core-sharing to threads in the same process. So, I'd assume they enable it in their cloud-hosted instances. Who knows about the others, but I think it's something most admins of high-value services are likely to know about and enable.
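For anyone curious what that looks like from userspace, here's a rough sketch of tagging a process with its own core-scheduling cookie via prctl(). It assumes a 5.14+ kernel built with CONFIG_SCHED_CORE; the constants are from <linux/prctl.h>, and the ctypes wrapper is only for illustration:

```python
# Rough sketch of the Linux "core scheduling" feature: give a process its own
# cookie so the scheduler never co-schedules it on SMT siblings with
# differently-tagged tasks. Needs a 5.14+ kernel built with CONFIG_SCHED_CORE.
# Constants come from <linux/prctl.h>; the ctypes wrapper is just illustrative.
import ctypes
import os

PR_SCHED_CORE        = 62
PR_SCHED_CORE_CREATE = 1   # create a new cookie for the given task
PIDTYPE_TGID         = 1   # scope: the whole thread group (process)

libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.prctl.argtypes = [ctypes.c_int, ctypes.c_ulong, ctypes.c_ulong,
                       ctypes.c_ulong, ctypes.c_ulong]
libc.prctl.restype = ctypes.c_int

def isolate_this_process() -> None:
    """Tag the calling process (pid 0 = self) with a fresh core-sched cookie."""
    ret = libc.prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0, PIDTYPE_TGID, 0)
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

if __name__ == "__main__":
    isolate_this_process()
    print("This process will no longer share a core's SMT siblings with others.")
```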

A company just out for profits doesn't stand to gain from implementing these mitigations.
You assume mitigations are possible. Given what a low-level chip function instruction-scheduling is, I'd be surprised if there even is a microcode-level fix.

I think the mitigations are the ones I listed above, because so many side-channel attacks we know about depend on SMT. So, avoiding core-sharing by one means or another not only protects you from the vulnerabilities we know about, but also future ones yet to be discovered. In this day and age, it's just the responsible thing to do.

It seems to me another thing you could do is simply rent cloud instances that occupy an entire machine, leaving no room for other tenants. I don't know too much about cloud computing, but I'd be surprised if that wasn't an option available to customers with high-value data.

Meanwhile those of us who shared credit card info with dozens of sites, have bank accounts linked to maybe a dozen more for autopay and have their gov using services like AWS to store their biometric data and who knows how much else: https://nypost.com/2020/05/07/homeland-security-to-move-biometric-database-to-amazon-cloud/ have to shoulder extra risks produced by this irresponsible company.
I think you're being a little hyperbolic, here. To successfully exfiltrate data via this attack, you need to know what other tenants are on your host and what software they're running. These kinds of timing-based side-channel attacks all require models of the specific software being targeted. On a big, public cloud platform, you have no idea or control over what tenants you share a host with. It's very much an exploit for hostile governments and not garden-variety cyber-thieves.

And, for this particular vulnerability, the CVE score is only 5.6 (Medium), with an impact of 4.0 and exploitability of just 1.1:

If you're really concerned about your personal data getting stolen, you should disable SMT on any machine from which you do any online banking or from which you perform other sensitive operations.

(As a side note, have you tried e-checks? I have for metals and stocks and it is creepy how it is easier to direct transfer money right out of your checking account than it is to use a credit card. But for big purchases it is instant and fee free. But it feels like malware could drain an account in an instant.)
Yes, it's a clear example of an archaic electronic finance network. Hopefully, it gets phased out before long. Unfortunately, it seems most of the likely substitutes are proprietary services rather than an open standard.

Edit: All true except the multiple schedulers per core. They said multiple schedulers per CPU for AMD and Apple and a single for Intel.
IMO, the only sensible interpretation is that they meant multiple schedulers per core.
 
The latest data I can find is for Milan and Ice Lake SP. In those two cases, the former gains only 18.3% from SMT and the latter gains only 8.1% (or 9.2% in single-socket config) on SPECint2017:

Those are merely averages, however. It'll help more on some workloads and less on others. Since they didn't provide the subscores for those configs, this is all we've got. However, I think disabling SMT isn't enough to destroy the value proposition of these server CPUs.
The SQUIP vulnerability seems like it is mostly a threat for shared public servers running VMs. How many VMs do you lose by disabling SMT?
Given how dominant Linux is in the cloud, I think it's probably not a deal-breaker if Windows Server doesn't support it? But, it seems easy enough that maybe they do?

As for Linux, Google was the one to contribute the patch for limiting core-sharing to threads in the same process. So, I'd assume they enable it in their cloud-hosted instances. Who knows about the others, but I think it's something most admins of high-value services are likely to know about and enable.
Hopefully Google is implementing it, but that leaves the other 8/9 of the market. I wonder if Oracle will implement it when they get in?
You assume mitigations are possible. Given what a low-level chip function instruction-scheduling is, I'd be surprised if there even is a microcode-level fix.

I think the mitigations are the ones I listed above, because so many side-channel attacks we know about depend on SMT. So, avoiding core-sharing by one means or another not only protects you from the vulnerabilities we know about, but also future ones yet to be discovered. In this day and age, it's just the responsible thing to do.
I just meant the mitigations you listed, not microcode ones. Once again the responsible thing isn't always done. Case in point: AMD on this matter.
It seems to me another thing you could do is simply rent cloud instances that occupy an entire machine, leaving no room for other tenants. I don't know too much about cloud computing, but I'd be surprised if that wasn't an option available to customers with high-value data.


I think you're being a little hyperbolic, here. To successfully exfiltrate data via this attack, you need to know what other tenants are on your host and what software they're running. These kinds of timing-based side-channel attacks all require models of the specific software being targeted. On a big, public cloud platform, you have no idea or control over what tenants you share a host with. It's very much an exploit for hostile governments and not garden-variety cyber-thieves.
There are hostile governments out there right now, and the Department of Homeland Security and our large US banks are likely targets. Those targets are also the biggest ones for better-than-garden-variety thieves.
And, for this particular vulnerability, the CVE score is only 5.6 (Medium), with an impact of 4.0 and exploitability of just 1.1:

If you're really concerned about your personal data getting stolen, you should disable SMT on any machine from which you do any online banking or from which you perform other sensitive operations.
I do my sensitive operations and online banking exclusively on my office PC.
 
The SQUIP vulnerability seems like it is mostly a threat for shared public servers running VMs. How many VMs do you lose by disabling SMT?
Well, I told you the rough performance impact on integer code. The number of VMs probably depends on how compute-intensive they are. If they're all compute-bound, then the upper limit would seem to correspond directly to that performance reduction.
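To put rough numbers on that upper limit, using the SMT uplift figures I quoted earlier and assuming fully compute-bound VMs (the worst case):

```python
# Putting rough numbers on that upper limit, using the SMT uplift figures quoted
# earlier (18.3% for Milan, 8.1% for Ice Lake SP on SPECint2017). Assumes fully
# compute-bound VMs, i.e. the worst case described above.

def capacity_without_smt(smt_uplift: float) -> float:
    """Fraction of compute-bound capacity left once SMT is disabled."""
    return 1.0 / (1.0 + smt_uplift)

for name, uplift in [("Milan", 0.183), ("Ice Lake SP", 0.081)]:
    left = capacity_without_smt(uplift)
    print(f"{name}: ~{left:.1%} of capacity remains (~{1 - left:.1%} fewer VMs)")
```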

Hopefully Google is implementing it, but that leaves the other 8/9 of the market. I wonder if Oracle will implement it when they get in?
The link I provided names the feature "core-scheduling", so you can research it further, if you care that much. It's not really my job to know this stuff.
 