News Intel and AMD forge x86 ecosystem advisory group that aims to ensure a unified ISA moving forward


waltc3

Honorable
Aug 4, 2019
Intel and AMD cross-licensing agreements go way back, so nothing new here. Indeed, Intel licensed x86-64 from AMD as Itanium faltered, before Intel had even gotten completely out of its original RDRAM investment. Years ago, Sanders at AMD said, IIRC, that they had a license for the Pentium bus but didn't want to use it. It's good for both companies, as they can elect to share things without constantly suing each other over patents. Glad to see it...;) Both companies are miles and miles away from the original x86 of course, but it's still a good descriptor for general software/hardware compatibility, imo.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
No. He [Keller] first went to ARM, and then decided RISC-V was better for the flexibility it gave him.
It's hard to beat "Free" & "Open Standard ISA" & "Royalty Free Open-Source Licensing".

ARM has licensing costs, Royalties, & restrictions.

AMD & Intel are busy with their own AI stuff.

So the obvious choice was RISC-V.

And he has his own CPU Engineers as well, so he knows what it takes to make a "High Performance" CPU.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
At this point, I'm sure they wouldn't even bother if they could.
Let's assume that "they could", for the right price $$.

We're talking billions of dollars in one-time licensing fees to both AMD & Intel to share patents.

Given that ARM wants to play a lot of licensing shenanigans with Qualcomm, having a "one-time fee" license and ZERO royalties to anybody after the fact to join "Team x86" starts looking attractive.

Remember, x86 USED to have many licensees back in the day.

I wouldn't be surprised if converting nVIDIA & Qualcomm to join Team x86 would be a better move.
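The one-time-fee vs. per-chip-royalty trade-off being argued here can be sketched with a toy break-even calculation. Every number below is invented for illustration; neither ARM's real royalty rates nor any hypothetical x86 consortium's terms are public:

```python
# Toy break-even: a flat "join Team x86" fee vs. ongoing per-chip ARM
# royalties. All figures are hypothetical placeholders.
one_time_fee = 2e9        # assume a $2B flat fee, in the spirit of "billions"
royalty_per_chip = 2.50   # assumed per-chip ARM royalty, in dollars
chips_per_year = 200e6    # assumed annual shipment volume

breakeven_years = one_time_fee / (royalty_per_chip * chips_per_year)
print(f"Flat fee pays for itself after ~{breakeven_years:.0f} years")
```

At these made-up rates the flat fee breaks even in about four years; the real answer obviously swings wildly with volumes and rates.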
 

NinoPino

Respectable
May 26, 2022
I wonder: if Intel could license x86 to many companies, would that destroy ARM?
The existence of both Intel (more) and AMD (less) has until now been assured by the duopoly. If they start to license x86 to others, they create a risk for their own near future. The only advantage of open x86 licensing is for the sake of the x86 ISA itself, not for the companies that use it, so I doubt it will happen.
Anyway, to kill ARM a technically better ISA is needed, and for sure that cannot be x86. Maybe RISC-V, or the new "x86" that will be born from this consortium, but in the latter case we would lose compatibility with current software.
 

bit_user

Titan
Ambassador
AVX10.2 and AMX would be huge giveaways.
I'm not convinced AMD even wants AMX. They seem to be on a different path, by mixing CPU and GPU dies in MI300A. For client workloads, they have XDNA.

So, what big problem is left for AMX to solve? It's too low-precision to be very useful for much beyond AI, and adding higher-precision support would be very costly in die area.
 

bit_user

Titan
Ambassador
It's hard to beat "Free" & "Open Standard ISA" & "Royalty Free Open-Source Licensing".

ARM has licensing costs, Royalties, & restrictions.
That's not what Keller said. He said he asked ARM to add a couple of custom instructions to convert between the data formats used by their Tensix DSP cores, but ARM said "no". Then, he turned to RISC-V, because it lets you do whatever the heck you want.

AMD & Intel are busy with their own AI stuff.
AMD has a custom IP business, which is how console makers gain use of their IP. Intel has also talked about licensing their cores to fab customers, in which case I'd bet you might be able to get a few custom tweaks.

And he has his own CPU Engineers as well, so he knows what it takes to make a "High Performance" CPU.
The first few gens of their chips with RISC-V used IP licensed from SiFive. It was only later that he decided to start designing their own, possibly as an alternate business model, in case the whole AI thing didn't work out for them.
 

TheHerald

Respectable
BANNED
Feb 15, 2024
I'm not that tech savvy, but can someone clearly point out the advantages of ARM? Practically, though, not in theory.

From what I've gathered, Apple has the most advanced ARM chips. Still, they seem to be severely lagging behind x86 in performance and, contrary to popular belief, in efficiency as well. The reason Apple chips are efficient is the super-low power limits coupled with absolutely insanely huge dies. The M2 Ultra has 140 billion transistors. That is basically the equivalent of a couple of Threadrippers and a couple of 4080 GPUs. Is there any workload where an M2/M3 Ultra is faster than a bunch of Threadrippers and a bunch of 4080 GPUs?

From what I've seen in most tests, a 3080 coupled with a normal desktop chip is faster than an Ultra M-series chip, so I can't fathom how it could ever compete against a bunch of TRs.
 

bit_user

Titan
Ambassador
I'm not that tech savvy but can someone clearly point out the advantages of arm? Practically though, not in theory.

From what I've gathered, Apple has the most advanced arm chips. Still, they seem to be severely lagging behind x86 in performance and contrary to popular belief, in efficiency as well.
Nope. The M3's CB24 (ST) score is 141, compared to 121 points for the top-spec Lunar Lake 288V. On efficiency, the M3 gets 12.7 points/W, compared to the 258V's 5.38 points/W (and note that we switched to a lower-performing model, here). Stick with the 288V and you get only 4.78 points/W.

On CB24 (MT), they perform about the same, but efficiency is still a massacre, with the M3 managing 28.3 points/W vs. the 258V's 19.3 points/W and the 288V's 17.9 points/W.
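Taking the Cinebench 2024 points/W figures quoted above at face value (they are this post's numbers, not independently verified), the size of the efficiency gap is simple arithmetic:

```python
# Efficiency ratios implied by the CB24 points/W figures quoted above.
st_ppw = {"M3": 12.7, "258V": 5.38, "288V": 4.78}   # single-thread points/W
mt_ppw = {"M3": 28.3, "258V": 19.3, "288V": 17.9}   # multi-thread points/W

st_gap = st_ppw["M3"] / st_ppw["288V"]   # ~2.66x ST efficiency advantage
mt_gap = mt_ppw["M3"] / mt_ppw["288V"]   # ~1.58x MT efficiency advantage
print(f"ST gap: {st_gap:.2f}x, MT gap: {mt_gap:.2f}x")
```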


The reason Apple chips are efficient is the super low power limits
Nope, keep trying to cope. If it were just down to power limits, they wouldn't also perform the same or better.

coupled with absolutely insanely huge dies. The m2 ultra has 140 billion transistors.
So, the above data is not from the Ultra. The Ultra is two of the Max dies stuck together, and not available in a laptop. The M3 mentioned above uses just one of the smallest of the 3 die sizes Apple makes.

So, you're back to the same old tactic of making ridiculous comparisons, I see. You might as well be comparing Lunar Lake to an AMD ThreadRipper, because that would be about as valid.

From what I've seen in most tests, a 3080 coupled with a normal desktop chip is faster than an Ultra M-series chip, so I can't fathom how it could ever compete against a bunch of TRs.
So, now you're no longer talking about the CPU performance but instead focusing on GPU performance? The GPUs are neither x86 nor ARM. It's totally irrelevant to the point in question. Stick to CPU benchmarks.
 

TheHerald

Respectable
BANNED
Feb 15, 2024
Nope. CB24 (ST) is 141 compared to the top spec Lunar Lake 288V of 121 points. On efficiency, the M3 gets 12.7 points/W, compared to the 258V's 5.38 points/W (and note that we switched to a lower-performing model, here). Stick with the 288V and you get only 4.78 points/W.
Those are single-thread scores, man. Who cares about ST scores in the types of workloads that scale to n cores?

I don't see what's ridiculous about the comparison. If ARM is better, then it should be faster than an x86 CPU while using the same transistor count, no? The M3 you mentioned uses 25B transistors. That's the equivalent of two and a half 14900K chips. Is it anywhere near as fast as 2.5 14900Ks?

Quite the contrary, I think your comparison is ridiculous. Basically what you're saying is: a 130B-transistor ARM chip is faster than a 10B-transistor x86 chip, therefore ARM > x86. Sorry, what?
 

bit_user

Titan
Ambassador
Those are single-thread scores, man. Who cares about ST scores in the types of workloads that scale to n cores?
I also quoted MT performance and efficiency.

The part that should really mess your pants is how Apple managed that with only 128-bit NEON SIMD. That's the equivalent of SSEn. Apple has yet to implement ARM's SVE, which is what enables wider vector sizes.

I don't see what's ridiculous about the comparison.
Ultra is not a laptop CPU, and you were talking about Lunar Lake.

If ARM is better, then it should be faster than an x86 CPU while using the same transistor count, no?
Do you have some estimates on the transistor counts of just the cores? Find those, and then we'll talk.

The M3 you mentioned uses 25B transistors. That's the equivalent of two and a half 14900K chips. Is it anywhere near as fast as 2.5 14900Ks?
Again, it's a ridiculous comparison, because the M3 is an entire SoC. It has a bigger GPU, a NPU, an ISP, plus south bridge.

Also, it's made on a smaller process node, so yeah they're going to use more transistors. That's one of the main ways you get more performance by using smaller nodes. CPU performance never scales linearly with transistor count, it works more like log(n). You should stick to comparing just the cores, and then use Lunar Lake, because it's made on the same TSMC node as the M3.
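The sublinear-scaling point can be illustrated with a toy model. Whether the right shape is closer to sqrt(n) (Pollack's rule of thumb) or log(n) (as the post says) is debatable; either way, doubling the transistor budget does not double single-core performance:

```python
import math

# Toy single-core performance estimates vs. relative transistor budget.
def rel_perf_sqrt(n):
    """Pollack's-rule-style estimate: perf ~ sqrt(transistors)."""
    return math.sqrt(n)

def rel_perf_log(n):
    """Rougher log-style estimate, as suggested in the post."""
    return 1 + math.log2(n)

for mult in (1, 2, 4, 8):
    print(f"{mult}x transistors -> sqrt model: {rel_perf_sqrt(mult):.2f}x, "
          f"log model: {rel_perf_log(mult):.2f}x")
```

Under either model, an 8x transistor budget buys well under 8x the single-core performance, which is the post's point.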
 

TheHerald

Respectable
BANNED
Feb 15, 2024
I also quoted MT performance and efficiency.


Ultra is not a laptop CPU, and you were talking about Lunar Lake.


Do you have some estimates on the transistor counts of just the cores? Find those, and then we'll talk.


Again, it's a ridiculous comparison, because the M3 is an entire SoC. It has a bigger GPU, a NPU, an ISP, plus south bridge.

Also, it's made on a smaller process node, so yeah they're going to use more transistors. That's one of the main ways you get more performance by using smaller nodes. CPU performance never scales linearly with transistor count, it works more like log(n). You should stick to comparing just the cores, and then use Lunar Lake, because it's made on the same TSMC node as the M3.
Well, that's why I added GPUs into the mix, but you objected. I don't care about the size of the cores; I'm saying that a 130B transistor count would get you a 64-core Threadripper and 2x 4080s, with transistors to spare as well. Is the M3 Ultra faster than that configuration? If not, then how is ARM better than x86?

I wasn't talking about Lunar Lake, btw. I don't know how you got that impression. Not that I mind; we can talk about it as well, no problem.
 

TheHerald

Respectable
BANNED
Feb 15, 2024
The subject is x86 (vs. ARM). Not Apple vs. whatever. That's a different thread.
Well, how else would you compare ARM vs. x86 if not with actual products out in the wild?

As I've said, I'm not tech savvy, but it looks like you get a ton (like 5x) more performance on x86 than on the best ARM chips currently available with the same transistor count. I can't fathom how ARM is superior. That's why I asked for explanations.
 
Well, that's why I added GPUs into the mix, but you objected. I don't care about the size of the cores; I'm saying that a 130B transistor count would get you a 64-core Threadripper and 2x 4080s, with transistors to spare as well. Is the M3 Ultra faster than that configuration? If not, then how is ARM better than x86?
If you limited the entire 64-core TR + 2x 4080s to the 295W maximum an M2 Ultra in the Mac Studio pulls (or 335W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. This is what Apple gets from having such a comparatively large die size. We don't really know how well ARM scales up at more similar power numbers and core designs outside of the Ampere Altra reviews AnandTech did. Those weren't custom ARM cores but rather Neoverse N1 cores, and they kept up relatively well with the competition at the time. Apple's designs all come from a mobile-first standpoint, and Qualcomm's were definitely shifted toward the same, as the performance increases for increased power consumption don't even remotely make sense (based on what little I've seen of the dev-kit testing).

LNL and the M3 are about the best comparison available for ARM vs. x86, as they have the same core counts and are similar-size SoCs. However, even that isn't perfect, because Intel has more display outputs and PCIe lanes, while Apple scales those with die size. It's undeniable that, while Intel has come a long way, the M3 is still quite a bit better performance-wise when you factor in power efficiency. In terms of absolute performance, they are very close to one another.
 

bit_user

Titan
Ambassador
We don't really know how well ARM scales up at more similar power numbers and core designs outside of the Ampere Altra reviews AnandTech did.
Sure we do. There are plenty of benchmarks of Amazon Graviton 3 & 4, Nvidia Grace, and AmpereOne. The latter is the least impressive, IMO. That's where Ampere basically tried to design something a bit more like an E-core. Now, Ampere is trying to get bought by Oracle.

Qualcomm's were definitely shifted towards the same as the performance increases for increased power consumption don't even remotely make sense (based on what little I've seen of the dev kit testing).
It's only 12 cores, though. The shape of their perf/W curve seems roughly similar to that of Raptor Lake.

However even that isn't perfect because Intel has more display outputs and PCIe lanes as Apple scales with die size.
What does that matter? Just take a die shot and measure the area of the cores, themselves. As for testing, NotebookCheck only used one external monitor and probably about the same number of PCIe lanes, in both cases.

In terms of absolute performance they are very close to one another.
No way, dude. The M3 had a single-thread advantage of 16.5%, while using less power. That's a whole generation ahead. And M3 is basically Apple's previous gen, by this point.

You both need a lot of copium.
 

Alpha_Lyrae

Reputable
Nov 13, 2021
If you limited the entire 64-core TR + 2x 4080s to the 295W maximum an M2 Ultra in the Mac Studio pulls (or 335W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. This is what Apple gets from having such a comparatively large die size. We don't really know how well ARM scales up at more similar power numbers and core designs outside of the Ampere Altra reviews AnandTech did. Those weren't custom ARM cores but rather Neoverse N1 cores, and they kept up relatively well with the competition at the time. Apple's designs all come from a mobile-first standpoint, and Qualcomm's were definitely shifted toward the same, as the performance increases for increased power consumption don't even remotely make sense (based on what little I've seen of the dev-kit testing).

LNL and the M3 are about the best comparison available for ARM vs. x86, as they have the same core counts and are similar-size SoCs. However, even that isn't perfect, because Intel has more display outputs and PCIe lanes, while Apple scales those with die size. It's undeniable that, while Intel has come a long way, the M3 is still quite a bit better performance-wise when you factor in power efficiency. In terms of absolute performance, they are very close to one another.
But why would you limit the power? They're different designs, so it's not really fair to limit power. Simply measure the power use over time to complete a task and compare that.

Without Apple's use of custom accelerator logic, it will get obliterated. With Apple's custom accelerator logic allowed, I think power use over time will still be lower for the 64c TR + 2x4080s as this set-up will complete the task in less time, especially in common HPC workloads.

So yes, you trade higher power up front in the discrete set-up, but if it completes tasks before the fully integrated Apple M2 Ultra, isn't that a win?
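The "energy to solution" comparison suggested here is just average power multiplied by time-to-completion. A sketch with placeholder numbers (not measurements of any real system):

```python
# Energy-to-solution: the faster, higher-power setup can still use less
# total energy if it finishes soon enough. All numbers are placeholders.
def energy_joules(avg_watts, seconds):
    return avg_watts * seconds

discrete = energy_joules(700, 100)     # hypothetical TR + 2x 4080s: 70 kJ
integrated = energy_joules(300, 280)   # hypothetical integrated box: 84 kJ
print("discrete wins on energy:", discrete < integrated)
```

With these invented figures the discrete box draws over twice the power but finishes in roughly a third of the time, so it comes out ahead on total energy, which is the trade the post describes.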
 

TheHerald

Respectable
BANNED
Feb 15, 2024
If you limited the entire 64-core TR + 2x 4080s to the 295W maximum an M2 Ultra in the Mac Studio pulls (or 335W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. This is what Apple gets from having such a comparatively large die size.
That might be the case, because the interconnect on the Threadripper consumes a lot of power in such a big chip. At some point it probably starts scaling negatively in efficiency as you cut power.
 

TheHerald

Respectable
BANNED
Feb 15, 2024
No way, dude. The M3 had a single-thread advantage of 16.5%, while using less power. That's a whole generation ahead. And M3 is basically Apple's previous gen, by this point.

You both need a lot of copium.
But that's still irrelevant. If the M3 core is 55 times as big, then it makes sense that it's faster and more efficient.