> So AMD isn't in the rearview mirror, but driving the car now?

I think they're co-driving... like they each have a hand on the steering wheel! What could go wrong??
:D
It's hard to beat "Free" & "Open Standard ISA" & "Royalty Free Open-Source Licensing".No. He first went to ARM, and then decided RISC-V was better for the flexibility it gave him.
Let's assume that "They Could" for the right price$$At this point, I'm sure they wouldn't even bother if they could.
> I wonder: if Intel could license x86 to many companies, would that destroy ARM?

The existence of both Intel (more so) and AMD (less so) has, until now, been assured by the duopoly. If they start licensing x86 to others, they put themselves at risk in the near future. The only thing that gains from opening up x86 licensing is the x86 ISA itself, not the companies that use it, so I doubt it will happen.
> This move is to break Nvidia's legs in the CPU segment, and it will work.

Nvidia was never seriously interested in general-purpose CPUs. For Nvidia, CPUs are useful only insofar as they squeeze optimal performance out of its GPUs.
> AVX10.2 and AMX would be huge giveaways.

I'm not convinced AMD even wants AMX. They seem to be on a different path, mixing CPU and GPU dies in the MI300A. For client workloads, they have XDNA.
> It's hard to beat "Free" & "Open Standard ISA" & "Royalty-Free Open-Source Licensing".

That's not what Keller said. He said he asked ARM to add a couple of custom instructions to convert between the data formats used by their Tensix DSP cores, but ARM said "no". Then he turned to RISC-V, because it lets you do whatever the heck you want.
ARM has licensing costs, royalties, & restrictions.
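To make the flexibility point concrete: RISC-V permanently reserves four major opcodes (custom-0 through custom-3) for vendor extensions, so a designer can add instructions without anyone's permission. Below is a minimal sketch of how such an instruction could be exposed from C, assuming a hypothetical FP8-to-FP16 conversion like the one Keller described; the encoding, funct fields, and function name are all made up for illustration, and GNU binutils' .insn directive is used so no assembler changes are needed.

```c
#include <stdint.h>

/* Hypothetical custom RISC-V instruction: convert packed FP8 lanes to
 * FP16. It is encoded in the custom-0 major opcode (0x0b), which the
 * RISC-V spec reserves for vendor use; the funct3/funct7 values and
 * the semantics are invented for this sketch. */
static inline uint64_t fp8_to_fp16(uint64_t src)
{
    uint64_t dst;
    /* .insn r <opcode>, <funct3>, <funct7>, rd, rs1, rs2 */
    __asm__ volatile (".insn r 0x0b, 0x0, 0x0, %0, %1, x0"
                      : "=r"(dst)
                      : "r"(src));
    return dst;
}
```

(This builds only with a RISC-V toolchain, of course; the point is that the opcode space and the assembler support for it are standard.)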
> AMD & Intel are busy with their own AI stuff.

AMD has a custom IP business, which is how console makers gain use of their IP. Intel has also talked about licensing their cores to fab customers, in which case I'd bet you might be able to get a few custom tweaks.
> And he has his own CPU engineers as well, so he knows what it takes to make a "High Performance" CPU.

The first few gens of their chips with RISC-V used IP licensed from SiFive. It was only later that he decided to start designing their own, possibly as an alternate business model in case the whole AI thing didn't work out for them.
> I'm not that tech savvy, but can someone clearly point out the advantages of ARM? Practically though, not in theory.

Nope. CB24 (ST) is 141, compared to 121 points for the top-spec Lunar Lake 288V. On efficiency, the M3 gets 12.7 points/W, compared to the 258V's 5.38 points/W (and note that we switched to a lower-performing model, here). Stick with the 288V and you get only 4.78 points/W.
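(A hedged aside on the arithmetic: if those points/W figures come from the same CB24 single-thread runs, they imply package power of roughly 141 / 12.7 ≈ 11 W for the M3 versus 121 / 4.78 ≈ 25 W for the 288V, i.e. about 16.5% more performance on well under half the power.)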
From what I've gathered, Apple has the most advanced ARM chips. Still, they seem to be severely lagging behind x86 in performance and, contrary to popular belief, in efficiency as well.
> The reason Apple chips are efficient is the super low power limits...

Nope, keep trying to cope. If it were just down to power limits, they wouldn't also perform the same or better.
> ...coupled with absolutely insanely huge dies. The M2 Ultra has 140 billion transistors.

So, the above data is not from the Ultra. The Ultra is two of the Max dies stuck together, and not available in a laptop. The M3 mentioned above uses just one of the smallest of the three die sizes Apple makes.
> From what I've seen from most tests, a 3080 coupled with a normal desktop chip is faster than an Ultra M CPU, so I can't fathom how it can ever compete against a bunch of TRs.

So now you're no longer talking about CPU performance, but instead focusing on GPU performance? The GPUs are neither x86 nor ARM; it's totally irrelevant to the point in question. Stick to CPU benchmarks.
> Nope. CB24 (ST) is 141, compared to 121 points for the top-spec Lunar Lake 288V. On efficiency, the M3 gets 12.7 points/W, compared to the 258V's 5.38 points/W (and note that we switched to a lower-performing model, here). Stick with the 288V and you get only 4.78 points/W.

Those are single-thread scores, man. Who cares about ST scores in the types of workloads that scale to n cores?
> Those are single-thread scores, man. Who cares about ST scores in the types of workloads that scale to n cores?

I also quoted MT performance and efficiency.
> I don't see what's ridiculous about the comparison.

Ultra is not a laptop CPU, and you were talking about Lunar Lake.

> If ARM is better, then it should be faster than an x86 CPU while using the same transistor count, no?

Do you have some estimates on the transistor counts of just the cores? Find those, and then we'll talk.

> The M3 you mentioned is using 25B transistors. That's equivalent to two and a half 14900Ks. Is it anywhere near as fast as 2.5 14900Ks?

Again, it's a ridiculous comparison, because the M3 is an entire SoC. It has a bigger GPU, an NPU, an ISP, plus a south bridge. Also, it's made on a smaller process node, so of course it uses more transistors; that's one of the main ways smaller nodes deliver more performance. CPU performance never scales linearly with transistor count; it scales more like log(n). You should stick to comparing just the cores, and then use Lunar Lake, because it's made on the same TSMC node as the M3.

Well, that's why I added GPUs into the mix, but you objected. I don't care about the size of the cores; I'm saying that 130B transistors would get you a 64-core Threadripper and two 4080s, with transistors to spare. Is the M3 Ultra faster than that configuration? If not, then how is ARM better than x86?
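To put some numbers on that scaling claim (a toy illustration, not data from the thread): apply linear, square-root (Pollack's rule, the classic rule of thumb that performance grows with the square root of complexity), and logarithmic scaling to the transistor counts being argued about above.

```c
/* Toy comparison of performance-scaling rules of thumb, using the
 * transistor counts from the thread (25B for the M3 vs. the ~130B
 * being claimed for an Ultra-class budget). The scaling models are
 * illustrative assumptions, not measurements. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double small = 25e9;    /* M3 transistor count */
    double big   = 130e9;   /* claimed Ultra-class budget */

    printf("linear: %.2fx\n", big / small);            /* 5.20x */
    printf("sqrt:   %.2fx\n", sqrt(big / small));      /* 2.28x */
    printf("log:    %.2fx\n", log(big) / log(small));  /* 1.07x */
    return 0;
}
```

Under anything sublinear, a 5.2x transistor budget buys nowhere near 5.2x the CPU performance, which is the point being made.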
> Well, that's why I added GPUs into the mix, but you objected. I don't care about the size of the cores...

The subject is x86 (vs. ARM), not Apple vs. whatever. That's a different thread.
> The subject is x86 (vs. ARM), not Apple vs. whatever. That's a different thread.

Well, how else would you compare ARM vs. x86, if not with actual products out in the wild?
> Well, that's why I added GPUs into the mix, but you objected. I don't care about the size of the cores; I'm saying that 130B transistors would get you a 64-core Threadripper and two 4080s, with transistors to spare. Is the M3 Ultra faster than that configuration? If not, then how is ARM better than x86?

If you limited the entire 64-core TR + 2x 4080s to the 295 W maximum an M2 Ultra pulls in the Mac Studio (or 335 W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. This is what Apple gets from having such a comparatively large die size. We don't really know how well ARM scales up at more comparable power levels per core design, outside of the Altra reviews AnandTech did. Those weren't custom ARM cores, but rather Neoverse N1 cores, and they kept up relatively well with the competition at the time. Apple's designs all come from a mobile-first standpoint, and Qualcomm's were definitely shifted toward the same, as their performance increases for increased power consumption don't even remotely make sense (based on what little I've seen of the dev kit testing).
> Well, how else would you compare ARM vs. x86, if not with actual products out in the wild?

I'm not arguing against that. I'm saying we should stay focused on the actual cores. The way to do that is to find stats on them, instead of trying to compare gross die sizes.
> We don't really know how well ARM scales up at more comparable power levels per core design, outside of the Altra reviews AnandTech did.

Sure we do. There are plenty of benchmarks of Amazon Graviton 3 & 4, Nvidia Grace, and AmpereOne. The latter is the least impressive, IMO; that's where Ampere basically tried to design something a bit more like an E-core. Now, Ampere is trying to get bought by Oracle.
> Qualcomm's were definitely shifted toward the same, as their performance increases for increased power consumption don't even remotely make sense (based on what little I've seen of the dev kit testing).

It's only 12 cores, though. The shape of their perf/W curve seems roughly similar to that of Raptor Lake.
> However, even that isn't perfect, because Intel has more display outputs and PCIe lanes, whereas Apple scales those with die size.

What does that matter? Just take a die shot and measure the area of the cores themselves. As for testing, NotebookCheck only used one external monitor and probably about the same number of PCIe lanes, in both cases.
> In terms of absolute performance, they are very close to one another.

No way, dude. The M3 had a single-thread advantage of 16.5%, while using less power. That's a whole generation ahead. And the M3 is basically Apple's previous gen, by this point.
> If you limited the entire 64-core TR + 2x 4080s to the 295 W maximum an M2 Ultra pulls in the Mac Studio (or 335 W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. ...

But why would you limit the power? They're different designs, so it's not really fair to limit power. Simply measure the power used over the time it takes to complete a task, and compare that.
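Here's a sketch of that "energy to completion" methodology, with entirely hypothetical power and time numbers just to show the bookkeeping; the point is that a faster, higher-power system can still win on total energy if it finishes soon enough.

```c
/* Energy-to-completion comparison: measure average package power and
 * wall-clock time for the SAME task on each system, then compare
 * energy = power * time. All numbers below are hypothetical
 * placeholders, not benchmark results. */
#include <stdio.h>

int main(void)
{
    const char *name[2]   = {"M2 Ultra", "64-core TR + 2x 4080"};
    double avg_power_w[2] = {295.0, 600.0};  /* hypothetical */
    double task_time_s[2] = {900.0, 400.0};  /* hypothetical */

    for (int i = 0; i < 2; i++) {
        double wh = avg_power_w[i] * task_time_s[i] / 3600.0;
        printf("%-22s %6.0f W x %4.0f s = %6.1f Wh\n",
               name[i], avg_power_w[i], task_time_s[i], wh);
    }
    return 0;
}
```

With these made-up numbers, the higher-power box actually uses less total energy (66.7 Wh vs. 73.8 Wh), which is exactly why capping both systems at the same wattage can mislead.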
LNL and the M3 are about the best comparison available for ARM vs. x86, as they have the same core counts and are similar-sized SoCs. However, even that isn't perfect, because Intel has more display outputs and PCIe lanes, whereas Apple scales those with die size. It's undeniable that, while Intel has come a long way, the M3 is still quite a bit better performance-wise when you factor in power efficiency. In terms of absolute performance, they are very close to one another.
> If you limited the entire 64-core TR + 2x 4080s to the 295 W maximum an M2 Ultra pulls in the Mac Studio (or 335 W in the Mac Pro), it would probably be a tossup, if the Mac didn't just flat-out win. This is what Apple gets from having such a comparatively large die size.

That might be the case because the interconnect on the Threadripper consumes a lot of power in such a big chip. Past a certain point, cutting its power limit probably makes its efficiency scale negatively.
> No way, dude. The M3 had a single-thread advantage of 16.5%, while using less power. That's a whole generation ahead. And the M3 is basically Apple's previous gen, by this point.

But that's still irrelevant. If the M3 core is 55 times as big, then it makes sense that it's faster and more efficient.
You both need a lot of copium.