News: Apple's A17 Pro Challenges Core i9-13900K, Ryzen 7950X in Single-Core Performance

How does modern code differ from code written 5 years ago, or even longer? Over that time, I don't think there have been any fundamental changes at the lowest level of the processor architectures, or in how the execution of instructions is ordered and implemented.
X86 was created in 1978 for a processor that had 29,000 transistors total. You think modern code looks anything like how it did then?
 
I remember all the noise Apple made about the M1, which also had good ST performance but also a node advantage. Once AMD and Intel moved to newer nodes, that advantage went away. And even comparing Apple now on 3nm, it's still behind AMD on 5nm and Intel on its current node.
Yeah, Apple hasn't made any real fundamental changes to their CPU architecture since the M1. Of course, they didn't need to. The fact remains, ST performance is roughly on par with Intel/AMD while also remaining massively more efficient. Remember, Apple has to put these cores in a smartphone, not just a large desktop with a wind-tunnel-like fan setup.
Apple should care less about synthetic benchmarks and more about real-world performance that is achievable through the optimizations and uniformity of their ecosystem. The hardware is way ahead, while the software is crippled, unoptimized and unable to utilize this performance effectively. Winning the "watts" war isn't going to benefit anyone but the "hardcore" people out there looking for the smallest yet most expensive drop of performance at the cost of power and price.
I don't see Apple caring about synthetic benchmarks at all. I only see them quoted on media outlets such as this. As for performance / watt, it goes without saying how critical that is for mobile devices like smartphones. Even for laptops, Apple enjoys a tremendous advantage. The difference in battery life and lack of loud fans is huge. It's less of a big deal on the desktop, but that's pretty much the minority of users these days.
Wow, a 3% multi-threaded gen-on-gen performance improvement in Apple’s favorite benchmark! Amazing!! Thanks for the ad.
Apple never uses or advertises this benchmark. You also fail to see the massive improvements to the rest of the SoC such as the Neural Engine being twice as fast and large architecture changes to the GPU, etc.
 
  • Like
Reactions: gg83
X86 was created in 1978 for a processor that had 29,000 transistors total. You think modern code looks anything like how it did then?
He cited a 5-year delta, and it's a fair question. I'm sure code has changed in that time, but I don't know in what directions or how much.

I'd expect there's more JIT code, for one thing. Perhaps APIs and SDKs are doing more parameter-checking and input-sanitizing, as cybersecurity has continued to grow in importance and fuzz-testing has become more prevalent. Perhaps more code is being compiled with settings to tune for newer CPUs, which should affect the instruction mix and register usage.

Oh, and compiled-in mitigations are going to be much more prevalent. I think it was discovered that retpolines come at no performance penalty, on Zen 4.

Intel and AMD do performance modeling and analysis - I wonder how their workloads have adapted to the times.
 
  • Like
Reactions: gg83 and George³
The relationship between BSD and MacOS X is more tenuous and superficial than that, from what I've read. I'm not an expert on the subject, but I don't think much of MacOS X still hews to its BSD roots.
I'm not sure what other articles say. However, Apple's own documentation talks about its roots in BSD.


Also if you crawl through the FreeBSD git repo you'll see commits from "Obtained From Apple Inc.". These are contributions back to FreeBSD.


FreeBSD is very much used by and linked to macOS by Apple's own admission and actions.
 
There are so many asterisks tied to this statement that it pretty much lacks any factual basis. Even the most power hungry single x86 cores don't use remotely close to 100W or even half that.
An 8.5 watt part in single core likely doesn't come close to 8.5 watts. What's the point here? Even if one were to argue a single core A17 used 8.5 watts it's still less than the 13900k's 32 watts for single core.

 
Apple's own documentation talks about its roots in BSD.

I've seen that page. Did you actually read it?
"The BSD portion of the OS X kernel is derived primarily from FreeBSD"​

Doesn't say what % of it that "BSD portion" comprises. I'd assume their FreeBSD portions have just continued to shrink and diverge from upstream. They already talk about some of the changes they've made, in the section titled Differences between OS X and BSD:
"Although the BSD portion of OS X is primarily derived from FreeBSD, some changes have been made:"​

Also, did you get that the kernel itself isn't BSD? It's Mach.

Under the section For Further Reading, they state:
"Although the BSD layer of OS X is derived from 4.4BSD, keep in mind that it is not identical to 4.4BSD. Some functionality of 4.4 BSD has not been included in OS X. Some new functionality has been added."​

Lastly, look at the footer:
"Copyright © 2002, 2013 Apple Inc. All Rights Reserved. Terms of Use | Privacy Policy | Updated: 2013-08-08"​

That page is 10 years old.

Also if you crawl through the FreeBSD git repo you'll see commits from "Obtained From Apple Inc.". These are contributions back to FreeBSD.


FreeBSD is very much used by and linked to macOS.
Dude, that's libc! To the extent MacOS still supports BSD userspace, they're going to share common bits of libc with other BSDs. That's a far cry from the actual guts of the OS!
 
  • Like
Reactions: Order 66
I've seen that page. Did you actually read it?
"The BSD portion of the OS X kernel is derived primarily from FreeBSD"​

Doesn't say what % of it that "BSD portion" comprises. I'd assume their FreeBSD portions have just continued to shrink and diverge from upstream. They already talk about some of the changes they've made, in the section titled Differences between OS X and BSD:
"Although the BSD portion of OS X is primarily derived from FreeBSD, some changes have been made:"​

Also, did you get that the kernel itself isn't BSD? It's Mach.

Under the section For Further Reading, they state:
"Although the BSD layer of OS X is derived from 4.4BSD, keep in mind that it is not identical to 4.4BSD. Some functionality of 4.4 BSD has not been included in OS X. Some new functionality has been added."​

Lastly, look at the footer:
"Copyright © 2002, 2013 Apple Inc. All Rights Reserved. Terms of Use | Privacy Policy | Updated: 2013-08-08"​

That page is 10 years old.


Dude, that's libc! To the extent MacOS still supports BSD userspace, they're going to share common bits of libc with other BSDs. That's a far cry from the actual guts of the OS!
That's pretty nit-picky. I picked a random commit, there are plenty more to pick from.

We don't say Linux forks from older kernels are not Linux do we? The point here is macOS is based on FreeBSD, it is Unix based. We can argue semantics about how much is what, but it doesn't change my point or the fact that it is Unix based and is as complex as any Unix based system.
 
That's pretty nit-picky. I picked a random commit, there are plenty more to pick from.
If you picked a bad example, that's on you. The burden is on you to demonstrate they meaningfully upstream to non-userspace portions of BSD. You should've looked at what you were citing and you should understand what you're arguing well enough to know what evidence supports your point vs. what does not.

Personally, I doubt it. Not only because they probably long-ago diverged past the point where it makes any sense for Apple to try and track upstream, but also because Apple has a poor track record of upstreaming stuff. The whole reason they promoted the LLVM/Clang project is because they got sick of dealing with upstreaming their GCC patches.

We don't say Linux forks from older kernels are not Linux do we?
Huh? That sentence doesn't parse.

The point here is macOS is based on FreeBSD,
No, they said the BSD layer derives from FreeBSD. That's a very different statement.

Windows has long had a POSIX layer, but even if they ripped that from someone like SCO, that wouldn't make Windows SCO UNIX-based.

it is Unix based.
It's NeXT-based. Sure, NeXT ripped some stuff from BSD. Makes sense - you need to make it easy for people to port their software to your OS.

Did you read everything they said is different? Including the whole driver stack, which they wrote in C++? I can promise you won't find a shred of C++ in FreeBSD. And, again, the actual kernel is not BSD!

We can argue semantics about how much is what,
I'm not arguing semantics. I'm arguing that maybe 10% of it is still BSD-based, but people like you cling to that shred and try to claim it proves something it doesn't.

but it doesn't change my point or the fact that it is Unix based and is as complex as any Unix based system.
It has a userspace based on BSD. It borrowed some other plumbing, though we don't know how much of that is still in place.

Microsoft's WSL 1.0 enabled the Linux API/ABI on Windows 10. That did not make Windows 10 Linux-based. You need to understand what a layer is, in a layered software model.
 
I'm not arguing semantics. I'm arguing that maybe 10% of it is still BSD-based, but people like you cling to that shred and try to claim it proves something it doesn't.
What does that have to do with the point that it is Unix based? Please tell me, because you responded to my comment that it was Unix based and as complex as a Unix system.

Are you now claiming it's not Unix based?
 
What does that have to do with the point that it is Unix based?
People overstate how closely-related it is to BSD. That's all. It bugs me, because it's misleading. I was once misled by such loose talk, until someone set me straight.

I think it has a lot less to do with the overall topic of application performance than various things Apple has been doing in their OS, over the past decade or so. Linux could go down the same road, if it wanted to, but so far it hasn't. I can't really comment on Windows, as I don't follow developments in the Windows kernel.
 
People overstate how closely-related it is to BSD. That's all. It bugs me, because it's misleading. I was once misled by such loose talk, until someone set me straight.

I think it has a lot less to do with the overall topic of application performance than various things Apple has been doing in their OS, over the past decade or so. Linux could go down the same road, if it wanted to, but so far it hasn't. I can't really comment on Windows, as I don't follow developments in the Windows kernel.
I agree with that wholeheartedly. I wasn't really trying to say macOS is just FreeBSD with a fancy UI (though maybe it came off that way).

I only took issue with the original comment because it was claiming less OS complexity as the reason for higher scores and that just didn't make sense to me given the product's roots.
 
I only took issue with the original comment because it was claiming less OS complexity as the reason for higher scores and that just didn't make sense to me given the product's roots.
From my perspective, Linux, Windows, MacOS are all similar enough, complexity-wise. If anything, Mach might even put MacOS at a slight disadvantage.

What matters more to application performance is the userspace threading facilities, and that's where Apple has made some substantial innovations I wish Linux would try to match.
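To give a flavour of what I mean, here's a minimal Grand Central Dispatch (libdispatch) sketch, one of those facilities; the queue label and workload are made up for illustration, and the same code should also build on Linux via swift-corelibs-libdispatch:

Code:
import Dispatch

// Minimal GCD sketch: submit independent work items to a concurrent queue and
// let the runtime/OS decide how many worker threads actually run them.
let queue = DispatchQueue(label: "example.work", attributes: .concurrent)  // label is arbitrary
let group = DispatchGroup()

for i in 0..<8 {
    queue.async(group: group) {
        // Placeholder workload; each closure lands on a system-managed thread pool.
        let partial = (1...100_000).reduce(0, +)
        print("task \(i) finished with \(partial)")
    }
}

group.wait()  // block until every submitted work item has completed

The point isn't the API surface so much as the model: the app expresses units of work, and the system decides how wide to run them.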
 
  • Like
Reactions: JamesJones44
He cited a 5-year delta, and it's a fair question. I'm sure code has changed in that time, but I don't know in what directions or how much.

I'd expect there's more JIT code, for one thing. Perhaps APIs and SDKs are doing more parameter-checking and input-sanitizing, as cybersecurity has continued to grow in importance and fuzz-testing has become more prevalent. Perhaps more code is being compiled with settings to tune for newer CPUs, which should affect the instruction mix and register usage.

Oh, and compiled-in mitigations are going to be much more prevalent. I think it was discovered that retpolines come at no performance penalty, on Zen 4.

Intel and AMD do performance modeling and analysis - I wonder how their workloads have adapted to the times.
A fair question indeed, but one that does not adequately represent the scope of what x86S is solving, which is the statement of mine they were commenting on. X86 was designed for the 8086 processor, and layers upon layers have been built onto that original X86 instruction set. It makes complete sense for Intel to perform a “spring cleaning” to remove the legacy bloat and bring it in line with their competitors’ modern, lean instruction sets.
 
It makes complete sense for Intel to perform a "spring cleaning" to remove the legacy bloat and bring it in line with their competitors' modern, lean instruction sets.
Yeah, but that's really not coming anywhere close to a full cleanup. It's basically just trimming the hedges.

APX is a bigger deal, but only gets them like 3/5ths of the way, as I previously claimed. And no, I can't quantify the performance impact of the remaining cruft.
 
You can't compare Geekbench scores on Apple to Qualcomm or x86/64; it's known that the performance figures don't translate between them appropriately. However, you can compare them within Apple products, and to quote 9To5Mac:

You can compare some of the Geekbench results below:

  • A15 Bionic: 2183 single-core | 5144 multi-core
  • A16 Bionic: 2519 single-core | 6367 multi-core
  • A17 Pro: 2914 single-core | 7199 multi-core
  • M1: 2223 single-core | 7960 multi-core

Quite a bit faster than the M1 in single-core and only about 10% slower in multi-core. If the iPhone 15 Pro (the non-Pro is stuck on the USB 2.0 standard) works with the same docks the M-based iPads do...
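As a quick sanity check on those ratios, here's a throwaway sketch using only the Geekbench numbers quoted above (nothing else assumed):

Code:
// Geekbench figures quoted above; just computing the relative gaps.
let a17 = (single: 2914.0, multi: 7199.0)
let m1  = (single: 2223.0, multi: 7960.0)

let singleGain = (a17.single / m1.single - 1) * 100   // ≈ +31%
let multiGap   = (1 - a17.multi / m1.multi) * 100     // ≈ -10%

print("A17 Pro vs. M1: single-core +\(Int(singleGain.rounded()))%, multi-core -\(Int(multiGap.rounded()))%")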


 
An 8.5 watt part in single core likely doesn't come close to 8.5 watts. What's the point here? Even if one were to argue a single core A17 used 8.5 watts it's still less than the 13900k's 32 watts for single core.


I'm not arguing that the A17 isn't going to be more efficient; I'm saying the comparison isn't remotely 8.5W vs. over 100W. The A17 is on a significantly better node, and the 13900K isn't remotely tuned for efficiency, because Intel is trying to win benchmark wars with it, not maximize battery life. There is an enormous power penalty paid for those last few hundred MHz of clock speed, which result in a single-digit performance improvement. If you tuned a 13900K to score the same as an A17 and then accounted for the node difference, the power usage would end up being a whole lot closer than most people would predict.
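For a rough sense of why that is, here's a back-of-the-envelope sketch using only the textbook dynamic-power relation; the exponents are illustrative, not measured 13900K figures:

\[
P_{\text{dyn}} \approx C\,V^{2}f, \qquad V \propto f \ \text{(roughly, near the top of the V/f curve)} \;\Rightarrow\; P_{\text{dyn}} \propto f^{3}, \qquad \frac{P(0.9\,f_{\max})}{P(f_{\max})} \approx 0.9^{3} \approx 0.73
\]

In other words, giving back the last ~10% of clock can plausibly shed on the order of a quarter of the core's dynamic power, which is why a capped and undervolted chip looks so much better in perf/W than the same silicon chasing peak benchmark scores.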
 
  • Like
Reactions: P.Amini
Bull. At peak clocks, even the worst current CPUs are under 50W. There are plenty of sites on the web that have tested this, and you can verify it yourself.
Not according to this.
[Chart: Core i9-12900K package power ramp in POV-Ray, by number of loaded cores]
Look at that "1+0" case: CPU Package power: 78 W.

Here's another source: https://www-computerbase-de.transla...x_tr_hl=en&_x_tr_pto=wapp#chart-groups-127081

In the CB R23 Single Core section of the chart CPU Package Power (according to HWiNFO), they claim:
  • Ryzen 9 7950X DDR5-5200 CL32: 53.5 W
  • i9-12900KS DDR5-4800 CL38: 56.2 W
 
I'm not arguing that the A17 isn't going to be more efficient; I'm saying the comparison isn't remotely 8.5W vs. over 100W. The A17 is on a significantly better node, and the 13900K isn't remotely tuned for efficiency, because Intel is trying to win benchmark wars with it, not maximize battery life.
I agree on this point. Intel is basically trying to balance peak single-threaded performance against area. In contrast, Apple's cores are designed to prioritize power efficiency at the cost of greater area and lower clocks, even to the point of hurting absolute single-thread performance.

There is an enormous power penalty paid for those last few hundred MHz of clock speed, which result in a single-digit performance improvement.
Also agreed.

If you tuned a 13900K to score the same as an A17 and then accounted for the node difference, the power usage would end up being a whole lot closer than most people would predict.
Heh, those are a lot of weasel words. First, exactly what do you mean by "tuned a 13900K"? Do you mean if the microarchitecture were designed differently? Or, do you merely mean if its clock speed was capped so it stayed out of the horribly inefficient range?

Second, we already have a natural experiment in accounting for the node difference, because Apple's M1 and M2 are made on TSMC N5! So, when Meteor Lake launches on Intel 4, then it (according to Intel) should have node-parity. Not only that, but Meteor Lake looks to be capped at about 5 GHz. So, that should be an excellent test of your claims.

Even today, we can compare M2 Macs against AMD's Phoenix, which is made on TSMC N4, giving AMD the node advantage. It's almost kind of funny to say, because everyone loves to use the excuse that "Apple is on a better node", but we can conclusively say that's not true of the M2 Macs.
 
Not according to this.
[Chart: Core i9-12900K package power ramp in POV-Ray, by number of loaded cores]
Look at that "1+0" case: CPU Package power: 78 W.

Here's another source: https://www-computerbase-de.transla...x_tr_hl=en&_x_tr_pto=wapp#chart-groups-127081

In the CB R23 Single Core section of the chart CPU Package Power (according to HWiNFO), they claim:
  • Ryzen 9 7950X DDR5-5200 CL32: 53.5 W
  • i9-12900KS DDR5-4800 CL38: 56.2 W
They're measuring package power, not CPU core power. I'm talking about CPU core power, which will be lower. Even so, their measurement still looks peculiarly high. When Anandtech measured power usage of an M1 Max, POV-Ray had by far the lowest power usage of the single-threaded tests.
[Chart: Anandtech M1 Max per-test power measurements]

That's half to a third of the other tests, while on a 12900k it's freakishly above everything else? I stand by my original statement.
 
Heh, those are a lot of weasel words. First, exactly what do you mean by "tuned a 13900K"? Do you mean if the microarchitecture were designed differently? Or, do you merely mean if its clock speed was capped so it stayed out of the horribly inefficient range?
Just clock reductions to match performance then undervolting accordingly. Stop making things overly complicated.

Second, we already have a natural experiment in accounting for the node difference, because Apple's M1 and M2 are made on TSMC N5! So, when Meteor Lake launches on Intel 4, then it (according to Intel) should have node-parity. Not only that, but Meteor Lake looks to be capped at about 5 GHz. So, that should be an excellent test of your claims.

Intel renamed its nodes to bring them in line with the competition. A17 is on a 3nm node, so Intel 4 would still be behind. There is an Intel 3, but I don't know if Intel plans to release any mainstream CPUs on it. I believe they plan to skip to 20A.

For 3nm, at least, Apple has a better performing node, N3B, than what everyone else is going to be using, N3E. Do we know if the same situation exists for other current/older nodes?
 
They're measuring package power, not CPU core power. I'm talking about CPU core power, which will be lower.
So, you're going to try to weasel out of a few Watts? Okay. 50 W was your number. Let's say we go with core power so you can squeak under the line. It doesn't change much.

BTW, James was clearly talking about the CPU:

You're comparing an 8.5 watt TDP part to a 100+ watt TDP part. In no world is that an apples to apples comparison and not even worth mentioning.

So, it's more than a little disingenuous for you to shift the discussion from package power @ single-threaded load to single-core power, but if that's how you want to play...

There are so many asterisks tied to this statement that it pretty much lacks any factual basis. Even the most power hungry single x86 cores don't use remotely close to 100W or even half that.

...and, if we're to language lawyer this, it sure sounds like you were saying they don't use "remotely close" to 50 W, but I expect you'll disagree.

Even so, their measurement still looks peculiarly high.
It does, but that's why I also cited ComputerBase, which was a little more in line with what I was expecting and still easily beats the 50 W threshold on package power.

When Anandtech measured power usage of an M1 Max, POV-Ray had by far the lowest power usage of the single-threaded tests.
[Chart: Anandtech M1 Max per-test power measurements]

That's half to a third of the other tests, while on a 12900k it's freakishly above everything else?
For one thing, the M1 doesn't have SVE - it's using just 128-bit NEON SIMD, which is the equivalent of SSE. If you look at the single-threaded scores of the native apps (i.e. not CineBench), POV-Ray is also the least competitive score posted by the M1, in that article.

                    M1 Max (MacBook Pro 16")                    Intel i9-11980HK (MSI GE76 Raider)
                    Score    Pkg Power (W)   Wall - Idle (W)    Score    Pkg Power (W)   Wall - Idle (W)
Idle                  -        0.2            7.2 (total)         -        1.08           13.5 (total)
502.gcc_r ST        11.9      11.0            9.5                10.7      25.5           24.5
511.povray_r ST     10.3       5.5            8.0                10.7      17.6           28.5
503.bwaves_r ST     57.3      14.5           16.8                44.2      19.5           31.5
Source: https://www.anandtech.com/show/17024/apple-m1-max-performance-review/3
The M1 trounced the 8-core Tiger Lake i9 in the other two single-threaded benchmarks. So, whether it was due to NEON SIMD or something else, that's at least consistent with it using unusually low power there. In other words, it suggests Anandtech's Alder Lake measurement isn't the only outlier; their measurement of the M1 Max is an even bigger one.

I stand by my original statement.
Which statement? That no cores use remotely close to 50 W? Because you didn't address the ComputerBase data, at all. And, let's just call your attempt to use data from a completely different ISA a feeble attempt, at best.
 
Just clock reductions to match performance then undervolting accordingly. Stop making things overly complicated.
I'm not. I'm just trying to get you to make a clear statement that we can attempt to analyze.

Intel renamed its nodes to bring them in line with the competition. A17 is on a 3nm node, so Intel 4 would still be behind.
But we don't have any laptops which use the A17, which is why I proposed comparing the M2 (which uses N5) against Meteor Lake. That's an even better comparison, since they will be competing with each other in the market.

For 3nm, at least, Apple has a better performing node, N3B, than what everyone else is going to be using, N3E.
No, it's not. According to this, N3B will deliver only 15% more performance than N5 (ISO-power), while N3E will perform 18% better than N5, along the same metric:

Do we know if the same situation exists for other current/older nodes?
I don't know specifically what you're asking.
 
So, you're going to try to weasel out of a few Watts? Okay. 50 W was your number. Let's say we go with core power so you can squeak under the line. It doesn't change much.

BTW, James was clearly talking about the CPU:

So, it's more than a little disingenuous for you to shift the discussion from package power @ single-threaded load to single-core power, but if that's how you want to play...

...and, if we're to language lawyer this, it sure sounds like you were saying they don't use "remotely close" to 50 W, but I expect you'll disagree.


It does, but that's why I also cited ComputerBase, which was a little more in line with what I was expecting and still easily beats the 50 W threshold on package power.


For one thing, the M1 doesn't have SVE - it's using just 128-bit NEON SIMD, which is the equivalent of SSE. If you look at the single-threaded scores of the native apps (i.e. not CineBench), POV-Ray is also the least competitive score posted by the M1, in that article.
                    M1 Max (MacBook Pro 16")                    Intel i9-11980HK (MSI GE76 Raider)
                    Score    Pkg Power (W)   Wall - Idle (W)    Score    Pkg Power (W)   Wall - Idle (W)
Idle                  -        0.2            7.2 (total)         -        1.08           13.5 (total)
502.gcc_r ST        11.9      11.0            9.5                10.7      25.5           24.5
511.povray_r ST     10.3       5.5            8.0                10.7      17.6           28.5
503.bwaves_r ST     57.3      14.5           16.8                44.2      19.5           31.5


The M1 trounced the 8-core Tiger Lake i9 in the other two single-threaded benchmarks. So, whether it was due to NEON SIMD or something else, that's at least consistent with it using unusually low power there. In other words, it suggests Anandtech's Alder Lake measurement isn't the only outlier; their measurement of the M1 Max is an even bigger one.


Which statement? That no cores use remotely close to 50 W? Because you didn't address the ComputerBase data, at all. And, let's just call your attempt to use data from a completely different ISA a feeble attempt, at best.
I'm not even sure what you are arguing here. You already agreed with my original point, so I'm going to move on.
 
No, it's not. According to this, N3B will deliver only 15% more performance than N5 (ISO-power), while N3E will perform 18% better than N5, along the same metric:
So the cheaper N3E performs better across the board with a slight decrease in density. This article got it wrong then.


The initial version of the A17 Bionic chip will reportedly be manufactured using TSMC's N3B process, but Apple is planning to switch the A17 over to N3E sometime next year. The move is said to be a cost-cutting measure that could come at the expense of reduced efficiency.
 