News: Apple's M2 Beats AMD's Ryzen 7 6800U in Shadow of the Tomb Raider

Shadow of the Tomb Raider is running under Rosetta 2; it's definitely not optimized for ARM/Apple Silicon. It is optimized for macOS, however, since it uses Metal, but it still takes a fair hit in performance because it's compiled for x86/x64.
 
The M2 doesn't sip far less power. As HU shows, AMD's Rembrandt is very comparable most of the time, with similar power efficiency: sometimes worse, but sometimes also better. That's quite impressive for Ryzen 6000, considering it uses a worse manufacturing process and an almost two-year-old core architecture. 1-3 more fps also isn't something I would call beating; it's practically a tie, again at similar power consumption. That makes AMD's Rembrandt even more impressive, considering it uses ~35% fewer transistors to achieve that performance. And unlike the M2, you can run almost any game on Ryzen 6000.
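
For what it's worth, that "~35% fewer transistors" figure roughly checks out against the publicly quoted die counts (about 20 billion for the M2 and about 13.1 billion for Rembrandt; treat both numbers as approximations). A quick sketch:

```python
# Back-of-the-envelope check of the "~35% fewer transistors" claim.
# Assumed approximate public figures: M2 ~20B, Ryzen 6000 "Rembrandt" ~13.1B.
m2_transistors = 20.0e9
rembrandt_transistors = 13.1e9

deficit = 1 - rembrandt_transistors / m2_transistors
print(f"Rembrandt uses {deficit:.1%} fewer transistors than the M2")
# -> Rembrandt uses 34.5% fewer transistors than the M2
```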
 
The M2 doesn't sip far less power. As HU shows, AMD's Rembrandt is very comparable most of the time, with similar power efficiency: sometimes worse, but sometimes also better. That's quite impressive for Ryzen 6000, considering it uses a worse manufacturing process and an almost two-year-old core architecture. 1-3 more fps also isn't something I would call beating; it's practically a tie, again at similar power consumption. That makes AMD's Rembrandt even more impressive, considering it uses ~35% fewer transistors to achieve that performance. And unlike the M2, you can run almost any game on Ryzen 6000.

The issue is you're comparing an emulated game vs a natively executed game. It's not really only 1 to 3 frames better; compile the game for ARM, run it under emulation on the 6800U, and see what you get.

P.S. Yes, yes, I know Rosetta 2 is translation, but you still get a significant drop in performance vs non-translated applications nonetheless.
 
Shadow of the Tomb Raider is running under Rosetta 2; it's definitely not optimized for ARM/Apple Silicon. It is optimized for macOS, however, since it uses Metal, but it still takes a fair hit in performance because it's compiled for x86/x64.

Yes, we all know that ARM is running translated x86 code, but the reason is that ARM was not previously treated as a PC gaming platform. Apple hardware is also very expensive. Until ARM can take over a third of the market share, this will keep happening again and again.
 
Yes, we all know that ARM is running translated x86 code, but the reason is that ARM was not previously treated as a PC gaming platform. Apple hardware is also very expensive. Until ARM can take over a third of the market share, this will keep happening again and again.
Yep, it's a fair comparison. And considering how few developers were willing to port games to Macs even when they were on x86, adding an additional layer of porting complexity by moving to ARM means the number of big-game ports to the platform is only likely to drop further. So I wouldn't expect many demanding games to run natively on the M2 chip anytime soon, meaning the emulated performance will likely be as good as it gets for at least the near future. Maybe ports of mobile games could fare better, but those generally won't be pushing the hardware's limits as much.
 
The issue is you're comparing an emulated game vs a natively executed game.
That's irrelevant if the test was GPU bound. GPUs don't care about x86 or ARM; they have their own ISA, which has to be translated by drivers anyway, no matter whether it's AMD or Apple silicon.

And no, even the CPU machine code is not really "emulated". Rosetta 2 supports JIT translation and AOT compilation, so the code only has to be converted once, at runtime or before; after that it runs natively as well. You won't see much difference compared to a natively compiled executable. If the whole game were simply emulated, there's no way the M2 could achieve such performance. Look at projects like DOSBox: those are real emulators, since they interpret and translate every instruction one by one at runtime. For old DOS applications that doesn't matter at all, because modern processors are fast enough, but with modern, demanding apps you would face a heavy performance penalty.
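
To illustrate the distinction being drawn here, a toy sketch (this is not how Rosetta 2 is implemented; the pretend guest "ISA", the translation cache, and the generated function are all made up) of why translating a block once and reusing it is so much cheaper than interpreting every instruction on every pass:

```python
# Toy model of interpretation vs. translate-once execution.
# NOT how Rosetta 2 works internally; it only illustrates why per-instruction
# interpretation (DOSBox-style) is far more expensive than translating a block
# once and then running the translated code repeatedly.

GUEST_PROGRAM = ["add", "add", "mul", "add"]  # pretend guest instructions

def interpret(program, x, iterations):
    """Decode and dispatch every guest instruction on every pass (emulation)."""
    for _ in range(iterations):
        for op in program:              # decode/dispatch cost paid every time
            if op == "add":
                x += 1
            elif op == "mul":
                x *= 2
    return x

_translation_cache = {}

def translate_once(program):
    """'Compile' the guest block into a host-native function once (AOT/JIT)."""
    key = tuple(program)
    if key not in _translation_cache:
        body = "".join("    x += 1\n" if op == "add" else "    x *= 2\n" for op in program)
        namespace = {}
        exec(f"def block(x):\n{body}    return x\n", namespace)
        _translation_cache[key] = namespace["block"]
    return _translation_cache[key]

def run_translated(program, x, iterations):
    block = translate_once(program)     # translation cost paid once
    for _ in range(iterations):
        x = block(x)                    # afterwards the block runs "natively"
    return x

# Both paths produce the same result; only the per-instruction overhead differs.
assert interpret(GUEST_PROGRAM, 1, 1000) == run_translated(GUEST_PROGRAM, 1, 1000)
```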
 
The 6800U isn't marketed as a "gaming" CPU; it's a U-series sub-25 W design for everyday use (even though it has the RDNA 2 iGPU).

Not that I in any way question the performance, but the notion that Apple's "non-gaming CPU" beats "AMD's gaming CPU" seems odd.

 
Shadow of the Tomb Raider is running under Rosetta 2; it's definitely not optimized for ARM/Apple Silicon. It is optimized for macOS, however, since it uses Metal, but it still takes a fair hit in performance because it's compiled for x86/x64.

I played through both Tomb Raider and Shadow of the Tomb Raider at the beginning of the year, on a Ryzen 5 3600 with a GTX 1070. The Xbox Game Bar lets you monitor resource usage while you're playing, and neither of those two games ever got much above 20% CPU usage, with typical usage somewhere between 10 and 20% during gameplay. They're definitely not CPU-intensive games, and I wouldn't expect a native version to perform any better.
 
I don't know what the belly-aching is for. This is a pretty fair comparison of low power M2 iGPU / APU vs what is probably the best low power iGPU / APU in the x86 world.

Apple pretty much rules this very low power space in performance. They have the best LOW POWER laptop CPU and iGPU.

The issue is that if you need more performance in the Apple ecosystem, you really can't get much more. There's no equivalent right now to compete with, say, the 5900X or 12700K, much less the 5950X or 12900K.

There also isn't any good competitor to the 12900H or 6900H.

Take a look at what happens when you put the M2 in perspective against other recent laptop chips - not restricted by 25W max power.

I mean, a 1280P, which is a 28 W base-power chip allowed to go to 64 W on turbo, is 13% faster. The 45 W 12800H is 31% faster. The top-of-the-line 12900HX is 84% faster.

Apple does not have anything that competes with these, so if you need anything faster than the M2 you are SoL in the Apple space.

[attached chart: M2 vs. recent laptop CPUs]
 
That's irrelevant if the test was GPU bound. GPUs don't care about x86 or ARM; they have their own ISA, which has to be translated by drivers anyway, no matter whether it's AMD or Apple silicon.

And no, even the CPU machine code is not really "emulated". Rosetta 2 supports JIT translation and AOT compilation, so the code only has to be converted once, at runtime or before; after that it runs natively as well. You won't see much difference compared to a natively compiled executable. If the whole game were simply emulated, there's no way the M2 could achieve such performance. Look at projects like DOSBox: those are real emulators, since they interpret and translate every instruction one by one at runtime. For old DOS applications that doesn't matter at all, because modern processors are fast enough, but with modern, demanding apps you would face a heavy performance penalty.

This is not correct. Just look at games that ship as universal binaries for macOS (x86-64 + ARM in the same binary); they typically see about a 10-15% FPS boost on average compared to running the x86 version under Rosetta. I agree games are more GPU bound than CPU bound, but the CPU still plays a role; just look at the 5800X3D for evidence if you're skeptical.
 
I don't know what the belly-aching is for. This is a pretty fair comparison of low power M2 iGPU / APU vs what is probably the best low power iGPU / APU in the x86 world.

Apple pretty much rules this very low power space in performance. They have the best LOW POWER laptop CPU and iGPU.

The issue is that if you need more performance in the Apple ecosystem, you really can't get much more. There's no equivalent right now to compete with, say, the 5900X or 12700K, much less the 5950X or 12900K.

There also isn't any good competitor to the 12900H or 6900H.

Take a look at what happens when you put the M2 in perspective against other recent laptop chips - not restricted by 25W max power.

I mean, a 1280P, which is a 28 W base-power chip allowed to go to 64 W on turbo, is 13% faster. The 45 W 12800H is 31% faster. The top-of-the-line 12900HX is 84% faster.

Apple does not have anything that competes with these, so if you need anything faster than the M2 you are SoL in the Apple space.

[attached chart: M2 vs. recent laptop CPUs]

Apple does have parts that compete with those...

The M1 Pro and Max compare directly with a 12800H and 12900HK. Comparing the M2 to a 12800H or 12900HX is cherry-picking the lowest-end CPU from one vendor and comparing it to the top end of another vendor's. The M1 Pro and Max both fit within the 45 and 65 W envelopes the 12800H and 12900HK can be restricted to, and have similar performance. In the same benchmarks the M1 Pro scores 9942 vs the 12800H's 8962 (10% better performance), and the M1 Max scores 12697 vs the 12900HK's 13648 (7.5% lower performance). If you look at the average score of the M1 Max vs the average score of the 12900HK, it's 12563 (M1 Max) vs 11868 (12900HK), but for the sake of argument we'll go with the advertised score for the 12900HK. Both of the Intel CPUs in these benchmarks are running with RTX 3080 GPUs attached.

https://www.notebookcheck.net/i9-12900HK-vs-M1-Max-vs-M1-Pro-8-Core_14041_13843_13847.247596.0.html
 
The biggest problem with these Mac/Windows comparisons is that Mac ports basically suck 99% of the time, and ultra settings on a Mac are nowhere near the same as ultra settings on a PC, with much less eye candy and graphics fidelity. It would be like running a 3060 on low settings at HD against a 3090 on ultra settings at 4K and claiming victory for the 3060.
 
Apple does have parts that compete with those...

The M1 Pro and Max compare directly with a 12800H and 12900HK. Comparing the M2 to a 12800H or 12900HX is cherry-picking the lowest-end CPU from one vendor and comparing it to the top end of another vendor's. The M1 Pro and Max both fit within the 45 and 65 W envelopes the 12800H and 12900HK can be restricted to, and have similar performance. In the same benchmarks the M1 Pro scores 9942 vs the 12800H's 8962 (10% better performance), and the M1 Max scores 12697 vs the 12900HK's 13648 (7.5% lower performance). If you look at the average score of the M1 Max vs the average score of the 12900HK, it's 12563 (M1 Max) vs 11868 (12900HK), but for the sake of argument we'll go with the advertised score for the 12900HK. Both of the Intel CPUs in these benchmarks are running with RTX 3080 GPUs attached.

https://www.notebookcheck.net/i9-12900HK-vs-M1-Max-vs-M1-Pro-8-Core_14041_13843_13847.247596.0.html

Try comparing the M1 Max / Pro to x86 instead of to the M1.

You get a lot more performance consistency with a MacBook for obvious reasons, but if you look at this chart closely, a 12700H in a good laptop chassis can solidly beat an M1 Pro by a wide margin.

The M1 Pro / Max are also Apple's premier desktop CPUs - and you'll pay a premium for that.

The M1 Pro is completely uncompetitive against both Intel and AMD midrange desktop parts (12600K / 5800X and higher).

[attached chart: M1 Pro/Max vs. x86 laptop CPUs]


The 12700K average here is literally more than 59% faster than the M1 Max, while the 5800X is 25% faster. These are fairly common, upper-midrange parts now. The M1 Max and Pro are Apple's top of the line.

And that is exactly what I am saying: the best Apple's current lineup offers is easily thrashed by midrange x86 parts.

[attached chart: M1 Max/Pro vs. midrange desktop CPUs]
 
Try comparing the M1 Max / Pro to x86 instead of to the M1.

You get a lot more performance consistency with a MacBook for obvious reasons, but if you look at this chart closely, a 12700H in a good laptop chassis can solidly beat an M1 Pro by a wide margin.

The M1 Pro / Max are also Apple's premier desktop CPUs - and you'll pay a premium for that.

The M1 Pro is completely uncompetitive against both Intel and AMD midrange desktop parts (12600K / 5800X and higher).

[attached chart: M1 Pro/Max vs. x86 laptop CPUs]


The 12700K average here is literally more than 59% faster than the M1 Max, while the 5800X is 25% faster. These are fairly common, upper-midrange parts now. The M1 Max and Pro are Apple's top of the line.

And that is exactly what I am saying: the best Apple's current lineup offers is easily thrashed by midrange x86 parts.

[attached chart: M1 Max/Pro vs. midrange desktop CPUs]

Now you're comparing a desktop part rated at 125 W TDP (over 200 W at turbo) to a 60 W laptop part. That's not an apples-to-apples comparison. You can't take something that can pull 200+ watts, with a 190 W turbo TDP, claim victory against something that uses 60 W on a bad day, and call it fair. That's not even mentioning that the 12700K is regularly overclocked, which skews the results, power consumption, and TDP even further. It's just not a proper comparison. Intel's best laptop CPU is equal to Apple's best laptop CPU, which was my point when you said Apple doesn't have anything that competes with a 12800H or a 12900HK; that is clearly incorrect.
 
This is not correct. Just look at games that ship as universal binaries for macOS (x86-64 + ARM in the same binary); they typically see about a 10-15% FPS boost on average compared to running the x86 version under Rosetta.
Universal binaries are another topic. It's the same as with binaries that support different instruction set extensions (like SSE or AVX for x86): at some point you need code branches, which MIGHT lead to a performance loss, but well optimized it shouldn't be noticeable. Definitely not 10-15%.

With good JIT/AOT implementations you take the translation hit only ONCE at runtime, or even before, when the app gets loaded, where it doesn't matter at all.

But again, that's irrelevant. If the test was GPU bound and CPU utilization is very low, as Jiminez suggests, then even 10-15% makes no real difference. At 10-20% utilization with a native binary you might need something like 11-23% utilization with a non-native binary. That's negligible and shouldn't result in fewer fps.
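
A toy frame-time model of that point (made-up numbers, assuming CPU and GPU work for a frame overlap so the slower side sets the pace):

```python
# Toy frame-time model: if the GPU is the bottleneck, a modest CPU-side
# translation overhead doesn't change the frame rate. All numbers are made up.

def fps(cpu_ms, gpu_ms):
    # Assume CPU and GPU work overlap, so the slower of the two sets the pace.
    return 1000.0 / max(cpu_ms, gpu_ms)

gpu_ms = 33.3                              # ~30 fps worth of GPU work (GPU bound)
cpu_native_ms = 6.0                        # light CPU load, like 10-20% utilization
cpu_translated_ms = cpu_native_ms * 1.15   # assume ~15% translation overhead

print(f"{fps(cpu_native_ms, gpu_ms):.1f} fps native")          # 30.0 fps
print(f"{fps(cpu_translated_ms, gpu_ms):.1f} fps translated")  # still 30.0 fps
```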

But what I wanted to point out is that emulation has a much bigger impact. Here are some examples from DOSBox, a real emulator, showing what host processor you would need to emulate a given processor.

Host CPU              | Emulated CPU
Pentium Pro 200 MHz   | 286 6 MHz
Pentium II 350 MHz    | 386SX 25 MHz
Pentium III 1.0 GHz   | 486 66 MHz
Pentium 4 3.0 GHz     | Pentium 133 MHz
Core 2 Duo 3.3 GHz    | Pentium II 300 MHz
Core i5 4xxx 4.0 GHz  | Pentium III 500 MHz
 
Now you're comparing a desktop part rated at 125 W TDP (over 200 W at turbo) to a 60 W laptop part. That's not an apples-to-apples comparison. You can't take something that can pull 200+ watts, with a 190 W turbo TDP, claim victory against something that uses 60 W on a bad day, and call it fair. That's not even mentioning that the 12700K is regularly overclocked, which skews the results, power consumption, and TDP even further. It's just not a proper comparison. Intel's best laptop CPU is equal to Apple's best laptop CPU, which was my point when you said Apple doesn't have anything that competes with a 12800H or a 12900HK; that is clearly incorrect.

I already said that Apple had the best within the low-power space, and that its issue was that it had nothing to offer if you moved up the food chain.

Since Apple has nothing beyond the general category of a 6800U or a 1260P, I'll compare it to what I like beyond that. Try reading a little closer.
 
Universal binaries are another topic. It's the same as with binaries that support different instruction set extensions (like SSE or AVX for x86): at some point you need code branches, which MIGHT lead to a performance loss, but well optimized it shouldn't be noticeable. Definitely not 10-15%.

With good JIT/AOT implementations you take the translation hit only ONCE at runtime, or even before, when the app gets loaded, where it doesn't matter at all.

But again, that's irrelevant. If the test was GPU bound and CPU utilization is very low, as Jiminez suggests, then even 10-15% makes no real difference. At 10-20% utilization with a native binary you might need something like 11-23% utilization with a non-native binary. That's negligible and shouldn't result in fewer fps.

But what I wanted to point out is that emulation has a much bigger impact. Here are some examples from DOSBox, a real emulator, showing what host processor you would need to emulate a given processor.

Host CPU              | Emulated CPU
Pentium Pro 200 MHz   | 286 6 MHz
Pentium II 350 MHz    | 386SX 25 MHz
Pentium III 1.0 GHz   | 486 66 MHz
Pentium 4 3.0 GHz     | Pentium 133 MHz
Core 2 Duo 3.3 GHz    | Pentium II 300 MHz
Core i5 4xxx 4.0 GHz  | Pentium III 500 MHz

How was Jiminez measuring that, over 4, 6, 8, or 16 logical cores? I would put money down that if you look at single-core utilization it's higher than 20%. Most games can't do a lot with a large number of cores/threads, especially first-person shooters, where almost all rendering is driven by user movements in the main game loop. It can be parallelized a little, but not to a massive extent.

Again, you just have to look at the FPS increase from the 5800X to the 5800X3D; it should be obvious that differences in opcode handling can have a big impact. The 5800X3D gets 20% more FPS in Shadow of the Tomb Raider despite having a lower clock; ops per second clearly matter on the CPU for that game, there is no arguing it. With the AOT argument you're assuming that Apple put a massive effort into optimizing those translated instructions to minimize the overhead. The odds are they didn't. I'm sure they did some, but this isn't Java or .NET, where if the performance isn't well optimized no one is going to use your platform. In Rosetta's case it's just a hold-you-over solution; it doesn't need to be optimized for peak performance and likely isn't. Rosetta 2's performance hasn't changed in two OS versions, and I haven't seen any evidence that macOS 13 is bringing improvements. My guess is Apple is done with it and has called it good enough, but I doubt it's fully optimized in the way typical VM AOT engines are.
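
As a quick illustration of that utilization point (hypothetical per-thread numbers for a 6-core/12-thread part like the 3600, not real measurements):

```python
# Aggregate CPU usage can hide a nearly saturated game thread.
# Hypothetical per-logical-core numbers for a 6-core / 12-thread CPU.
per_core_usage = [95, 40, 15, 10, 8, 5, 5, 4, 3, 3, 2, 2]  # percent, made up

aggregate = sum(per_core_usage) / len(per_core_usage)
print(f"aggregate usage: {aggregate:.0f}%")        # ~16% overall...
print(f"busiest thread: {max(per_core_usage)}%")   # ...while the main thread sits near 95%
```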
 
I already said that Apple had the best within the low-power space, and that its issue was that it had nothing to offer if you moved up the food chain.

Since Apple has nothing beyond the general category of a 6800U or a 1260P, I'll compare it to what I like beyond that. Try reading a little closer.

My point was that you can compare beyond it all you like; it doesn't prove anything when comparing those two class types. You essentially looked beyond it from only one side. Apple has almost never used desktop-class CPUs in their machines; they have on occasion, but rarely. Apple usually does laptop parts and server parts only, so until Apple comes out with whatever their Mac Pro processor will be (not MacBook Pro), you can't draw any conclusions on how well it will scale. What's worse is you're ignoring core count here when comparing multicore scores. The M1 Ultra will look like it's on par with a 12900K in the R23 scores, but it's not really.

Currently the M1/M2 don't exceed 3.2/3.4 GHz, while Intel's 125/241 W CPUs almost all boost into the high 4 GHz range (some into the 5 GHz range). If the M1/M2 could simply scale up 41% from a frequency point of view, the odds are they would match those same desktop processors you referenced. Maybe they can't scale that high, that's a possibility, but there are plenty of CPU designs out there that can go over 4 GHz, so I would be surprised if that was the limiting factor. That's why these comparisons need to be done against the same baseline class of parts instead of taking parts that are not meant to compete with each other.
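
The clock arithmetic behind that 41% figure, assuming (naively) linear scaling with frequency and roughly a 4.8 GHz desktop boost clock as the target, which is an assumption on my part:

```python
# Naive frequency-scaling arithmetic behind the "scale up 41%" remark.
# Real chips don't scale performance linearly with clock, so treat this as an
# upper-bound thought experiment, not a prediction.
m2_boost_ghz = 3.4
desktop_boost_ghz = 4.8   # assumed ballpark boost clock for the desktop parts cited

increase_needed = desktop_boost_ghz / m2_boost_ghz - 1
print(f"clock increase needed: {increase_needed:.0%}")   # ~41%
```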
 
That's irrelevant if the test was GPU bound. GPUs don't care about x86 or ARM. They have their own ISA. Which has to be translated by drivers anyway. No matter if it's AMD or Apple silicon.
This is a really good point. Shadow of the Tomb Raider was only running on integrated graphics, so this was almost entirely a test of integrated graphics performance. Looking at the video, the game was only averaging around 30fps on both the 6800U and the M2, so it's unlikely CPU core performance (or emulation for that matter) was playing much of a role here.

And really, we're talking about a roughly 2 fps difference, so the title of this article seems a bit click-baity. In reality, performance was rather similar between the two, and according to the video, "both were running at a similar level of power consumption".
 
This is a really good point. Shadow of the Tomb Raider was only running on integrated graphics, so this was almost entirely a test of integrated graphics performance. Looking at the video, the game was only averaging around 30fps on both the 6800U and the M2, so it's unlikely CPU core performance (or emulation for that matter) was playing much of a role here.

And really, we're talking about a roughly 2 fps difference, so the title of this article seems a bit click-baity. In reality, performance was rather similar between the two, and according to the video, "both were running at a similar level of power consumption".

The game is largely GPU bound, no one here is arguing that, but from the chart below the CPU is not inconsequential to the overall frame rate: there is a 20% difference between the top and bottom results, and an 8% difference if you factor out the 5800X3D and the bottom 5800X result. Also, it's clear from these stats (and others) that this game is largely written to utilize a single core on the CPU side. If you take that evidence along with the evidence that the FPS gain for native (non-Rosetta 2) builds is about 10%, it's easy to suspect that Rosetta 2 is costing some FPS, likely 2 to 5, making the overall difference closer to 4 to 8. Not Earth-shattering, but it shouldn't be thrown out when doing comparisons either.



[attached chart: Shadow of the Tomb Raider FPS at 1280 x 720]
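
A rough sketch of that estimate (every input here is an assumption taken from the discussion above, not a measurement):

```python
# Back-of-the-envelope estimate of what Rosetta 2 might be costing the M2.
# All inputs are assumptions pulled from the thread, not measurements.
m2_fps_under_rosetta = 30.0     # roughly what the video showed
native_uplift = 0.10            # assumed ~10% gain for a native ARM build
reported_lead_over_6800u = 2.0  # the ~2 fps gap from the article

rosetta_cost = m2_fps_under_rosetta * native_uplift        # ~3 fps left on the table
hypothetical_lead = reported_lead_over_6800u + rosetta_cost
print(f"estimated Rosetta 2 cost: ~{rosetta_cost:.0f} fps")
print(f"hypothetical native lead: ~{hypothetical_lead:.0f} fps")
```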
 