News AMD launches Ryzen 9 9950X3D and 9900X3D, claims 20% faster gaming performance than Intel’s flagship Arrow Lake processors

Admin

AMD announced its two flagship Ryzen 9000-series X3D processors here at CES 2025 in Las Vegas. The 16-core, 32-thread Ryzen 9 9950X3D leads the way, pairing the potent Zen 5 architecture with AMD’s game-boosting X3D technology for 128MB of L3 cache, all of which AMD says makes it the world’s best CPU for gaming and creator workloads.

AMD launches Ryzen 9 9950X3D and 9900X3D, claims 20% faster gaming performance than Intel’s flagship Arrow Lake processors : Read more
 
Alternate headline option:

"AMD launches Ryzen 9 9950X3D and 9900X3D, claims same gaming performance as 9800x3D"
Which is a great improvement from the previous generation where the lower end chip beat them in gaming performance.

I do more workstation work than gaming, so the 7950X3D made sense for me, but I certainly wouldn't mind if it also performed better in games.

New builders will not have to sacrifice gaming for productivity.
 
New builders will not have to sacrifice gaming for productivity.
They already don't. If someone is making their CPU choice based on gaming, they're a fool falling for marketing and nothing more. There's no real-world difference. Benchmarks have clearly shown that at 2K to 4K+ even a half-decade-old i5 shows only single-digit differences in framerates. Who's gaming at 1080p, where it actually makes a difference? And only with high-end, high-refresh monitors and GPUs; competitive gaming at those settings is a tiny niche, so it doesn't apply to the vast majority. If someone cares about gaming performance, they focus on the GPU, simple as that, plus a fast drive for loading.
 
I have only done a few AMD builds in modern times. I will most likely be doing one several months from now when things settle down. Of course, availability will be a problem. I will try to get the CPU as soon as possible, but I want to wait a bit on the motherboards (just in case there are issues, or better boards coming a little later).

This reminds me of when the Athlon 64 came out. Intel was not looking good for a while. Eventually they responded with the Core series and passed them again. This time it is looking grim for Intel. I hope they have something up their sleeve; we need them to compete.

I prefer Intel and Nvidia, but I won't hesitate to buy AMD if they have what I need. AMD is a great company!
 
To get me to upgrade my 7950X3D, they would have needed to add the cache to both CPU dies and ditch the requirement for software to pick which cores to utilize. Hopefully in the future they will be able to add the cache to both chiplets and make all cores equal.
Of course they will do this - next year, to get people to spend more money on upgrades. Everything we buy in the PC space is technically "incremental", with planned updates in the pipeline for years showcasing increases in speed, efficiency, etc. for those who want to upgrade yearly (which is most PC enthusiasts 🤣).

AMD really are on a roll at the moment, and no doubt they have the architecture to crush team Blue for many years to come... until the coin flips again in Intel's favour.
 
To get me to upgrade my 7950X3D, they would have needed to add the cache to both CPU dies and ditch the requirement for software to pick which cores to utilize. Hopefully in the future they will be able to add the cache to both chiplets and make all cores equal.
Why would you be thinking of upgrading your 7950X3D this soon anyway? If the 9950X3D were such a generational gain that it made a compelling upgrade over the 7950X3D, it would only mean the prior-gen CPU was a bad product, which in this case it is not.
I also highly doubt that putting 3D cache on both dies for the 950 line is a logical move, since it would not add much to gaming performance while drastically denting its productivity prowess. 8c/16t is more than enough for games right now, and if someone is paying $500+ for a CPU I'm pretty sure they aren't gaming at 1080p, unless they are a CS pro, in which case the 9800X3D would be the better option anyway.
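A rough way to see why the extra cache wouldn't buy much: L3 miss rates tend to fall only with roughly the square root of capacity (a common rule of thumb, not a Zen 5 measurement), so every extra megabyte helps less than the last - and the shipping 9950X3D keeps its extra 32MB on the second CCD anyway, so a game pinned to the V-cache CCD still only sees 96MB. A toy average-memory-access-time model, with purely illustrative latencies:

```python
# Toy AMAT model: illustrative numbers only, not measurements of any real chip.
# AMAT = hit_latency + miss_rate * miss_penalty, with the miss rate falling
# roughly with the square root of L3 capacity (a common rule of thumb).

def amat_ns(l3_mb, base_miss_rate=0.30, base_mb=32, hit_ns=10, dram_ns=80):
    """Average memory access time for a given effective L3 size."""
    miss_rate = base_miss_rate * (base_mb / l3_mb) ** 0.5
    return hit_ns + miss_rate * dram_ns

for size_mb in (32, 96, 128):
    print(f"{size_mb:4d} MB L3 -> ~{amat_ns(size_mb):.1f} ns average access")

# In this model, 32 -> 96 MB shaves ~10 ns off the average access,
# while 96 -> 128 MB shaves only ~2 ns: classic diminishing returns.
```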
 
I'd love to see data on how they've improved their chipset drivers to handle thread scheduling, and on their integration with developers - especially support for the Linux community, which has fallen behind Windows in general-purpose usage and support.

AMD is so close to switching me over to their platform. Benchmarks will tell, but I'm still not sold.
 
I wonder what the target population for this 16-core X3D CPU may be:
It's definitely NOT the gamers, as they already get virtually the same performance from the 8-core X3D variant, but at a much lower cost.
It's certainly also not the creators, as they will get even better performance in productivity/creativity tasks from the non-X3D variant of this chip - at a lower price.

So, the only people who might be interested in this 9950X3D chip are content creators who like to play heavy AAA games while their computer is rendering videos.
 
They already don't. If someone is making their CPU choice based on gaming, they're a fool falling for marketing and nothing more. There's no real-world difference. Benchmarks have clearly shown that at 2K to 4K+ even a half-decade-old i5 shows only single-digit differences in framerates. Who's gaming at 1080p, where it actually makes a difference? And only with high-end, high-refresh monitors and GPUs; competitive gaming at those settings is a tiny niche, so it doesn't apply to the vast majority. If someone cares about gaming performance, they focus on the GPU, simple as that, plus a fast drive for loading.
I haven't watched this in a while, so I don't remember the details, but it might be worth a watch.
View: https://youtu.be/98RR0FVQeqs?si=dMuzwYqJ1Al_t8Yj
 
I wonder what the target population for this 16-core X3D CPU may be:
It's definitely NOT the gamers, as they already get virtually the same performance from the 8-core X3D variant, but at a much lower cost.
It's certainly also not the creators, as they will get even better performance in productivity/creativity tasks from the non-X3D variant of this chip - at a lower price.

So, the only people who might be interested in this 9950X3D chip are content creators who like to play heavy AAA games while their computer is rendering videos.

I have no idea. I've tested with multiple AAA games running while rendering/encoding and my 9950X/4090 didn't even burp.

Already running at capped fps... I don't see any X3D benefit.
 
To get me to upgrade my 7950X3D, they would have needed to add the cache to both CPU dies and ditch the requirement for software to pick which cores to utilize. Hopefully in the future they will be able to add the cache to both chiplets and make all cores equal.
100%. I was hoping for dual-CCD cache stacking. While I mostly game, I was hoping the 9950X3D brought something more than just another 9800X3D with more cores.

I would have been happy to pay the premium for the 9950X3D if I knew I was going to get better gaming performance for the money.

I don't know much about cache - is there a point of diminishing returns? If the 9800X3D has 96MB of L3 cache and the 9950X3D has 128MB of L3 and could use the full amount across both CCDs, would that result in higher frame rates?
 
So, the only people who might be interested in this 9950X3D chip are content creators who like to play heavy AAA games while their computer is rendering videos.
Why do such people need this? It would be easier for them to have two separate systems connected through a switch to one monitor, keyboard and mouse (if needed - although gaming mice are poorly suited to everyday tasks due to their high sensitivity, and mechanical keyboards are too noisy for normal work), each completely independent in its intended use, since they have the money for all this. Running heavy tasks simultaneously on one machine will never deliver both the background computation and smooth gameplay - the architecture of the buses and cores simply does not allow it.
 
100%. I was hoping for dual-CCD cache stacking. While I mostly game, I was hoping the 9950X3D brought something more than just another 9800X3D with more cores.

I would have been happy to pay the premium for the 9950X3D if I knew I was going to get better gaming performance for the money.

I don't know much about cache - is there a point of diminishing returns? If the 9800X3D has 96MB of L3 cache and the 9950X3D has 128MB of L3 and could use the full amount across both CCDs, would that result in higher frame rates?
The only reason I would want the cache to be on both CCDs is that it would eliminate the risk of a process being executed on the wrong CCD. Since placing the cache under the processor lets it run at the same speed as a non-cached variant, having the cache on both CCDs would make every core fully interchangeable, and there would be no need to steer threads to specific cores.

Anyway, I am hoping they figure out how to add the cache across multiple CCDs and make it work in the future. Either that, or make CCDs with more than 8 cores.
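For context, this is roughly what that steering amounts to today. A minimal Linux-only sketch using Python's os.sched_setaffinity - the CPU numbers are purely an example and the real V-cache CCD mapping has to be verified per system (on Windows this is roughly what AMD's chipset driver and the Game Bar game detection handle behind the scenes):

```python
import os

# Hypothetical mapping: assume the V-cache CCD's 16 hardware threads are
# logical CPUs 0-7 plus their SMT siblings 16-23. This is a common Linux
# enumeration but it is an assumption - it MUST be checked on each system.
VCACHE_CPUS = set(range(0, 8)) | set(range(16, 24))

def pin_to_vcache_ccd(pid: int = 0) -> None:
    """Restrict a process (0 = the current one) to the V-cache CCD."""
    os.sched_setaffinity(pid, VCACHE_CPUS)

if __name__ == "__main__":
    pin_to_vcache_ccd()
    print("Now allowed on CPUs:", sorted(os.sched_getaffinity(0)))
```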
 
To get me to upgrade my 7950X3D, they would have needed to add the cache to both CPU dies and ditch the requirement for software to pick which cores to utilize. Hopefully in the future they will be able to add the cache to both chiplets and make all cores equal.
I'm afraid the need to choose CCDs won't go away.

Nor is it actually caused by the presence of the V-cache; it's just exacerbated by it.

Caches exploit locality to cut down on the latency of RAM access, and that locality is significantly reduced as you go off-CCD.

That data can be found and used from the caches of other CCDs is great, and one of the reasons EPYCs are currently doing so well with up to 16 CCDs (with or without V-cache). But code that is not aware of the CCD topology and does not manage it consciously is likely to do much worse than code that does (though most hyperscaler workloads are scaled-in, not HPC scale-out, and thus do not need to exploit locality as much).
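To make "aware of the CCD topology" concrete: on Linux the grouping can be read straight out of sysfs, since logical CPUs that report the same shared_cpu_list for their last-level cache sit behind the same L3, i.e. on the same CCD. A quick sketch (Linux only, and assuming index3 is the L3, as it is on current Zen parts):

```python
import glob
import re
from collections import defaultdict

def l3_domains():
    """Group logical CPUs by the L3 cache they share (roughly: one group per CCD)."""
    groups = defaultdict(set)
    paths = glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cache/index3/shared_cpu_list")
    for path in paths:
        cpu = int(re.search(r"cpu(\d+)/", path).group(1))
        with open(path) as f:
            siblings = f.read().strip()   # e.g. "0-7,16-23"
        groups[siblings].add(cpu)
    return sorted(groups.values(), key=min)

if __name__ == "__main__":
    for i, cpus in enumerate(l3_domains()):
        print(f"L3 domain {i}: CPUs {sorted(cpus)}")
```

A scheduler (or a game's own thread-pool setup) that keeps co-operating threads inside one of these groups avoids the CCD-to-CCD cliff described above.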

Now in the HPC arena, where those EPYC 9684X (1152MB of L3 cache across 12 CCDs on Zen 4) or their Zen 5 successors will be used, developers tune their software very carefully to take the greatest advantage of those dearly paid resources. And the presence of V-cache only means they have to tune more to get more, because getting it wrong means shuffling more data between CCDs than getting it from RAM.

Game developers won't likely invest a similar effort for a niche that is extremely small and not likely to pay extra money for that effort.

So if your game (or OS) were to just randomly choose CPU cores, assuming they are all the same and it doesn't matter, game performance will suffer - and game performance is mostly about "fluidity", about consistent and predictable "real-time" reactivity.

At any scheduling time slice it could choose a secondary hyperthread over a real CPU core, or it could hit an "efficiency" core, or it could choose a core that doesn't have the data (or the code) you're trying to execute in any of its three levels of cache.

And if you thought this was bad, consider the additional complexity of energy-constrained computing, as on mobile, where putting extra load on a currently unused core might cause all active cores to drop their clocks to stay within thermal constraints, while the other CCD might still be cool about it...

AMD knows the reality of gaming: they sell tons of 8-core APUs for consoles, still the most important target for game developers, even if those developers also support PC gaming. And those console CPUs tend to be monolithic 8-core parts, relatively weak compared to their desktop counterparts and without any turbo complexity built in, because that only makes life more difficult. They know that GPUs dictate most of the game performance, but also the sort of game you can actually sell. There could be games that were never written because they would require 128 cores to run.

Now don't get me wrong: I also would have gone and likely bought the dual V-cache CCD, but that's because my main focus isn't actually on gaming but technical architecture. And the peak clock loss from the V-cache seems to be much less in this generation because they placed it underneath the CPU. Perhaps it could even be zero, if you could selectively turn off the V-cache for workloads that won't profit from it.

But from my practical experience with my Ryzen 9 7950X3D (vs 5950X, 5800X, 5800X3D, 5700U, 7840H, 7945HX and various others, which I also own), I believe their choice is wise and right for the vast majority: chips depend on vast scales to be affordable at all, so bespoke parts like a dual V-cache desktop part simply won't reach retail. Perhaps some Youtuber will get it done anyway or AMD might even sell something like that as an EPYC.

The clock loss from the V-cache on the 7950X3D really never comes into play (outside gaming) on sustained computational workloads: once all cores on the V-cache-less CCD go full throttle, thermal limits drop clocks below what the V-cache CCD could sustain even if it didn't carry the extra cache. So it doesn't really "suffer" from having lower peak clocks; nothing will run at 16x 5.7 GHz, except NOP perhaps. That theoretical disadvantage should be even less noticeable on the 9950X3D, perhaps not even measurable in a synthetic benchmark.

Yet core game workloads won't ever stray beyond the 8 V-cache cores, if they want to optimize fluidity. Sometimes they need the OS's help in making the right choice, but that's also needed with any dual- (or more-) CCD CPU and workloads that disregard topology.

Ironically, it's an area where Intel's monolithic CPUs are making a comeback. I got into Haswell and Broadwell Xeons when they became cheap enough to afford. And I noticed that I'm not the only one: there are a lot of really cheap new "gamer" boards out there supporting LGA 2011 CPUs, which have flooded the recycled-parts market in recent years.

Turns out these are becoming quite good at gaming even with their relatively modest peak clocks, as games evolve to take advantage of many cores. I've observed some games really going out and loading all cores on these machines, and a Xeon E5-2696 v4 has 55MB of L3, which it puts to good use, while the quad-channel DDR4 memory subsystem also isn't that bad compared to dual-channel DDR5. One of these days I hope to get bored enough to put my RTX 4090 into that Xeon system to see how much performance actually suffers versus, say, a Ryzen 7 5800X3D, which offers very nearly the same multithreaded performance (and near-identical CPU wattage) with only 8 cores, but much better IPC and higher clocks.

And I bought that Broadwell 22-core Xeon E5-2696 v4 for €160, while my first Haswell 18-core E5-2696 v3 was still €700 (vs. around €5000 MSRP).

Of course these were extremely costly chips to make, and they became attractive for gaming only after the hyperscalers moved on.
 
Game developers won't likely invest a similar effort for a niche that is extremely small and not likely to pay extra money for that effort.
But with cache becoming king in games now, why would game developers not invest a lot into more cache-oriented games?

Nobody really cares about price these days (they say they do).

People with higher income will always pay WHATEVER Nvidia charges for their overpriced <Mod Edit> or high-end, nothing-beats-it GPU.

With Intel licking its wounds after basically killing itself, AMD has free rein to charge what it likes.

To me cache is king in gaming, so why not invest!
 
Yet core game workloads won't ever stray beyond the 8 V-cache cores,
I remember similar statements about 2 cores, then 4, then 8, and now about 16. But the reality is that x86 has a huge memory-bandwidth gap. That is why further increases in core counts in the mass segment will be useless without simultaneously increasing, many times over, the bandwidth of the RAM shared between all devices and software.
Caches can do little when code becomes ever more voluminous and complex. And really complex neural networks (to make the same banal gameplay almost indistinguishable from reality) require hundreds of terabytes of RAM with very high bandwidth.

Of course, people will play primitive games on smartphones. And what else can they do if we have reached a silicon dead end and buying more and more advanced hardware (with ever less advantage over the old one) is becoming an increasingly difficult financial task even for the "middle class" of developed countries?
 
I wonder what the target population for this 16-core X3D CPU may be:
It's definitely NOT the gamers, as they already get virtually the same performance from the 8-core X3D variant, but at a much lower cost.
It's certainly also not the creators, as they will get even better performance in productivity/creativity tasks from the non-X3D variant of this chip - at a lower price.

So, the only people who might be interested in this 9950X3D chip are content creators who like to play heavy AAA games while their computer is rendering videos.
Engineers who want to tune their code for the big EPYC HPC iron with a dozen V-cache CCDs might also be interested.

My main interest would be that the clock loss from V-cache seems significantly lower with Zen 5, so there is even less of a largely theoretical penalty.

I've seen manufacturing and parts cost estimates for the first generation V-cache around €20, much lower than I expected. If that were to be the extra cost at retail, I can't see myself not jumping for it, given I'm already spending around €500 for a CPU anyway.

Still, that niche is too small for AMD to service: their focus on scale is what enabled them to break Intel's wall of a product for every conceivable niche.
 
I remember similar statements about 2 cores, then 4, then 8, and now about 16. But the reality is that x86 has a huge memory-bandwidth gap. That is why further increases in core counts in the mass segment will be useless without simultaneously increasing, many times over, the bandwidth of the RAM shared between all devices and software.
Caches can do little when code becomes ever more voluminous and complex. And really complex neural networks (to make the same banal gameplay almost indistinguishable from reality) require hundreds of terabytes of RAM with very high bandwidth.

Of course, people will play primitive games on smartphones. And what else can they do if we have reached a silicon dead end and buying more and more advanced hardware (with ever less advantage over the old one) is becoming an increasingly difficult financial task even for the "middle class" of developed countries?
Note that I didn't say that games won't stray beyond 8 cores, but that they won't stray beyond the 8 V-cache cores (if they are either topology-aware themselves or "managed" in their topology by the OS or an OS gaming helper).

As I also said, I've seen games actually using and profiting even from the (monolithic) 22-core Xeon, when their core engine had been designed for full multi-core operation. But unless games understand and design for the CCD-to-CCD cliff, they are likely to suffer if they assume all cores are equal.

The memory gap isn't x86-specific, either, as twelve-channel DDR5-6400 RAM on EPYCs will prove: you can buy a ton of bandwidth on x86, but very few seem ready to pay the price. It isn't easy to push bandwidth; "normal" DDR, HBM, GDDR and on-package LPDDR stacks each owe their existence to a different compromise around capacity, bandwidth, latency and price that fits a given use case better, and none will fit all.

I also see code as being the far smaller "offender" in AI and gaming workloads, especially since that is typically unmodifiable and thus can be cached on every CCD without consistency issues and the need for snooping.

It's the data that's growing in spades (and that typically requires consistency against multi-core updates, and thus snooping and cache-line reloads), although not quite to hundreds of terabytes just yet: hundreds of terabytes still take hundreds of seconds for a single sequential pass at current GDDR7 or even HBM data rates of 1-4 TB/s, which means LLM token rates that are quite unacceptable. When HP promised practically infinite amounts of RAM at cache speeds and HDD economy via their memristor 10 years ago (which you might have noticed never came to pass), that mostly meant that CPUs were simply no longer capable of doing anything useful with that much memory, and that processing had to be moved into the RAM itself. That hasn't become economically viable so far either, even though quite a few have tried.
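Both of those round numbers are easy to sanity-check with the figures already quoted (nothing here is a measurement, just arithmetic on the stated rates):

```python
# Back-of-envelope checks on the bandwidth figures quoted above.

# Twelve-channel DDR5-6400 (EPYC): 64-bit channels at 6400 MT/s.
channels, transfers_per_s, bytes_per_transfer = 12, 6.4e9, 8
epyc_bw = channels * transfers_per_s * bytes_per_transfer
print(f"12-channel DDR5-6400 peak: {epyc_bw / 1e9:.0f} GB/s")   # ~614 GB/s

# One sequential pass over "hundreds of terabytes" at HBM/GDDR7-class rates.
data_bytes = 200e12                    # 200 TB, purely illustrative
for bw in (1e12, 4e12):                # the 1-4 TB/s range mentioned above
    print(f"{data_bytes / 1e12:.0f} TB at {bw / 1e12:.0f} TB/s "
          f"-> {data_bytes / bw:.0f} s per pass")
```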

And language models actually run out of meaningful content these days; I remember G42 having to translate English content into Arabic just to get beyond 7B parameters: there simply wasn't enough Arabic text to gobble up.

There are nice charts out there showing that your "middle class of developed countries" is fading into irrelevance in gaming market revenues: it's all mobile, and I'd say that gaming is actually being replaced by TikTok and the like - user-originated and AI-transformed "original" content optimised for screen addiction at the lowest possible price. What might have been your dominant form of entertainment is really just a blip in the curve of how primates amuse themselves.

Here is another shameless plug for an article I wrote a few years ago on that subject, which was sponsored by the European Commission.
 