News Instinct MI300 Could be Fastest Product to $1 Billion in AMD History

Giroro

Splendid
So is there suddenly, like, a lot more demand for GPUs with no tensor cores that can't run CUDA?

Because it doesn't sound like AMD has solved either of those problems.
 

The Hardcard

Distinguished
Jul 25, 2014
33
42
18,560
So is there suddenly, like, a lot more demand for GPUs with no tensor cores that can't run CUDA?

Because it doesn't sound like AMD has solved either of those problems.
Read up on AMD Instinct. The previous generation already had matrix math cores, and the MI250X can dance with the A100 in most use cases.

The ROCm HIP stack has functional parity with CUDA. Worst case you have to modify your code some, but a lot of code will run unmodified. Sometimes you can even leave the CUDA headers intact.

AMD's remaining problem is that ROCm is still harder and more confusing to use. The documentation lags behind and there are more code-specific, GPU-specific quirks and issues. But those gaps are getting closed.
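
To show what that looks like in practice, here's a minimal sketch of a HIP vector-add (my own illustration, not AMD sample code). The kernel body and launch syntax are the same as in CUDA; only the header and the cuda* to hip* API prefixes change:

C++:
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

// Element-wise add; identical to the CUDA kernel, just compiled by hipcc.
__global__ void vec_add(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

    // Same calls as cudaMalloc/cudaMemcpy, just with the hip prefix.
    float *da, *db, *dc;
    hipMalloc((void**)&da, bytes);
    hipMalloc((void**)&db, bytes);
    hipMalloc((void**)&dc, bytes);
    hipMemcpy(da, ha.data(), bytes, hipMemcpyHostToDevice);
    hipMemcpy(db, hb.data(), bytes, hipMemcpyHostToDevice);

    // Triple-chevron kernel launches work in HIP just like in CUDA.
    vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    hipMemcpy(hc.data(), dc, bytes, hipMemcpyDeviceToHost);

    printf("c[0] = %.1f\n", hc[0]);  // expect 3.0
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}

Build it with hipcc and it targets an AMD GPU through ROCm; the hipify tools automate the same cuda-to-hip renaming on larger codebases, which is why a lot of ports end up being mostly mechanical.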
 
  • Like
Reactions: Avro Arrow
I was wondering when El Capitan would be coming online, because I think it was like three years ago when Cray was first talking about it. I remember how they wanted to use Radeon Instinct because of the great successes that the US government already had with the Frontier supercomputer. The capabilities of El Capitan sounded pretty exciting, but then we heard nothing about it for a long time. Supercomputers are the one topic that actually makes me somewhat interested in what is possible on the server side.
 
  • Like
Reactions: Order 66

Order 66

Grand Moff
Apr 13, 2023
2,165
909
2,570
I was wondering when El Capitan would be coming online, because I think it was like three years ago when Cray was first talking about it. I remember how they wanted to use Radeon Instinct because of the great successes that the US government already had with the Frontier supercomputer. The capabilities of El Capitan sounded pretty exciting, but then we heard nothing about it for a long time. Supercomputers are the one topic that actually makes me somewhat interested in what is possible on the server side.
I know that supercomputers are not meant for gaming, but I would like to see a supercomputer that is made for gaming, or a game coded to scale well on one. I wonder what the performance would be if games were coded to run on supercomputers, or if a supercomputer were designed for gaming. I bet Cyberpunk could be run at 8K max settings at 120 fps if it scaled well and was coded properly, or is this totally impossible to do?
 
  • Like
Reactions: Avro Arrow
So is there suddenly, like, a lot more demand for GPUs with no tensor cores that can't run CUDA?

Because it doesn't sound like AMD has solved either of those problems.
That's because you think that the entire computing world is PCs, when they're actually a tiny fraction of what's out there. The server/data centre side makes the PC market look like a bad joke in comparison. What, do you think these supercomputers are being used for Blender? Man, that's just hilarious!

Unfortunately for nVidia, computers of this scale are used for simulating extremely complex atomic and molecular interactions like protein folding and nuclear chain reactions. WTH is CUDA going to do for that? Literally nothing, as this list of the fastest supercomputers in the world shows us.

The top-5 fastest supercomputers in the world are the following:

1) Cray Frontier (AMD EPYC, Radeon Instinct), USA
2) Fujitsu Fugaku (Fujitsu A64FX ARM, no dedicated GPU cores), Japan
3) Cray LUMI (AMD EPYC, Radeon Instinct), Finland
4) Atos Leonardo (Intel Xeon, nVidia Ampere), Italy
5) IBM Summit (IBM POWER9, nVidia Tesla), USA

When El Capitan comes online, three of the top five supercomputers in the world (#1, #2 and #4) will be using Radeon GPUs, while the fastest supercomputer using nVidia GPUs will only be #5.

If CUDA actually made a difference in supercomputers, we wouldn't see Radeons in the fastest and most expensive supercomputers in the world, would we?

So, with almost all of the fastest supercomputers in the world using Radeons, just what "problems" are you referring to?
 
  • Like
Reactions: Order 66
I know that supercomputers are not meant for gaming, but I would like to see a supercomputer that is made for gaming, or a game coded to scale well on one. I wonder what the performance would be if games were coded to run on supercomputers, or if a supercomputer were designed for gaming. I bet Cyberpunk could be run at 8K max settings at 120 fps if it scaled well and was coded properly, or is this totally impossible to do?
That will never happen because games don't need a lot of cores, so what is a computer with hundreds of thousands of cores going to do with a game? Even the twelve- and sixteen-core Ryzen 9 X3D CPUs are showing issues with CCX selection, making them slower at gaming than the single-CCX R7-7800X3D.

Here's what a fast PC is like:
[image]

Here's what a fast supercomputer is like:
[image]

You really can't use one for the purposes of the other because they're completely different animals. Trying to use a supercomputer for gaming would be like trying to win a race with a Tar Sands Earth Mover. It's just not going to happen. ;)
 
  • Like
Reactions: Order 66

purpleduggy

Prominent
Apr 19, 2023
167
44
610
So is there suddenly, like, a lot more demand for GPUs with no tensor cores that can't run CUDA?

Because it doesn't sound like AMD has solved either of those problems.
CUDA is actually not as widely used as believed, even on Nvidia GPUs, which have the largest datacentre usage. CUDA might be popular amongst gamers who want to do things on their PCs, but in the datacentre, specifically for workloads that require non-proprietary APIs such as sensitive government data, weather models, GIS data, etc., there are many alternatives to choose from. Locking in to CUDA is like choosing an off-the-shelf component when the budget requires a custom component. The really big clients run entirely their own custom APIs.
 
  • Like
Reactions: Avro Arrow
What do you mean? I have heard nothing about it.
Well, consider that the R7-7800X3D out-performs the R9-7900X3D and R9-7950X3D in games. The reason is that only one CCX on the latter two CPUs has the 3D V-Cache, and sometimes the scheduler makes a mistake and sticks the game on the CCX that doesn't have it.

There also used to be problems with Threadrippers when gaming. Often, disabling one CCX resulted in improved gaming performance.
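
For anyone curious, you can actually see that lopsided cache layout from software. This is just a rough sketch of mine (Windows-only, nothing official) that walks the processor topology and prints each L3 slice along with the logical CPUs that share it; on a dual-CCD X3D chip one slice should report roughly 96 MB (the V-Cache CCD) and the other 32 MB:

C++:
#include <windows.h>
#include <cstdio>
#include <vector>

int main()
{
    // Ask Windows how big the cache-topology buffer needs to be, then fetch it.
    DWORD len = 0;
    GetLogicalProcessorInformationEx(RelationCache, nullptr, &len);
    std::vector<char> buf(len);
    auto* info = reinterpret_cast<PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX>(buf.data());
    if (!GetLogicalProcessorInformationEx(RelationCache, info, &len))
        return 1;

    // Walk the variable-length records and print every L3 slice.
    for (char* p = buf.data(); p < buf.data() + len; )
    {
        auto* rec = reinterpret_cast<PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX>(p);
        if (rec->Relationship == RelationCache && rec->Cache.Level == 3)
        {
            printf("L3 slice: %lu MB, shared by logical CPU mask 0x%llx\n",
                   rec->Cache.CacheSize / (1024 * 1024),
                   static_cast<unsigned long long>(rec->Cache.GroupMask.Mask));
        }
        p += rec->Size;
    }
    return 0;
}

That uneven split is exactly what trips the scheduler up; it has no way of knowing that a game would benefit from landing on the bigger slice.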
 
  • Like
Reactions: Order 66

Order 66

Grand Moff
Apr 13, 2023
2,165
909
2,570
Well, consider that the R7-7800X3D out-performs the R9-7900X3D and R9-7950X3D in games. The reason is that only one CCX on the latter two CPUs has the 3D V-Cache, and sometimes the scheduler makes a mistake and sticks the game on the CCX that doesn't have it.

There also used to be problems with Threadrippers when gaming. Often, disabling one CCX resulted in improved gaming performance.
Is there any workaround? I feel like there has to be some kind of third party tool that fixes this.
 
  • Like
Reactions: Avro Arrow

Order 66

Grand Moff
Apr 13, 2023
2,165
909
2,570
I think it is good that AMD's server GPUs are getting some recognition. I feel like everyone just assumes that AMD's server GPUs are worthless in comparison to Nvidia's. Heck, until about a year ago I didn't really know that AMD made server GPUs; I knew they had their Radeon Pro GPUs, but I had assumed they were only for workstations. I wonder how a Radeon Pro W7900 would perform in games (I know it is not meant for that, nor is it good value), considering that IIRC (I could be wrong) you can install the regular Radeon gaming drivers on them.
 
  • Like
Reactions: Avro Arrow
Is there any workaround? I feel like there has to be some kind of third party tool that fixes this.
There isn't. The problem was not enough interest. The number of people who use 12 and 16-core CPUs for gaming is astonishingly small. Those are productivity CPUs and I tore AMD a new one for releasing X3D versions of them because there was literally no point for those CPUs to exist.

Productivity CPUs are far more expensive than mainstream or gaming-oriented CPUs so gamers don't bother buying them.
 
  • Like
Reactions: Order 66

Order 66

Grand Moff
Apr 13, 2023
2,165
909
2,570
There isn't. The problem was not enough interest. The number of people who use 12 and 16-core CPUs for gaming is astonishingly small. Those are productivity CPUs and I tore AMD a new one for releasing X3D versions of them because there was literally no point for those CPUs to exist.

Productivity CPUs are far more expensive than mainstream or gaming-oriented CPUs so gamers don't bother buying them.
Surely it can't be that hard (for people who know how) for someone to make a quick workaround that essentially tells games to only use the CCX with the 3D cache when possible. Even 6 cores and 12 threads (I say that because if a game only used 1 of the 2 CCXs, the number of cores would be cut in half) should be enough for games with the 3D cache.
 
Surely it can't be that hard (for people who know how) for someone to make a quick workaround that essentially tells games to only use the CCX with the 3D cache when possible. Even 6 cores and 12 threads (I say that because if a game only used 1 of the 2 CCXs, the number of cores would be cut in half) should be enough for games with the 3D cache.
The problem is with the Windows scheduler itself. It's buggy, but in a way that makes no difference for 99% of CPUs and only matters for dual-CCX CPUs when gaming.
 
  • Like
Reactions: Order 66
Is there any workaround? I feel like there has to be some kind of third party tool that fixes this.
The problem is that when you suddenly shift work across the Infinity Fabric to the second CCD, the one without the X3D cache, you get hit with the latency penalty of the interconnect plus the penalty of not having access to the X3D cache. The increased latency can cause stuttering, and running on the cores without the cache causes sudden framerate drops. The only ways to fix this are to set the affinity of the programs that benefit from the X3D cache to just those cores, for Microsoft to update their scheduler to make better use of the CPU's resources, or for AMD to create an AGESA update that works better with the current Windows scheduler.
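
To illustrate the affinity route, here's a bare-bones Win32 sketch of mine (a toy, not a recommendation) that pins the current process to the first 16 logical CPUs, assuming the V-Cache CCD enumerates first, which is the usual layout on retail 7950X3D parts. Tools like Process Lasso, or AMD's chipset driver with its Game Bar integration, do roughly the same thing with more intelligence:

C++:
#include <windows.h>
#include <cstdio>

int main()
{
    // Logical CPUs 0-15: CCD0 with the 3D V-Cache (assuming SMT is enabled
    // and the cache CCD enumerates first, which is the usual layout).
    const DWORD_PTR vcacheCcdMask = 0xFFFF;

    if (!SetProcessAffinityMask(GetCurrentProcess(), vcacheCcdMask))
    {
        printf("SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    printf("Pinned to the V-Cache CCD (logical CPUs 0-15).\n");

    // A launcher would start the game from here so the child inherits the mask.
    return 0;
}

In practice you'd apply that kind of mask to the game's process (or just let the driver park the other CCD), but it shows how little "workaround" there really is beyond telling Windows which cores to use.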
 
  • Like
Reactions: Order 66