
News AMD’s Ryzen AI 300-series APUs could offer graphics performance on a par with low-end discrete GPUs

Probably if they were to add anything it would be more Ai/NPU, not iGPU, and they aren't doing that initially either. Since both have better add-in options, a good iGPU only matters to a small subset of buyers in that market. 2 CUs for the desktop UI makes sense, especially if the add-in card is doing GPGPU/Ai workloads.
Beyond that, the majority of people buying in that segment would likely prefer to save a buck or two or have higher stable clocks if they had their choice. Edit: and also not impact memory resources.

Little secret, there is no difference between an iGPU and an "NPU" processing wise. It's the exact same vector processing doing the same calculations, only instead of running the numbers against pixel arrays, they are doing it against training data arrays that represent words, pictures or other abstracted data blocks. The silicon is then optimized by adding more cache or expanding the memory bus to truly ridiculous levels, with equally ridiculous prices.
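
To make the "same math, different data" claim concrete, here is a minimal Python sketch. The function name `dot` and all of the sample arrays are made up for illustration, and real GPUs and NPUs run this multiply-accumulate across many hardware lanes rather than in a loop. It only shows that the arithmetic is identical; it says nothing about how much silicon each design spends on it.

```python
# Illustrative only: one multiply-accumulate routine applied to two kinds of data.
# `dot`, `pixel_row`, `blur_kernel`, `embedding`, and `weights` are hypothetical names.

def dot(a, b):
    """Multiply element-wise and accumulate: the core vector operation."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

# Graphics-style use: weight three neighbouring pixel intensities with a blur kernel.
pixel_row = [10.0, 20.0, 30.0]
blur_kernel = [0.25, 0.5, 0.25]
blurred = dot(pixel_row, blur_kernel)

# ML-style use: weight a small embedding vector with learned weights.
embedding = [1.0, -2.0, 0.5]
weights = [0.5, 0.25, 2.0]
activation = dot(embedding, weights)

print(blurred, activation)
```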

Basic class on Vector processing (MMX, VLIW and later nVidia GPU)
https://course.ece.cmu.edu/~ece740/...seth-740-fall13-module5.1-simd-vector-gpu.pdf

nVidia's investment in CUDA and GPU-accelerated SIMD/SIMT vector programming is what enabled them to instantly take over the AI space. The "miracle" of AI was when someone figured out a way to run data pattern analysis models through a vector GPU instead of a CPU.
 
Little secret, there is no difference between an iGPU and an "NPU" processing wise.

No duh, math is math, which is why VPUs have been doing simple bulk workloads for decades, and well before nVidia & CUDA, even before F@H started applying BrookGPU stream computing on ATi cards then NV in what became the first GPGPU boom.

Where NPUs differ from GPUs is silicon budget, which was The POINT when talking about decisions regarding what is/isn't included in the die.
NPUs don't need a lot of the extraneous stuff of GPUs like TMUs & ROPs, RT, codecs, etc. They can stick primarily to the compute, so they can be much smaller, just like GPUs can be more efficient (budget wise) than more generalized CPUs too. FOCUS.

I could provide you a similar link to BrookGPU and CTM discussions (even mine here and @ B3D) from before you joined THG. Instead, here's AMD's whitepaper for RDNA detailing how their design choices don't just use "brute force compute" in RDNA because of all the extra: https://www.amd.com/system/files/documents/rdna-whitepaper.pdf

Even assuming an overly generous doubling of the previous 780M's TOPS to 32 in the 890M (similar to RX 7600S performance), that would put it at 60% of the compute at two to three times the die space....




... which is why it makes more sense to add the NPU than the iGPU to a space-limited die (even before considering the difference in heat). It makes even more sense when it would be the only one in the market vs the myriad of options for integrated graphics in the desktop that can do a few TOPS and still rely on dGPUs for graphics or Ai.

The only reason I even mentioned the 2CUs was the typical desktop CPU's iGPU role to run the UI, obviously that confused you as to the point.
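
For anyone who wants the die-budget argument as arithmetic, here is a rough Python sketch using only the estimate from this post (iGPU at about 60% of the NPU's compute while taking two to three times the area). The helper name `npu_density_advantage` is made up, and the inputs are forum estimates, not measured die sizes.

```python
# Back-of-the-envelope: how much denser is the NPU's AI throughput per unit of die area?
# Inputs come from the estimate in the post; `npu_density_advantage` is a hypothetical helper.

def npu_density_advantage(igpu_compute_fraction, igpu_area_multiple):
    """Return NPU (compute/area) divided by iGPU (compute/area).

    igpu_compute_fraction: iGPU AI throughput as a fraction of the NPU's (0.6 = 60%).
    igpu_area_multiple: iGPU die area as a multiple of the NPU's (2 = twice as large).
    """
    if igpu_compute_fraction <= 0 or igpu_area_multiple <= 0:
        raise ValueError("inputs must be positive")
    # The NPU's own density is normalised to 1.0, so the iGPU's is a simple ratio.
    igpu_density = igpu_compute_fraction / igpu_area_multiple
    return 1.0 / igpu_density

low = npu_density_advantage(0.6, 2.0)   # iGPU twice the NPU's area
high = npu_density_advantage(0.6, 3.0)  # iGPU three times the NPU's area
print(f"NPU compute per area: {low:.1f}x to {high:.1f}x the iGPU's")
```

If those estimates hold, that works out to roughly 3.3x to 5x the Ai throughput per unit of area, which is the whole case for spending a constrained die budget on the NPU.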
 
Looks like the TOPS were a bit generous if GPD's latest numbers are accurate: 80 TOPS total system (most numbers from Computex were higher due to the RTX 4070 being included).

As it relates to the original news post, a much higher Time Spy result of 4221 was posted in GPD's material, against a reference 780M's 3218, which puts it significantly further ahead of the RTX 2050, in about RX 5500/6500 territory for that specific test.

GPD positions it as between the 2050 & 3050, which is technically true, but as much as it leads the 2050 by a large margin, it also trails the 3050 by a similarly large margin based on that specific test. Actual games will vary of course.
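
As a quick sanity check on the two Time Spy scores quoted above, here is a small Python sketch of the relative uplift. The helper name `percent_uplift` is made up for illustration, and the scores are the vendor-posted figures from this post, not independent results.

```python
# Relative uplift between the two Time Spy scores quoted in the post.
# `percent_uplift` is a hypothetical helper name.

def percent_uplift(new_score, baseline_score):
    """Return how far new_score is ahead of baseline_score, in percent."""
    if baseline_score <= 0:
        raise ValueError("baseline must be positive")
    return (new_score / baseline_score - 1.0) * 100.0

uplift = percent_uplift(4221, 3218)
print(f"Quoted result vs reference 780M in Time Spy: +{uplift:.1f}%")
```

That comes out to a bit over 31% ahead of the reference 780M in that one synthetic test.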

https://www.notebookcheck.net/AMD-R...Ryzen-9-7950X-in-Cinebench-2024.849515.0.html

https://mp.weixin.qq.com/s/tN4tCIZVQjXywTr4v3qYbQ
 
When will you people wake up? They literally say this every year. Literally. Every. Single. Year.

Low end GPUs will just get faster.
 
... which is why it makes more sense to add the NPU than the iGPU to a space-limited die (even before considering the difference in heat). It makes even more sense when it would be the only one in the market vs the myriad of options for integrated graphics in the desktop that can do a few TOPS and still rely on dGPUs for graphics or Ai.
There ... is... no ... difference...

This reminds me of the old GeForce vs Quadro debate, where people insisted the Quadro silicon was special and "optimized for professional workloads", up until someone figured out how to reflash the BIOS and change a GeForce into a Quadro.
 
There ... is... no ... difference...

There is a difference, but obviously you don't know enough about architecture to understand it, even when it's spelled out for you showing what areas aren't needed in an NPU, also showing the physical difference in size resulting from that. If you can't understand that, then showing you the added changes in XDNA2's layout vs the previous generation is pointless.

The whole idea of dedicated silicon is that they stripped out the extra that isn't needed for just the NPU's Ai workloads.

This reminds me of the old GeForce vs Quadro debate...

Nothing like it, because BIOS & ID hacks don't change the transistor layout from one to the other, nor suddenly make that bigger iGPU equal the NPU's TOPS performance in Strix.
 
Some recently spotted listings on Best Buy hint that the upcoming Ryzen AI 300 laptops are going to launch on the 15th of July, 2024.

That's a pretty good MSRP for a 16" 4K OLED touchscreen with that level of power. The i7 & i9 + 4070 ProArt equivalents are about $700/900+ more at MSRP (but can be found at mild discounts nowadays, likely deeper once these launch).

I just wish ASUS' OLEDs were a bit better (and didn't get worse as brightness increased). It's like they traded quality for speed, at a time when they finally have competition from mini-LEDs, but each has its drawbacks.
 
There is a difference, but obviously you don't know enough about architecture to understand it, even when it's spelled out for you showing what areas aren't needed in an NPU, also showing the physical difference in size resulting from that. If you can't understand that, then showing you the added changes in XDNA2's layout vs the previous generation is pointless.

Now you're getting insulting.

There ... is ... no ... difference in GPU vs "totally-not-a-GPU-but-really-a-GPU"

The silicon is then optimized by adding more cache or expanding the memory bus to truly ridiculously levels, with equally ridiculous prices.

The granddaddy of all this AI-all-the-things nonsense, the H100.


Tap into exceptional performance, scalability, and security for every workload with the NVIDIA H100 Tensor Core GPU

The first use of these was for enabling media acceleration in datacenters. We bought a bunch of these to help out our Virtual Desktop Infrastructure (VDI), because doing all that stuff on the CPU was just terrible. This was before the AI craze went nuts and nVidia was marketing them as ways to increase your datacenter production.
 
Now you're getting insulting.

As insulting as your first post? Little secret... 🙄
AGAIN, while the base vector processors may be similar, an NPU doesn't need a bunch of things the iGPU requires, like much of the RBEs. So if it's a dedicated chunk just for Ai, then you do away with the parts you don't need.

If your understanding is limited to nVidia (and limited to their history too obviously) then you will always see them as the same, because that's their path, but there are MANY different models.

As a counter example to yours, the AMD MI300 has 0 ROPs, because they are not needed. Same with the NPU here.

If they are the same as you say, then the simple ask of you (AGAIN) is to explain why the Strix Point NPU is about 1/3 to 1/2 the size of the iGPU yet is more than twice as performant for the required Ai task?

I'm talking about making use of transistor count/space, and you seem to be completely missing what the transistors do, remaining stuck at the PCB level of thinking that everything is the same, as if the various components don't play a role in making them GPUs.
 
Another area where the NPU has been trimmed down at the processing level is that they've done away with FP32 & possibly even INT32 support in the AIE-ML tile as a transistor saving, which is why they also play up BFloat16 support and use cint16/32 to emulate greater precision.

Which would again be removing the components/transistors they don't feel they need in the NPU to match typical Ai workloads. So they can make that transistor budget as effective as possible.
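
For context on why BFloat16 is a cheap substitute for FP32, here is a minimal Python sketch of the format itself: bfloat16 is simply the top 16 bits of an IEEE-754 float32 (1 sign bit, 8 exponent bits, 7 mantissa bits), so it keeps FP32's range and gives up precision. The helper names are made up, this version truncates instead of rounding to nearest, and it is not a description of how the AIE-ML tile implements anything.

```python
import struct

# Sketch of the bfloat16 format: the upper half of a float32 bit pattern.
# `float32_to_bfloat16_bits` and `bfloat16_bits_to_float` are hypothetical helper names.

def float32_to_bfloat16_bits(value):
    """Return the 16-bit bfloat16 pattern for value, truncating the low mantissa bits."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 >> 16

def bfloat16_bits_to_float(bits16):
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value

original = 3.14159
bits = float32_to_bfloat16_bits(original)
restored = bfloat16_bits_to_float(bits)
print(f"{original} -> 0x{bits:04X} -> {restored}")
```

The round trip loses everything past the first couple of decimal digits, which is usually tolerable for typical Ai workloads and is part of what makes the narrower datapath a reasonable trade.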

We'll see if this is indeed the case once they are out in the wild, but if an AIEv2 tile is the same as an AI tile in the NPU (the layout seems closer to v2 than to AIE v1), then this may be the case, according to AMD's own white papers on the AIE-ML architecture released earlier this month.

https://docs.amd.com/r/en-US/am020-versal-aie-ml/Key-Differences-between-AI-Engine-and-AIE-ML

https://docs.amd.com/r/en-US/am020-versal-aie-ml/AIE-ML-Array-Hierarchy

[Image: AMD Computex 2024 XDNA 2 slide]
 
..

It's the same silicon...
You realize that the "silicon" you refer to is broken up into a bunch of different highly specialized portions, right? An ALU is not the same thing as a CU, yet they are on "the same silicon..." @KnightShadey is telling you that an NPU is a highly specialized portion of silicon that excludes any part of the architecture usually afforded to a GPU to get more processing power for AI workloads.

Think of an NPU as an F1 car for AI processing and a GPU as a BMW M8 Competition. An F1 car is a highly specific kind of vehicle, meant for taking turns at high speed with tons of downforce, whereas an M8 Competition is a race car of sorts but has a much more generalized use case, like reliably getting you to work in the morning, and you should not expect it to keep up with an F1 car on the track. The F1 car shed any extraneous weight and parts that did not contribute to making it go as fast as possible on a track, just as an NPU does versus a GPU. An M8 Competition has the amenities that allow it to fill a broader role as a vehicle. There are not many F1 cars zooming around in the morning to take you to work, because they are highly specialized vehicles you would only call on if you have the kind of work an F1 car excels at.
 
You realize that the "silicon" you refer to is broken up into a bunch of different highly specialized portions, right? An ALU is not the same thing as a CU, yet they are on "the same silicon..." @KnightShadey is telling you that an NPU is a highly specialized portion of silicon that excludes any part of the architecture usually afforded to a GPU to get more processing power for AI workloads.

The "NPU" is just a new name for GPU but with a 10x price tag increase. We have several in our datacenters we purchased for VDI desktop acceleration, prior to the renaming schemes. Was a really funny sales call we had the other month when they discussed how fortunate we were for buying those "AI accelerators". AI training algorithms are just regular vector instructions, mostly 16-bit mixed in with some 32-bit depending on the model. It's all CUDA style vector programs that get loaded into the GPU and crunch through massive quantities of data. We are used to thinking of that data in relation to flashy graphics on a screen, but could just as easily be Ethereum hashing or data relational hashing. It's the exact same processing profile.

Datacenter GPUs already have much larger amounts of memory and wider memory buses.

But it's ok, I'm positive the crystal infused blockchain neural plasma processing warp cores will do everything you think they will.
 
You must have little to no understanding of how much silicon is dedicated to specific types of tasks in any given chunk of silicon. If the same silicon area (mm^2) were dedicated to an NPU as to a GPU, the NPU would be multiple times faster than the GPU at a given AI workload, because an NPU sloughs off the fat of everything it doesn't need for that specific type of workload. You are correct that a GPU can easily do AI workloads, but so-called "NPUs" do them either faster or more energy efficiently for the amount of silicon dedicated to said GPU or NPU. If you want an NPU to do anything other than an AI task, either it can't or a GPU is better at it. Your mind is so poisoned by specific marketing that it is leaking into your understanding of the underlying technologies.
 
But it's ok, I'm positive the crystal infused blockchain neural plasma processing warp cores will do everything you think they will.
Now, who's being insulting ? 🤨

Helper was trying to make it easier for you as you seem to be incapable of divorcing function from form, and sticking to a simplistic view of A=B.

We also have a mix (nV first, now some AMD) at the telecom company I work for, and have spoken here many times about our interest in Strix Halo (vs 7945HX+40x0) for field use, however that's all irrelevant to this discussion.
You're using your happenstance proximity to the datacenter as a justification for supercilious comments instead of well constructed arguments to defend your simplistic position that a widget is a widget.

If you're as well informed as you would have people believe "because we have them in our datacenters", then...
AGAIN, explain the simple question of how the NPU portion of the die can provide twice as much compute throughput as the iGPU segment at half the die space?
Let me simplify it further for you with an equation: if A=B as you claim, then why is there that difference that amounts to 2A=B/2 ? 🧐
 
@KnightShadey is telling you that an NPU is a highly specialized portion of silicon that excludes any part of the architecture usually afforded to a GPU to get more processing power for AI workloads.

Think of an NPU like an F1 car for AI processing versus a GPU which would be like a BMW M8 competition.

Thanks for trying, he seems incapable of getting past the surface view of these and confuses function with form.

I was going to use a similar analogy using: NPU=Race Car , GPU=SUV , and CPU = 8x8 Huge RV (like the Mercedes).

While they all may be powered by a fundamentally similar V8, the NPU race car is purpose-built to go fast under specific restricted conditions, the GPU SUV has more flexibility and can do more tasks under a wider variety of conditions at good speeds, and the CPU RV has just about everything... including the kitchen sink, to do even more tasks under more varied conditions, but much slower.

And the H100 and MI300 are Jet-powered Semis at the fair.... both seen here doing music mashups (like 99 problems by Beach Boys) for There I Ruined it https://www.youtube.com/c/thereiruinedit ... 🫠

 
@palladin9479 is a highly knowledgeable person; however, he seems to often get caught up in the labels of things and then puts whatever that is in the 'I don't care' section. I can acknowledge that AI marketing is a plague on the IT and consumer electronics sector while also caring to look at least deep enough to understand some of the nuts and bolts of the underlying technologies. The marketing buzzwords that are 'AI,' 'TOPS,' and 'NPU' are certainly being taken hostage by corporate marketing, but that should not detract from the underlying technology aspects that they preface. Some people see some of these terms in a title and immediately just shut down, and I don't even blame them at this point.
 
Some people see some of these terms in a title and immediately just shut down, and I don't even blame them at this point.
I understand that, but as this back & forth was initiated by his response to someone else's (my) thoughts on the subject in a thread with Ai in the title (even as just a model #), it seems he shouldn't be triggered by the discussion of that very topic to the point of ignoring the crux of the discussion. 🤔

Nor does it justify the haughty dismissive tone, not only to me, but towards you too. 🫤

I agree the Ai buzzword bingo card is being widely filled, and it's particularly annoying when new applications/designs/uses are announced with few real-world tests to verify the claims/benefits, but to me that should make someone less sure of their traditional way of thinking rather than double down on their set beliefs. Knowledge of non-knowledge, etc.
 
What low end GPUs...
They haven't really been making them lately...
Except they have.

Radeon RX 6500 XT/RX 6400 are available.
RTX 3050 is out.

Both companies obviously haven't fleshed out the current lineups... Took AMD 2 years to do that with the RX 6000 series. But those low-end GPUs are readily available despite that.

But I would also argue that the Radeon 7600 and GeForce 4060 8GB could be regarded as low end anyway.