News: The refresh that wasn’t - AMD announces ‘Hawk Point’ Ryzen 8040 Series with Zen 4, RDNA3, and XDNA

I still have absolutely no idea how local AI performance is supposed to help me, or anybody, when all the commercially available AI apps require expensive cloud-compute subscriptions.

Very, very few people have the time, motivation, and knowledge to build and train these models locally. So why waste the die space?
 
I still have absolutely no idea how local AI performance is supposed to help me, or anybody, when all the commercially available AI apps require expensive cloud-compute subscriptions.

Very, very few people have the time, motivation, and knowledge to build and train these models locally. So why waste the die space?
Games!
 
I still have absolutely no idea how local AI performance is supposed to help me, or anybody, when all the commercially available AI apps require expensive cloud-compute subscriptions.
As their slides say, the NPU is intended to provide efficient inferencing. It's not a powerhouse on par with either the CPU cores or GPU.

Low-power AI is useful for things like video background removal (e.g., for video conferencing), background noise removal, presence detection, voice recognition, and AI-based quality enhancement of video content (e.g., upscaling low-res content). I'd imagine we'll see new use cases emerge as the compute resources for inferencing become more ubiquitous.

The fact that this is a laptop processor is a key detail here! Again, its value proposition is to make these AI inferencing features usable even on battery power.
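To make that concrete, here's a minimal sketch of how an app might target such an NPU - assuming AMD's Ryzen AI software stack, which feeds ONNX models through ONNX Runtime's Vitis AI execution provider. The model file and tensor name here are hypothetical placeholders:

[code]
# Hypothetical sketch: run a small ONNX model through ONNX Runtime,
# preferring the NPU's execution provider. Providers are tried in order,
# and unsupported operators fall back to the CPU provider.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "noise_suppression.onnx",  # placeholder model file
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
)

# One frame of dummy audio features; a real app would stream these.
frame = np.random.rand(1, 1, 256).astype(np.float32)
outputs = session.run(None, {"input": frame})  # "input" is a placeholder tensor name
print(outputs[0].shape)
[/code]

The app code stays the same regardless of which accelerator actually runs the model - that's what lets these features sip power on battery instead of spinning up the CPU or GPU.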

Very, very few people have the time, motivation, and knowledge to build and train these models locally. So why waste the die space?
Indeed, they did start small. I've seen an annotated Phoenix die shot showing it's only something like 5% of the SoC's area, but I'm having trouble finding it.

To the extent you feel this way, don't overlook the fact that Intel also has a new NPU in Meteor Lake (which they call their "VPU"). Call it specsmanship, if you're cynical, but maybe both companies identified a real market need. Time will tell, based on how essential they become.
 
Hit the nail on the head. People wonder why things like video chat, image sharing, computational photography, and even video editing in some circumstances are so much easier, smoother, and frankly a better experience on a 7-watt phone than on a laptop or desktop. It's because ARM chips have had neural engines at the hardware level for years. It's why Apple Silicon chips can be so freaking fast at video encoding when their raw performance barely matches an RTX 3070 laptop GPU.

But more to @Giroro's original point: development. I program every day with Copilot and GPT-4, but I can run the 13B-parameter Llama 2 locally on my 32 GB MacBook Pro M1, and it's actually really good. The issue is that it takes up all of my RAM, so I can't use the computer for anything else. Network latency for auto-complete sucks. Building an AI-powered, customer-facing app is terrifying when you are 100% held hostage by OpenAI. There is no _consumer_ need for locally running AI yet, but all the AI-powered products of tomorrow need it today.
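For anyone curious, here's a minimal sketch of what that local setup can look like - assuming llama.cpp's Python bindings (llama-cpp-python) and an already-downloaded quantized GGUF file; the file name is just an example:

[code]
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-13b-chat.Q4_K_M.gguf",  # example 4-bit quant, roughly 8 GB on disk
    n_ctx=2048,       # context window
    n_gpu_layers=-1,  # offload all layers to Metal on Apple Silicon, where supported
)

out = llm(
    "Q: Why run an LLM locally instead of through an API?\nA:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
[/code]

The 4-bit quantization is what makes the memory math work at all: 13B parameters at fp16 would already be ~26 GB of weights before you add the KV cache.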
 
BTW, what of Phoenix 2, with its mix of Zen 4 and 4C cores? Any news about that (or similar)?

I had high hopes for the efficiency of Zen 4C running on 4nm. Any data on this would be nice to see.
 
As their slides say, the NPU is intended to provide efficient inferencing. It's not a powerhouse on par with either the CPU cores or GPU.
Based on the "AI Roadmap" AMD provided, XDNA 2.0 in Strix Point at ~50 TOPS could easily outperform the CPU + GPU. And I doubt they can get there with clock speeds alone; it needs a better design or more silicon. People will have to get used to x86 chips including the same kind of accelerator that hundreds of millions of mobile devices already have.
 
I still have absolutely no idea how local AI performance is supposed to help me, or anybody, when all the commercially available AI apps require expensive cloud-compute subscriptions.

Very, very few people have the time, motivation, and knowledge to build and train these models locally. So why waste the die space?

The only AI models I care about are offline ones that can be trained. ChatGPT's inadequacies have become apparent as it gives safe answers that agree with its parent company's beliefs. Other times, it issues lectures in response to quite mundane questions. One-size-fits-all models don't work well when the company fears being sued into oblivion over the AI's responses.
Personalizing the AI will let people use it to its fullest extent as a real assistant. That's what people really want. Trainable models are the best way to get there.
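To illustrate what "trainable" could look like in practice, here's a rough sketch of parameter-efficient fine-tuning (LoRA) using Hugging Face's transformers and peft libraries - one plausible route, not the only one, and the base model name is a placeholder:

[code]
# Hypothetical LoRA fine-tuning sketch: only small adapter matrices are trained,
# which is what keeps "personalizing" a model within reach of consumer hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total weights
[/code]

Training a tiny adapter on your own data, rather than retraining the whole model, is the realistic path to the kind of personalization you're describing.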
 
Based on the "AI Roadmap" AMD provided, XDNA 2.0 in Strix Point at ~50 TOPS could easily outperform the CPU + GPU. And I doubt they can get there with clock speeds alone; it needs a better design or more silicon.
Yes, I was only talking about the current 10 - 16 TOPS models.

As for those claims and whether Strix Point's NPU could change the balance, pay close attention to the numbers in this slide:

[slide image]


This tells us several things. First, the only compute-impacting change in Hawk Point will be their tweaks to the Ryzen AI block. Second, even at 30 TOPS (they did say 3x the 1st gen, presumably referring to Phoenix), it would merely draw even with the CPU + GPU cores as those stand today - still a good improvement, but not totally game-changing.

Finally, if Strix Point indeed increases its NPU to only 30 TOPS, that seems achievable at probably just 2.5x the area of the original. You're not going to run a block 2x the size at the same clocks as Hawk Point's NPU, but they seem to have found some more headroom for better clocks - so I'm not expecting 3x the size and the same clocks as Phoenix's.
 
Second, even at 30 TOPS (they did say 3x the 1st gen, presumably referring to Phoenix), it would merely draw even with the CPU + GPU cores as those stand today - still a good improvement, but not totally game-changing.
MLID leaked in advance that Hawk Point would have 16 TOPS and Strix Point/Halo would have 45-50 TOPS. 16 × 3 = 48 aligns exactly with that range. So if that's correct, XDNA 2.0 will be an impressive ~5x uplift over Phoenix.

It's also consistent with Snapdragon X Elite having a 45 TOPS NPU. This seems to be the level of performance that Microsoft wants for "Windows 12".

Source: https://videocardz.com/newz/amd-ryz...ch-mid-2024-fire-range-and-strix-halo-in-2025
 
I still have absolutely no idea how local AI performance is supposed to help me, or anybody, when all the commercially available AI apps require expensive cloud-compute subscriptions.

Very, very few people have the time, motivation, and knowledge to build and train these models locally. So why waste the die space?
I partly agree with you. This is a bet on the near future. Every player in the AI business desperately wants pervasive AI in every workflow: a Windows calculator that solves equations, competitive NPCs in games, photo editing that uses generative AI, real-time translators, application-level AI assistants.
It may be a good thing, if they don't screw us.
 
... People will have to get used to x86 chips including the same kind of accelerator that hundreds of millions of mobile devices already have.
Nice point. I'd add that as x86 performance becomes less relevant, this could be the ugly scenario for Intel, which until now has promised an x86 solution for everything, confident in the x86 ISA's closed licensing model.
 
The only AI models I care about are offline ones that can be trained. ChatGPT's inadequacies have become apparent as it gives safe answers that agree with its parent company's beliefs. Other times, it issues lectures in response to quite mundane questions.
It is normal to get wrong answers from any inference model. This is software, not a thinking mind. There is no reasoning or intelligence in the answers, only statistics.
One-size-fits-all models don't work well when the company fears being sued into oblivion over the AI's responses.
I doubt that the poor results depend on censorship.

Personalizing the AI will let people use it to its fullest extent as a real assistant. That's what people really want. Trainable models are the best way to get there.
I also think that offline computing is best, but in the case of LLMs and their training, I can't see how a single user could achieve better results.
 
GPUs didn't kill x86 and neither will NPUs.

x86 might not be long for this world anyhow, but it won't be NPUs that kill it.
I'm not saying that NPUs will kill x86, but the increasing number of accelerators makes general-purpose computing less relevant and, in this way, easier to replace. The first technology to be replaced will be the worst one, and that is the x86 ISA. The one with the most to lose in this case is Intel.
 
BTW, what of Phoenix 2, with its mix of Zen 4 and 4C cores? Any news about that (or similar)?

I had high hopes for the efficiency of Zen 4C running on 4nm. Any data on this would be nice to see.

Actually, we already have two SKUs sporting the refreshed Phoenix 2 die in this new Hawk Point lineup.

Both U-series processors, the Ryzen 5 8540U and Ryzen 3 8440U, are hybrid chips based on the smaller "Phoenix 2" die, which mixes Zen 4 and Zen 4c cores.

AMD went for a size-saving tactic by dropping the Ryzen AI unit (NPU) from the Phoenix 2 SKUs. The previous-gen Ryzen 5 7545U and Ryzen 3 7440U sport the same hybrid configuration (Ryzen 7000 "Phoenix" lineup).
 
It is normal to get wrong answers from any inference model. This is software, not a thinking mind.
Define thinking.

There is no reasoning or intelligence in the answers, only statistics.
It's as much statistics as your own brain. At some point, a model of a system transcends a mere statistical model and becomes something different. Perhaps the distinguishing factor is that we expect statistical models to hold for aggregates, but not to apply in highly individualized scenarios.

Another thing you can't do with mere statistics is to generate structured data, the way generative AI synthesizes text, images, and videos.

Given that you clearly haven't ever taken a course or read a book on the underlying numerical & algorithmic methods behind neural networks, I don't know how you feel qualified to make such strong assertions about their fundamental nature.
 
Hit the nail on the head. People wonder why things like video chat, image sharing, computational photography, and even video editing in some circumstances are so much easier, smoother, and frankly a better experience on a 7-watt phone than on a laptop or desktop. It's because ARM chips have had neural engines at the hardware level for years. It's why Apple Silicon chips can be so freaking fast at video encoding when their raw performance barely matches an RTX 3070 laptop GPU.

But more to @Giroro's original point: development. I program every day with Copilot and GPT-4, but I can run the 13B-parameter Llama 2 locally on my 32 GB MacBook Pro M1, and it's actually really good. The issue is that it takes up all of my RAM, so I can't use the computer for anything else. Network latency for auto-complete sucks. Building an AI-powered, customer-facing app is terrifying when you are 100% held hostage by OpenAI. There is no _consumer_ need for locally running AI yet, but all the AI-powered products of tomorrow need it today.
Apple integrates its OS tightly with its chips, so it can take advantage of its NPU. I wonder if Microsoft will take advantage of AMD's NPU?
 
AMD went for a size-saving tactic by dropping the Ryzen AI unit (NPU) from the Phoenix 2 SKUs. The previous-gen Ryzen 5 7545U and Ryzen 3 7440U sport the same hybrid configuration (Ryzen 7000 "Phoenix" lineup).
It looks like Kraken is the spiritual successor to Phoenix 2, a cut-down version of Strix Point with XDNA 2.0 intact. So after Hawk Point 2, everything (mobile) except extreme budget chips like Mendocino should have XDNA. Even that segment will get it eventually. No idea about non-APU desktops.
 
Basically, this seems like just a slight refresh of the Phoenix series, with a focus on more AI performance. Nothing impressive in this lineup, though.

I wonder why all the fuss these days is mainly about AI and the like.
 
This might help.
[img]https://i.imgur.com/MgUZuAN.png[/img]
FWIW, those are content creation apps that should run faster on the iGPU (or, better yet, a dGPU). They aren't playing to the real strengths of this tiny accelerator. I guess they do show some examples of interesting things you can do with AI-based processing.

BTW, the better way to post imgur links is as an image. In this case: [img]https://i.imgur.com/MgUZuAN.png[/img] I fixed it in my quote of your post. The difference is that the [img] tag scales to the full width and enables zooming without having to leave this site.
 
It is normal to get wrong answers from any inference model. This is software, not a thinking mind. There is no reasoning or intelligence in the answers, only statistics.
...
I doubt that the poor results depend on censorship.

I'll give an example. I recently tried to ask ChatGPT about geopolitics using hypothetical scenarios for Ukraine and the Middle East. I had a difficult time engaging because the chatbot repeatedly refused to talk about "violent" activities. I understand OpenAI's concerns, but my inquiry is wholly valid and even critical to understanding global affairs. If AI is going to move forward, it has to address this. A trainable, offline model is the best way to do it, IMHO. I want the flexibility to work in highly specialized contexts.
 