News: AMD plans for FSR4 to be fully AI-based — designed to improve quality and maximize power efficiency

edzieba

Distinguished
Jul 13, 2016
565
569
19,760
Ok so it’s AI based, will RDNA3’s AI-focused matrix extensions allow it to run there without discrete matrix accelerators?
Likely not without a performance impact comparable to the performance uplift from the reduced rendering resolution, making it pretty much moot for RDNA3 and below. Hence the focus on handheld gaming devices which contain AMD chips with dedicated matrix accelerator hardware. AMD have yet to announce consumer desktop cards with such hardware, so they will likely not promote FSR4 in that market until they have announced cards that could take advantage of it.
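 
To put numbers on that break-even argument, here is a rough Python sketch; every timing in it is hypothetical, made up purely to illustrate the trade-off, not a measured FSR4 or RDNA3 figure. It assumes render cost scales roughly with pixel count and compares running the upscaling network on shared shader ALUs versus on dedicated matrix hardware.

    # Back-of-envelope sketch of the break-even point. All numbers are
    # hypothetical, chosen only to illustrate the argument.
    native_frame_ms = 16.7               # assumed cost of a native 1080p frame
    render_ms = native_frame_ms * 0.25   # 540p has a quarter of the pixels

    # Hypothetical inference costs for the upscaling network:
    inference_shared_alus_ms = 13.0   # network competing with shading on the SIMDs
    inference_matrix_hw_ms = 1.5      # network on dedicated matrix units

    for label, infer_ms in [("shared ALUs", inference_shared_alus_ms),
                            ("matrix hardware", inference_matrix_hw_ms)]:
        total = render_ms + infer_ms
        verdict = "clear win" if total < native_frame_ms else "roughly moot"
        print(f"{label:15s}: {total:4.1f} ms vs {native_frame_ms} ms native -> {verdict}")

If inference on the shared ALUs costs nearly as much time as the lower rendering resolution saves, the upscale buys little, which is the "pretty much moot" case described above.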
 
Likely not without a performance impact comparable to the performance uplift from the reduced rendering resolution, making it pretty much moot for RDNA3 and below. Hence the focus on handheld gaming devices which contain AMD chips with dedicated matrix accelerator hardware. AMD have yet to announce consumer desktop cards with such hardware, so they will likely not promote FSR4 in that market until they have announced cards that could take advantage of it.
That hardware may very well be in RDNA 4, but we will have to wait and see.
 

mikeztm

Distinguished
Feb 15, 2012
10
5
18,515
Ok so it’s AI based, will RDNA3’s AI-focused matrix extensions allow it to run there without discrete matrix accelerators?
As you said, RDNA3 does not have any discrete matrix unit, so it will not be useful there.
RDNA3's WMMA extension is almost pointless since it does not have any dedicated execution unit support.
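 
For readers unfamiliar with the term, WMMA (Wave Matrix Multiply Accumulate) exposes small fixed-size tile multiply-accumulates, the building block that an AI upscaler's convolution and matmul layers decompose into. The toy numpy sketch below only shows the shape of that operation; it is not the actual RDNA3 intrinsic, and the 16x16x16 tile size and FP16-in/FP32-out types are illustrative assumptions.

    import numpy as np

    # Toy illustration of a WMMA-style tile operation: D = A @ B + C on a
    # small fixed-size tile, FP16 inputs accumulated into FP32. Plain numpy,
    # not the real intrinsic; it only shows the unit of work being discussed.
    M = N = K = 16
    A = np.random.rand(M, K).astype(np.float16)   # activation tile
    B = np.random.rand(K, N).astype(np.float16)   # weight tile
    C = np.zeros((M, N), dtype=np.float32)        # accumulator tile

    D = A.astype(np.float32) @ B.astype(np.float32) + C
    print(D.shape, D.dtype)   # (16, 16) float32

The point of contention is where that tile op executes: on RDNA3 it is issued to the same SIMD ALUs that run ordinary shader math, whereas dedicated matrix units would give it separate throughput.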
 
While Microsoft has made it simple to implement all of the upscalers, I still hope AMD has an open implementation similar to Intel's. I absolutely think it's necessary for AMD to move beyond what they have been doing, as there's a definite ceiling on how high the quality can get.
 

DS426

Upstanding
May 15, 2024
196
169
260
What is with everyone's obsession with thinking NPUs are needed for AI work? NPUs just make AI workloads more efficient on low-power, battery-life-sensitive devices like laptops.

Additionally, FSR4 has the potential for more real-world impact on a handheld, even as the desktop gaming world demands an FSR that's more competitive with DLSS. As stated in the article, let it prefer AMD hardware first, and if it's a different vendor, fall back to DP4a and FP16; I also agree that remaining hardware-vendor agnostic and mostly open source is important now and going forward.
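 
To make the fallback concrete: DP4a is a packed instruction that multiplies four int8 pairs and accumulates the sum into an int32, and a quantized upscaling network can run on it on most modern GPUs, which is how a vendor-agnostic path (XeSS's DP4a mode is the existing example) stays portable. The snippet below just emulates that arithmetic in numpy; it is an illustration, not any vendor's actual code path.

    import numpy as np

    def dp4a(a4, b4, acc):
        """Emulate a DP4a step: acc += dot(a4, b4) over four signed int8 lanes."""
        return np.int32(acc + np.dot(a4.astype(np.int32), b4.astype(np.int32)))

    a = np.array([12, -7, 3, 120], dtype=np.int8)   # quantized activations
    w = np.array([-5, 9, 33, -2], dtype=np.int8)    # quantized weights
    print(dp4a(a, w, np.int32(0)))                  # -264, one int32 accumulation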
 

usertests

Distinguished
Mar 8, 2013
855
774
19,760
What is with everyone's obsession with thinking NPUs are needed for AI work? NPUs just make AI workloads more efficient on low-power, battery-life-sensitive devices like laptops.

Additionally, FSR4 has the potential for more real-world impact on a handheld, even as the desktop gaming world demands an FSR that's more competitive with DLSS. As stated in the article, let it prefer AMD hardware first, and if it's a different vendor, fall back to DP4a and FP16; I also agree that remaining hardware-vendor agnostic and mostly open source is important now and going forward.
If their focus is on laptops and handhelds, the NPU is an untapped resource that would let them leave GPU resources to games, especially the XDNA2 NPU, which is relatively powerful. But we don't know how this is going to shake out. The only details we actually know are that FSR4 will use AI, and that they are focusing on mobile, efficiency, and battery life.

DLSS does a lot better than FSR at low input resolutions, so an AI-based FSR4 focusing on mobile makes a lot of sense. Crappy 540p input + AI = 1080p magic. Rendering at 540p lowers power consumption of the APU, leading to longer battery life.
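 
The arithmetic behind the "540p input + AI = 1080p" idea is straightforward; the sketch below is illustrative and assumes shading cost scales roughly with pixel count, which is only approximately true in real games.

    lo = 960 * 540      # 518,400 pixels rendered
    hi = 1920 * 1080    # 2,073,600 pixels presented
    print(hi / lo)      # 4.0 -> the GPU shades a quarter of the pixels

    # If shading dominates the frame, the render portion of APU power drops by
    # a similar factor; the open question is how much the AI upscale pass (on
    # the NPU or on the GPU) claws back.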
 

deksman

Distinguished
Aug 29, 2011
234
19
18,685
What a waste of hardware and dev-time, all just to slap a crummy "AI" label on it, with negligible real-world benefits.

I wouldn't be so sure it's a waste of hw and dev-time.
For one thing, NV has been using AI for DLSS for a while now. AMD is simply following that route.
To be fair, they've done a remarkable job with what's available, but going one step further would likely help clean up the small quality gap that some people are complaining about (personally, I agree this is mostly negligible and not very noticeable when gaming, especially when the implementation is well made).

But this will allow existing AI hw to actually be used.
 

usertests

Distinguished
Mar 8, 2013
855
774
19,760
>Will that be enough, and will FSR4 usher in a truly competitive alternative to DLSS and XeSS that finally leverages AI?

I don't see framegen's relative performance between GPU vendors as an issue. It's an iterative process; they will all get better. The more important question is if or when frame-gen will be functional enough to be entirely subsumed into GPUs proper. My gauge is it'll be "when," not "if."

Currently, framegen's main drawback is lag, which is a showstopper for fast-twitch gaming. With AI, one can conceivably mitigate lag, as instead of waiting for the "next" frame before interpolating the extra frame, AI can obviate the wait with some sort of predictive "look-ahead."

I have no knowledge or insight into framegen tech, but that would be the logical goal. Just as upscaling is now considered (by present company) to be an acceptable means to get higher fidelity for normal gaming, albeit not for benchmarking purposes, framegen will not be too far behind. The yardstick will be lag (or the lack of it).
You keep on saying framegen, but FSR4 will probably be for upscaling, separate from frame gen.

FSR 3.0 was framegen, otherwise referred to as Fluid Motion Frames (AFMF).
 

baboma

Notable
Nov 3, 2022
282
337
1,070
>You keep on saying framegen, but FSR4 will probably be for upscaling, separate from frame gen.

"So now we're going AI-based frame generation, frame interpolation, and the idea is increased efficiency to maximize battery life."

If there's a distinction, it's semantics. Per Huynh's above statement, AMD will use AI for framegen--as DLSS/XeSS already do--regardless of whether framegen is inside FSR4 or outside. I'm commenting on framegen as a general tech, not just AMD's implementation. Whatever label it has is irrelevant.

Jarred's evaluation is that framegen is currently good as a "smoothing" feature. I doubt that's the end goal. Carried to its logical conclusion, framegen will be an integral part of the GPU's performance. That's what the iterations are for: not just to improve efficiency, but to improve functionality. At a certain lag threshold, framegen will be usable for all games, including fast-twitch. Upscaling is already there.
 
>You keep on saying framegen, but FSR4 will probably be for upscaling, separate from frame gen.

"So now we're going AI-based frame generation, frame interpolation, and the idea is increased efficiency to maximize battery life."

If there's a distinction, it's semantics. Per Huynh's above statement, AMD will use AI for framegen--as DLSS/XeSS already do--regardless of whether framegen is inside FSR4 or outside. I'm commenting on framegen as a general tech, not just AMD's implementation. Whatever label it has is irrelevant.

Jarred's evaluation is that framegen is currently good as a "smoothing" feature. I doubt that's the end goal. Carried to its logical conclusion, framegen will be an integral part of the GPU's performance. That's what the iterations are for: not just to improve efficiency, but to improve functionality. At a certain lag threshold, framegen will be usable for all games, including fast-twitch. Upscaling is already there.
The thing is, framegen and frame interpolation are, to me, the same thing. You're interpolating a frame between two rendered frames. Intel has hinted at investigations into frame prediction or whatever you want to call it — generating the "next frame" based on the current frame plus AI. That would conceivably allow for increased smoothness without the additional lag.

Ultimately, as long as frame generation/prediction stuff isn't using any additional user input, it will still just end up being frame smoothing in my book. I'd like to see something where you take the last rendered frame, sample user input, and predictively generate a next frame from that. Or at the very least something like asynchronous time warp coupled to AI to get user input right before the generation of a new frame. Basically, we need something that responds to user input somehow rather than just interpolating between two rendered frames for this to feel better and not just look better.
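 
A hedged way to see why interpolation adds lag while prediction would not: interpolation cannot present frame N until frame N+1 has rendered, so presentation is held back by roughly one render interval, whereas an extrapolated frame is built from frame N, motion data, and the latest sampled input and waits on nothing. The timings below are hypothetical, assuming a steady 30 fps render rate.

    render_interval_ms = 1000 / 30   # ~33.3 ms between rendered frames

    # Interpolation needs both frame N and frame N+1 before the in-between
    # frame (and frame N itself) can be shown.
    interpolation_added_latency_ms = render_interval_ms

    # Prediction/extrapolation (the asynchronous-timewarp-style idea above)
    # builds the generated frame from frame N plus fresh input, so it adds
    # no wait, but image quality now depends entirely on the prediction.
    extrapolation_added_latency_ms = 0.0

    print(f"interpolation adds ~{interpolation_added_latency_ms:.1f} ms")
    print(f"extrapolation adds ~{extrapolation_added_latency_ms:.1f} ms")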
 
Intel has hinted at investigations into frame prediction or whatever you want to call it — generating the "next frame" based on the current frame plus AI. That would conceivably allow for increased smoothness without the additional lag.
It sounds like there are a lot of hurdles to getting this sort of technology going. I'm not sure how viable it will end up being, but it's good the investigation is happening.

I do think frame generation is a very valuable technology that has been marketed all wrong. With 4K/240 displays appearing, along with high-refresh 1440p and ultrawide 1440p, it's starting to become necessary. Even with 4090 levels of performance, those frame rates aren't realistic in most games outside of e-sports titles. Instead of marketing pushing the very real advantages, we keep getting nonsense about high fps on underpowered cards.
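 
For context on the frame-time budgets involved (simple arithmetic; the base render rate below is a hypothetical number, not a benchmark):

    for hz in (60, 120, 144, 240):
        print(f"{hz:3d} Hz -> {1000 / hz:5.2f} ms per frame")

    base_fps = 110                 # hypothetical native 4K render rate
    print(f"{base_fps} fps rendered -> ~{base_fps * 2} fps presented with 2x framegen")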
 

edzieba

Distinguished
Jul 13, 2016
565
569
19,760
Intel has hinted at investigations into frame prediction or whatever you want to call it — generating the "next frame" based on the current frame plus AI. That would conceivably allow for increased smoothness without the additional lag.
That's been in active use for many years in the VR space: a combination of depth-buffer-based occlusion prediction and optical-flow-based motion prediction. Oculus pioneered this as 'Asynchronous Spacewarp', but others have replicated it in their own implementations (e.g. Valve).
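 
A toy sketch of the optical-flow half of that idea: warp the last rendered frame along per-pixel motion vectors to guess the next one. Real spacewarp-style implementations also use the depth buffer, fill disocclusion holes, and fold in the freshest pose/input; the numpy toy below skips all of that, and the backward-warp formulation and 8x8 example are purely illustrative.

    import numpy as np

    def predict_next(frame, flow):
        """Backward-warp: each output pixel samples the previous frame at the
        spot it is predicted to have moved from.
        frame: (H, W) grayscale; flow: (H, W, 2) per-pixel (dy, dx) motion."""
        h, w = frame.shape
        ys, xs = np.mgrid[0:h, 0:w]
        src_y = np.clip(ys - np.round(flow[..., 0]).astype(int), 0, h - 1)
        src_x = np.clip(xs - np.round(flow[..., 1]).astype(int), 0, w - 1)
        return frame[src_y, src_x]

    frame = np.zeros((8, 8)); frame[3, 2] = 1.0    # one bright pixel
    flow = np.zeros((8, 8, 2)); flow[..., 1] = 2   # whole scene pans 2 px right
    print(np.argwhere(predict_next(frame, flow) == 1.0))   # -> [[3 4]]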
 

Pierce2623

Prominent
Dec 3, 2023
403
291
560
As you said, RDNA3 does not have any discrete matrix unit, so it will not be useful there.
RDNA3's WMMA extension is almost pointless since it does not have any dedicated execution unit support.
Well, the WMMA extensions seemingly work well enough in standard generative AI workloads. What do you mean they have no execution unit support? They allow the standard shaders to run matrix math quickly enough that they work well in LLMs. It's not exactly a fair comparison, but I can now get my 7800 XT to outperform one of the 3060s in my little "AI experimentation box" with two 3060s. It's only in a couple of cases using models optimized to perform on AMD, but I think it shows the matrix throughput is there or thereabouts.
 

Pierce2623

Prominent
Dec 3, 2023
403
291
560
What is with everyone's obsession with thinking NPUs are needed for AI work? NPUs just make AI workloads more efficient on low-power, battery-life-sensitive devices like laptops.

Additionally, FSR4 has the potential for more real-world impact on a handheld, even as the desktop gaming world demands an FSR that's more competitive with DLSS. As stated in the article, let it prefer AMD hardware first, and if it's a different vendor, fall back to DP4a and FP16; I also agree that remaining hardware-vendor agnostic and mostly open source is important now and going forward.
Nobody thinks AI REQUIRES matrix acceleration. It just makes it vastly more performant. My question was specifically whether AMD would vendor-lock the “AI-based” upscaling to RDNA4, when RDNA3’s matrix extensions mean its matrix throughput is actually pretty decent.
 

tstager

Distinguished
Aug 27, 2013
10
10
18,515
Ok so it’s AI based, will RDNA3’s AI-focused matrix extensions allow it to run there without discrete matrix accelerators?
I think it will, because RDNA3's AI hardware hasn't been used for anything yet. I think they were planning for FSR4 when they added these accelerators to RDNA3.
 

tstager

Distinguished
Aug 27, 2013
10
10
18,515
Well, the WMMA extensions seemingly work well enough in standard generative AI workloads. What do you mean they have no execution unit support? They allow the standard shaders to run matrix math quickly enough that they work well in LLMs. It's not exactly a fair comparison, but I can now get my 7800 XT to outperform one of the 3060s in my little "AI experimentation box" with two 3060s. It's only in a couple of cases using models optimized to perform on AMD, but I think it shows the matrix throughput is there or thereabouts.
The 7900 XTX has 192 AI accelerators: two per CU across its 96 CUs (96 × 2 = 192). 248 TOPS!
 
