Cooe
🤦 There's nothing even remotely "pseudo-matrix" about the WMMA AI acceleration units in RDNA 3. The "MMA" literally stands for "Matrix Multiply Accumulate". 😑
Aka, they are fully dedicated matrix multiply-accumulate units (multiplication plus addition in one fused op), just integrated into the existing CUs/floating-point ALUs to save on transistor count & die size instead of as a completely discrete ASIC block à la Nvidia/Intel (although ofc at a performance deficit if said SPs/ALUs are needed for something else compute-wise at the same time).
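For anyone unclear on what these units actually do: the core operation is a tiled multiply-accumulate, D = A × B + C. A minimal NumPy sketch of one such op (the 16×16 tile size and FP16-in/FP32-accumulate dtypes are illustrative, matching RDNA 3's WMMA shape; real hardware does this in a single instruction per wave, not a software loop):

```python
import numpy as np

# One WMMA-style tile operation: D = A @ B + C.
# Inputs are low precision (here FP16); the accumulator is wider (FP32)
# to avoid losing precision as partial products are summed.
TILE = 16  # RDNA 3 WMMA operates on 16x16x16 tiles

A = np.random.rand(TILE, TILE).astype(np.float16)
B = np.random.rand(TILE, TILE).astype(np.float16)
C = np.zeros((TILE, TILE), dtype=np.float32)  # running accumulator

# The "multiply" and "add" are fused into one operation -- this is why
# these units accelerate multiplication and addition, but not division.
D = A.astype(np.float32) @ B.astype(np.float32) + C
```

A full matrix multiply is then just this tile op repeated across the matrices, with C carrying the running sum between steps.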
AMD's "Ray Accelerators" are integrated in a very similar way (although RDNA 4 is beefing them out to similar capability to Nvidia/Intel's fully separate ASIC's).
Each strategy has very different pros/cons, but all three vendors very much have proper dedicated hardware for low-precision matrix math acceleration on their current GPU architectures for AI workloads, with absolutely nothing "pseudo" about it. 🤷
And claiming that Intel's XeSS is often better quality on Intel Arc than AMD's FSR is on Radeon because Intel has fully dedicated tensor units for matrix math vs AMD's WMMA-souped-up CUs is absolute NONSENSE, considering no existing version of FSR actually uses AI acceleration yet... 🤦
(As AMD doesn't want to restrict compatibility to only native matrix-math-capable GPUs [Turing/Alchemist/RDNA 3 onwards], or at least they don't want to yet.)
Also, critically, AMD can get away with this kind of "core CU/ALU extension instead of full-blown separate ASIC blocks" strategy vastly more successfully than their competitors could, thanks to their absolutely BEST IN CLASS Asynchronous Compute capabilities.
Being first to implement Async Compute by many, MANY years has left AMD's ACE units head & shoulders above the competition's, allowing them to recapture a bunch of otherwise unused/wasted compute performance each clock cycle for stuff like hardware ray-tracing or AI/matrix math acceleration, making the performance hit from ALU sharing less significant than it'd otherwise be. (Although ofc it's most definitely still there/a thing!)