That just increases hardware complexity.
::shrugs:: Oh Well.
I'm sure everybody has said that about every new instruction set added by Intel.
Sometimes you gotta put in the work to have new options for improved long term performance.
BTW, since I noticed that TF32 (1 sign, 8 exponent, 10 mantissa bits) can losslessly represent both fp16 and bf16, I've come to believe the reason they did it was simply that it's what the underlying hardware had to do to support both formats. Then someone got the bright idea to expose it, so that users could simultaneously get the range of bf16 and the precision of fp16. I don't think the format is as arbitrary as we might've presumed.
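If anyone wants to sanity-check that claim, here's a quick sketch (Python/numpy). I'm simulating the TF32 layout by just truncating the float32 mantissa down to 10 bits, which isn't how the tensor cores actually round, but that doesn't matter here since every fp16 and bf16 value already fits exactly:

```python
import numpy as np

def truncate_mantissa(x, keep_bits):
    """Keep only the top `keep_bits` of the 23-bit float32 mantissa."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    mask = np.uint32((0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF)
    return (bits & mask).view(np.float32)

to_tf32 = lambda x: truncate_mantissa(x, 10)  # 1 sign + 8 exp + 10 mantissa
to_bf16 = lambda x: truncate_mantissa(x, 7)   # 1 sign + 8 exp + 7 mantissa

# Every finite fp16 value survives the trip through the TF32 layout...
all_fp16 = np.arange(2**16, dtype=np.uint16).view(np.float16)
fp16_finite = all_fp16[np.isfinite(all_fp16)].astype(np.float32)
assert np.all(to_tf32(fp16_finite) == fp16_finite)

# ...and so does every bf16 value (simulated here by truncating random float32s).
bf16_vals = to_bf16(np.random.default_rng(0).standard_normal(100_000))
assert np.all(to_tf32(bf16_vals) == bf16_vals)

print("fp16 and bf16 both fit losslessly in the TF32 bit layout")
```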
My proposal for more fp data types isn't arbitrary either.
The current fp data types are very "one size fits all," without enough finesse in terms of data-size options for end users to choose from.
It's like clothing sizes: we need more options since people's body types vary widely.
So is CUDA. They seem to like it that way. Vendor lock-in, you know?
I know! That's nVIDIA's way; they're the proprietary-everything vendor.
Many people in the industry HATE them for that.
Especially the "Open Source" community.
Same, but I doubt you'll convince Nvidia of that. Especially if it makes their hardware more expensive, less efficient, and/or slower.
I don't expect to convince Jensen Huang of anything; he's happy perched at the top of his little dGPU mountain with his 80% dGPU market share.
But the rest of the industry can work on solutions and options that are "Open Standards".
That's why I want to see fp24 so badly.
It's the middle-size data type between fp16 & fp32 that's been missing, and it could serve either AI or gaming.
Depends on which variant of fp24 you want to use.
Each one has its use case.
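To put rough numbers on the tradeoff, here's a back-of-the-envelope sketch (Python). The bit splits are my assumption for illustration: I believe e7m16 is roughly what the old ATI R300-era fp24 used, while e8m15 would just be fp32 with the mantissa chopped, TF32-style:

```python
# Two plausible fp24 layouts (bit splits assumed for illustration):
#   e7m16: 1 sign, 7 exponent, 16 mantissa -> more precision, less range
#   e8m15: 1 sign, 8 exponent, 15 mantissa -> fp32-like range, less precision
layouts = {"e7m16": (7, 16), "e8m15": (8, 15)}

for name, (exp_bits, man_bits) in layouts.items():
    bias = 2 ** (exp_bits - 1) - 1
    max_normal = (2 - 2 ** -man_bits) * 2.0 ** bias   # largest finite normal
    epsilon = 2.0 ** -man_bits                        # spacing just above 1.0
    print(f"{name}: max normal ~{max_normal:.2e}, epsilon {epsilon:.1e}")
```

Roughly, e7m16 trades dynamic range for precision, while e8m15 keeps fp32's dynamic range the way bf16 does, so which one makes sense really does depend on the workload.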