I totally agree computer hardware is insanely cheap. My only question is: if I work with machine learning, why would I buy that card instead of just waiting until summer and spending 10k USD on the new Nvidia DGX Station, which comes with over 280 GB of GPU memory and 400+ GB of system memory, rather than paying 11k USD for a single GPU?
I agree that that is a tough one.
I've never really played with the truly big online models, only done extensive tests with what I could run in the company lab (V100) and the home lab (up to RTX 4090).
But while vendors keep claiming that quality is not only improving but reaching GAI levels, my experiments have been very disappointing throughout.
I've gone and tried everything that I could somehow fit into my hardware, nearly every model (family) available from HuggingFace and then gone through the various sizes and weight precisions to see how they'd influence the speed and quality.
And in some cases that meant waiting a very long time for answers, even from 70B models, which clearly won't fit into an RTX 4090 even at the smallest quantizations, so a lot of layers wind up running on my 16-core CPU and its 128 GB of DRAM.
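To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the footprint is simply parameter count times bits per weight, and it ignores KV cache, activations and runtime overhead, so real usage is higher. The 24 GB figure is the RTX 4090's VRAM; the function names are just mine for illustration.

```python
VRAM_GB = 24  # RTX 4090


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone would fit; overhead is deliberately ignored."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_footprint_gb(70, bits)
        verdict = "fits" if fits_in_vram(70, bits) else "does not fit"
        print(f"70B at {bits}-bit: about {size:.0f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Even at 4 bits per weight the weights alone come to about 35 GB, so a good share of the layers has to spill over to the CPU and system RAM.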
My main takeaway: the garbage they produce is so bad that they are just not useful. And across 1B to 70B parameters and FP16 down to INT4 precision, that changes remarkably little. Yes, they gain depth and seem to become much more knowledgeable, but even the reasoning models never notice when they fall off their knowledge cliff and lapse into hallucinations that defy very basic human reality.
I've never been interested in GAI. I'd have been perfectly happy for these LLMs to have as good an understanding of the world as any servant would have, but they must be reliable with regard to any information about the domain/household they are working in. I'd have been happy with a peasant with manners who sticks to my orders: context and RAG data need to be interpreted with precision and strict obedience.
Alas, when these models are smart enough to know Marie Antoinette as the wife of Louis XVI, but claim that she died in obscurity ten years after being executed and didn't have a biological mother, you obviously can't trust them even to toggle a light switch, because they might just as well electrocute you. Let alone to take over family logistics as a domestic with control over sharp blades or foodstuff that can be turned into poison.
The IBM PC-AT represented a value return that was basically guaranteed for years. Today buying AI hardware is like crypto mining: hard to tell if you'll even break even.
For the RTX 4090 it was still an easy decision, because it sees dual use in after-hours gaming (far too little, actually). The stuff you mention has no meaningful alternate use that I can see. So even if I could afford that, I wouldn't jump, especially since I no longer have a career riding on it.
If you get around to buying your DGX Station, I'd love to see your results!