News Nvidia's $3,000 mini AI supercomputer draws scorn from Raja Koduri and Tiny Corp — AI server startup suggests users "Just buy a gaming PC"

The article said:
As told by the Nvidia CEO, we might believe that Project Digits is akin to AI alchemy in a box, but all that glisters isn’t gold.
First of all, it's "glistens". I didn't know "glisters" was even a word!
:D

Second, that box is fugly! I hadn't seen such a clear pic of it before. Yuck! I'd have to put it somewhere out of sight.

The article said:
Koduri later elaborated that - in contrast to the big FP4 claims - by his calculations, the FP16 performance of the Project Digits AI supercomputer wasn’t that impressive. Koduri estimated that the FP16 performance of the upcoming GeForce RTX 5070 ...
Even on FP4, the < $600 RTX 5070 still matches it!

These guys are exactly right. The only thing that sets it apart is its memory capacity.
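For anyone who wants to check the math, here's the back-of-the-envelope version of Koduri's estimate as a quick sketch. Treat the inputs as assumptions: Nvidia's "1 petaflop" Digits number and the RTX 5070's ~988 AI TOPS are both FP4 *sparse* marketing figures.

```python
# Sketch of walking an FP4-sparse marketing number down to dense FP16.
# Both headline inputs below are assumed FP4 sparse figures, not confirmed specs.

def dense_fp16_tflops(fp4_sparse_tflops: float) -> float:
    """Divide by 2 to drop the 2:1 structured-sparsity bonus, then by 2
    for each precision doubling (FP4 -> FP8 -> FP16)."""
    return fp4_sparse_tflops / 2 / 2 / 2

print(dense_fp16_tflops(1000))  # Project Digits: ~125 TFLOPS dense FP16
print(dense_fp16_tflops(988))   # RTX 5070:       ~123 TFLOPS dense FP16
```

On those assumptions, the two land within a couple of TFLOPS of each other, which is the whole point.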
 
And could this gentleman, who is certainly very intelligent, explain to us how to get a gaming PC for $3,000 that has 128 GB of video memory?
Funny thing is, Nvidia has a long history of selling its lowest-tier GPUs with normal DDR memory.

In that case, it's not inconceivable they could make something like an RTX 5070 with stacked LPDDR5X, if they wanted to. I'm guessing their Digits box doesn't have more than a 256-bit memory interface, so you'd get at least parity in a card that could be sold for far less and that you could use up to 4 of in a workstation (in contrast to Digits' scaling of only x2).
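To put a rough number on the "parity" part, here's the standard peak-bandwidth arithmetic. The widths and speeds are my assumptions (256-bit LPDDR5X at 8533 MT/s for a Digits-style design, 192-bit GDDR7 at 28 Gbps for the 5070), not confirmed specs.

```python
# Peak bandwidth = (bus width in bytes) x (transfer rate).
# All figures below are assumptions, not confirmed specs.

def peak_gb_per_s(bus_bits: int, mega_transfers: float) -> float:
    return bus_bits / 8 * mega_transfers / 1000

print(peak_gb_per_s(256, 8533))   # ~273 GB/s: 256-bit LPDDR5X (Digits-style)
print(peak_gb_per_s(192, 28000))  # ~672 GB/s: 192-bit GDDR7 (RTX 5070)
```

So a hypothetical LPDDR5X card would match the box on bandwidth while trading away most of GDDR7's speed in exchange for capacity.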
 
A GPU with NVMe attached, perhaps; AMD messed with that, if I recall. But yeah, slapping some DIMM or CAMM modules onto the back of a GPU wouldn't be impossible if they added the memory controllers to the silicon.
 
A GPU with NVMe attached, perhaps; AMD messed with that, if I recall. But yeah, slapping some DIMM or CAMM modules onto the back of a GPU wouldn't be impossible if they added the memory controllers to the silicon.
DIMMs wouldn't be a good option, due to needing 4 of them for adequate bandwidth and capacity. Also, regular DDR5 is rather power-hungry when you slam it hard. Supporting a pair of LPCAMMs would be an option, but that adds bulk and cost for only nominal benefits.

I'd really expect them just to solder down LPDDR5X chips, like they currently do in Digits and on all their other GPUs.
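Running the same peak-bandwidth arithmetic on the three options (my assumed speeds: DDR5-5600 DIMMs, LPCAMM2 at 8533 MT/s, soldered LPDDR5X-8533) shows why the socketed routes buy you so little:

```python
# Peak bandwidth for each memory option; all speeds are assumptions.

def peak_gb_per_s(bus_bits: int, mega_transfers: float) -> float:
    return bus_bits / 8 * mega_transfers / 1000

options = {
    "4x DDR5-5600 DIMMs (64-bit each)": peak_gb_per_s(4 * 64, 5600),
    "2x LPCAMM2-8533 (128-bit each)":   peak_gb_per_s(2 * 128, 8533),
    "soldered 256-bit LPDDR5X-8533":    peak_gb_per_s(256, 8533),
}
for name, bw in options.items():
    print(f"{name}: ~{bw:.0f} GB/s")
# ~179 GB/s for the DIMMs; ~273 GB/s for either LPDDR5X option. The
# sockets add cost and bulk without adding bandwidth over soldering.
```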
 
"begging to be swindled"...

Don't hold back...tell us how you REALLY feel about it? 🤣

The idea that you can have a supercomputer on your desk, hidden behind a tissue box, is hard to ignore. BUT... I'm averse to throwing away money...
 
The idea that you can have a supercomputer on your desk, hidden behind a tissue box, is hard to ignore. BUT... I'm averse to throwing away money...
It's not a supercomputer. It has the same AI horsepower as a < $600 dGPU!

The only thing that sets it apart from a decent gaming PC is the amount of memory its GPU has local access to. Well, that and the ARM cores instead of x86.
 
Well, it definitely *is* a word, and has been ever since Shakespeare's time (notably in "The Merchant of Venice") 😀

So a bit archaic LOL
I just realized that I misremembered the quote; their word was actually closer than mine. It should be "glitters". I was thinking it was a Tolkien quote, but his is different again (still uses "glitters", though).

...and more fitting, too. Looks like it started out black, and then someone went nuts with glue and a bunch of gold glitter!
 
OBVIOUSLY the gold glitter is added so people can confuse it with their jewelry box. After all, we all keep our jewels in there...

No need to kick me, I'll see myself out... 😛
 
I want to say it has always been possible for GPU makers to put expandable/upgradeable memory in some form of socket on the back of a video card, be it a SO-DIMM socket or a CAMM socket or something. Even if current-gen video cards have a much smaller PCB than older gens, the sheer size of the coolers alone provides some space to place sockets with a ribbon cable leading back to the PCB and GPU. Of course, as others said, regular DDR4 and DDR5 RAM isn't a very good option compared to the memory GPUs come with. But now, with more applications demanding more VRAM, people keep having battles between the GPU they can afford vs. the one they want. And companies like Nvidia seem very happy locking high-VRAM options to their higher-tier GPUs.
 
I want to say it has always been possible for GPU makers to put expandable/upgradeable memory in some form of socket on the back of a video card, be it a SO-DIMM socket or a CAMM socket or something. Even if current-gen video cards have a much smaller PCB than older gens, the sheer size of the coolers alone provides some space to place sockets with a ribbon cable leading back to the PCB and GPU. Of course, as others said, regular DDR4 and DDR5 RAM isn't a very good option compared to the memory GPUs come with. But now, with more applications demanding more VRAM, people keep having battles between the GPU they can afford vs. the one they want. And companies like Nvidia seem very happy locking high-VRAM options to their higher-tier GPUs.
You mean like during the era of the S3 Trio or something?

Now it isn't that simple, because of memory controllers, memory channels, firmware/BIOS expectations, etc. Heck, you can't even have just 7 RAM chips out of 8 on your RAM stick, or your computer won't boot; why expect different for a GPU?

It is technically possible to have expandable VRAM; we just don't go down that route. Even swapping the memory chips for higher-capacity ones at the same bus width requires editing the BIOS. And the industry has been going this way since, like, the Riva TNT.
 
TinyCorp isn't making much sense, since their tinybox green retails for $25,000. 8x the FP8 compute for 8x the price is not a swindle, btw. Perhaps they should take a second look at the developing market of personal LLMs. Or not, since they're already priced out.
 
The real magic happens in the marketing department, where they mark up anything and sell 5 to 10% performance increases to consumers for over 2,000 euros. People can't be that stupid, can they? Well, yes, they are. It's the wet dream of every businessman without morals or a conscience.
 
Forget the FLOPS; it's the unified memory that's attracting my attention. The PC architecture is hamstrung by having its memory divided into two parts. OK, so you do get special memory optimized for certain operations on the graphics card, but you have to pay through the nose just to get 32GB of it. That's plenty for gamers (at least right now it is), but nowhere near enough for serious AI.
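Just to illustrate the "divided into two parts" point: on a discrete GPU, every model starts life in system RAM and has to be explicitly copied into the card's VRAM before the GPU can touch it, and anything that doesn't fit has to shuttle back and forth. A minimal PyTorch sketch of that copy step:

```python
# Minimal sketch of the split-memory copy a discrete GPU forces on you.
import torch

model = torch.nn.Linear(4096, 4096)  # weights allocated in system RAM

if torch.cuda.is_available():
    model = model.to("cuda")  # explicit copy over PCIe into the card's VRAM

# On a unified-memory design like Digits, CPU and GPU address one shared
# pool, so there's no second copy of the weights and no PCIe round trip.
```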
 
Size? Most of the 128GB of RAM will be usable as VRAM... How much for a PC with even 80GB of VRAM? FP4 has been demonstrated on Nvidia's dev pages: they shrank Stable Diffusion to FP4 and the result was virtually indistinguishable. That would allow running much larger models, and faster. FP4 might also be used to kick-start training, but that is still a subject of research.
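A quick footprint sketch backs that up. This counts weights only and ignores the KV cache and runtime overhead, which add more on top:

```python
# Weights-only memory footprint per precision; 1B params at 8 bits ~= 1 GB.

def weights_gb(params_billion: float, bits_per_param: int) -> float:
    return params_billion * bits_per_param / 8

for bits in (16, 8, 4):
    print(f"70B  @ FP{bits}: ~{weights_gb(70, bits):.0f} GB")
    print(f"405B @ FP{bits}: ~{weights_gb(405, bits):.1f} GB")
# 70B:  140 / 70 / 35 GB   -> FP16 overflows 128 GB; FP8 and FP4 fit
# 405B: 810 / 405 / 202.5 GB -> even FP4 needs two linked Digits boxes
```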
 
The commentary completely missed the point of the device.

Nvidia advertised its use for LLMs, particularly the high-parameter models that need loads of VRAM. It also uses a relatively small amount of power compared to splitting model layers across multiple smaller GPUs (to make up for insufficient VRAM individually). Raw compute really isn't the concern in this use case.

I also have a "gaming pc" with 128gb of system RAM. For models that don't fit in VRAM the PCIe interface to the GPUs is the limiting factor, not the GPU cores raw compute power.
 
And could this gentleman, who is certainly very intelligent, explain to us how to get a gaming PC for $3,000 that has 128 GB of video memory?
Exactly. I can't see how he missed that. It's probably the most important thing for running LLMs, which is what this thing is made for.
 
Size? Most of the 128GB of RAM will be usable as VRAM... How much for a PC with even 80GB of VRAM? FP4 has been demonstrated on Nvidia's dev pages: they shrank Stable Diffusion to FP4 and the result was virtually indistinguishable. That would allow running much larger models, and faster. FP4 might also be used to kick-start training, but that is still a subject of research.
The only practical deployment of FP8 is Llama 405B quantised to FP8 so that it can fit into the 80GB x 8 = 640GB of VRAM in one H100 DGX. It is not as mature as one might believe from Nvidia's marketing materials. FP4/FP8 still has a long way to go.
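The capacity arithmetic for that deployment, for anyone checking:

```python
# FP8 = 1 byte per parameter, so Llama 405B needs ~405 GB for weights.
weights_gb = 405 * 8 / 8
dgx_gb = 8 * 80  # one 8x H100 DGX = 640 GB of VRAM

print(weights_gb, dgx_gb, dgx_gb - weights_gb)
# 405 GB of weights in 640 GB total leaves ~235 GB for the KV cache and
# activations. It fits; the FP16 version (810 GB) would not.
```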