News: Nvidia CEO admits next-gen DGX systems necessitate liquid cooling - and the new systems are coming soon

I'm not sure if this article was written to somehow shame Nvidia, but as work density increases, stronger cooling methods than air are required, and this applies everywhere, not just to computers (hence my use of the term "work"). As density increases, the number of systems required to perform the same amount of work decreases, thereby increasing overall efficiency.

Eventually the AI bubble will collapse, the deep pockets will run dry, and they (all GPU manufacturers) will need to go back to factoring in efficiency and price, but right now there's no reason for them to care about anything other than performance and density.
 
The LLM version of the AI bubble may collapse, but I'm sure it will be replaced with another version of AI.
 

It'll collapse because a more efficient method will come along, or because the deepest pockets will run out, or because the saturation point will have been reached. A bubble is built on exponential growth, and once that growth turns linear, then logarithmic, or even negative, the bubble collapses.
 
I am starting to see a future where most of a nation's energy consumption comes from LLMs and data centers.

Eventually the AI bubble will collapse
Collapse? No.
It will never collapse until the Singularity (where an AI can effectively skip the training needed and learn on the fly by itself).

Will it lessen? Likely, once we hit the point of diminishing advancement year over year.
the deep pockets will run dry, and they (all GPU manufacturers) will need to go back to factoring in efficiency and price
Won't happen.
A restricted market has few vendors, so prices can be kept high: "where else ya gonna go?"
Efficiency is the goal of the Arm versions, and specific performance is the ASIC benefit; in time they may replace the general-purpose GPUs of today, but gaming GPUs will likely never return to the old days... as every "gaming" GPU uses up fab allocation, which is lost profit.

"ai" will never truly not be wanted...even if just by bad actors who will use it to make more advanced scams.
 
Performance will continue to push until we hit the level of AI they are aiming for, and then we will see actual efficiency gains. That is just how computing works. We didn't start getting efficient laptops until we got to a point where laptops felt fast enough that we weren't giving anything up. This tech will get faster and more power hungry for the next 5 to 10 years, then it will back off once all the AI ideas they have come up with are up and running. Right now doing more is worth more than doing it cheaper, so they will push for those headlines and use a ridiculous amount of power. Once they have run out of new hurdles that catch clicks, they will start working on efficiency.
 
Well, duh

One DGX H100 has 8 x NVIDIA H100 GPUs, and the whole module is spec'ed for roughly 10 kW; it's amazing it is even air-coolable.
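To put that in perspective, here's a minimal back-of-the-envelope sketch of how much airflow it takes to carry 10 kW out of a single chassis. The 20 °C air temperature rise and standard air properties are my own assumptions, not figures from the article:

```python
# Rough check: airflow needed to remove ~10 kW from one chassis.
# Assumed values (not from the article): 20 degC inlet-to-outlet air
# temperature rise, standard sea-level air properties.

HEAT_LOAD_W = 10_000        # DGX H100 module power, per the post above
AIR_CP_J_PER_KG_K = 1005    # specific heat of air
AIR_DENSITY_KG_M3 = 1.2     # density of air at ~20 degC
DELTA_T_K = 20              # assumed air temperature rise across the chassis

mass_flow = HEAT_LOAD_W / (AIR_CP_J_PER_KG_K * DELTA_T_K)   # kg/s
volume_flow_m3_s = mass_flow / AIR_DENSITY_KG_M3            # m^3/s
volume_flow_cfm = volume_flow_m3_s * 2118.88                # cubic feet per minute

print(f"mass flow:   {mass_flow:.2f} kg/s")
print(f"volume flow: {volume_flow_m3_s:.2f} m^3/s ({volume_flow_cfm:.0f} CFM)")
```

That works out to roughly 0.5 kg/s of air, on the order of 900 CFM through one chassis, which is why the fans on these boxes are as big and loud as they are.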

Nothing was admitted, and there's no new revelation. It's just a surprise if you're not part of the industry.
 
What kind of liquid cooling is the question, because datacentres have a few different flavours:
- You have fully internal closed-loop cooling, which is what we are used to in desktop PCs. Waterblocks on hot components move heat to the coolant, the coolant circulates to radiators within the server chassis, and cool air blowing through the radiators cools the coolant back down. AIO or discrete pump is a minor distinction here.
- Next is rack-scale water cooling. Here, each server has no airflow, but instead a water inlet and outlet. These plumb into the rack, which hosts shared radiators that receive cool air and cool the coolant for all the servers in the rack. This is a fairly common setup despite being an 'inelegant hack', because it gives you the density gains of a no-airflow chassis while still letting you install these systems in a datacentre with a regular hot/cold aisle airflow setup, alongside regular air-cooled hardware, with no new site plumbing (a rough flow-rate sketch follows at the end of this post).
- Finally, we have full-site liquid cooling. Here, there is little to no AC airflow for the aisles; instead the coolant is routed from the servers to massive shared chillers (usually with liquid/liquid heat exchangers in between, to aid in maintenance and leak volume mitigation). You need to design the datacentre around this architecture, so it's less common.

If DGX is just being designed around internal closed-loop liquid cooling, this is basically a nothingburger. If they have off-chassis liquid transport, that then becomes more interesting.
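As a rough sense of what that off-chassis plumbing has to carry, here's a small sketch of the coolant flow needed to move the same ~10 kW per chassis mentioned earlier. The 10 °C coolant temperature rise is my own assumption, not a figure from any vendor spec:

```python
# Rough check: water flow needed to carry ~10 kW out of one chassis.
# Assumed value (mine, not from the thread): 10 degC coolant temperature rise.

HEAT_LOAD_W = 10_000          # per-chassis heat load, as in the DGX H100 example
WATER_CP_J_PER_KG_K = 4186    # specific heat of water
WATER_DENSITY_KG_PER_L = 1.0  # ~1 kg per litre
DELTA_T_K = 10                # assumed coolant inlet-to-outlet rise

mass_flow = HEAT_LOAD_W / (WATER_CP_J_PER_KG_K * DELTA_T_K)   # kg/s
litres_per_min = mass_flow / WATER_DENSITY_KG_PER_L * 60

print(f"water flow: {mass_flow:.2f} kg/s (~{litres_per_min:.1f} L/min per chassis)")
```

About 14 L/min per chassis is very manageable plumbing, which is part of why the rack-scale approach is so popular: water carries the same heat with a tiny fraction of the volume flow that air needs.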
 