News DeepSeek might not be as disruptive as claimed, firm reportedly has 50,000 Nvidia GPUs and spent $1.6 billion on buildouts

First rule of tech when dealing with Chinese companies: they are part of the state, and the state has a vested interest in making the USA and Europe look bad. Triple-check their numbers. Do the same for Elon.
 
I'm not shocked but didn't have enough confidence to buy more NVIDIA stock when I should have. Now Monday morning will be a race to sell airline stocks and buy some big green before everyone else does.
 
This is just cope aiming to protect the inflated value of "AI" companies. It doesn't really matter how many GPUs they or their parent company have. The real disruptive part is releasing the source and weights for their models.
 
I'm not shocked but didn't have enough confidence to buy more NVIDIA stock when I should have. Now Monday morning will be a race to sell airline stocks and buy some big green before everyone else does.
I think any big move now is just impossible to get right. I am in a holding pattern for new investments, and will just put them into something interest-bearing for probably a few months, and let the rest ride. No way to guess right on this roller coaster.

I do think the reactions really show that people are worried it is a bubble, whether it turns out to be one or not.
 
$1.6 billion is still significantly cheaper than the entirety of OpenAI's budget to produce 4o and o1.

The exact dollar amount doesn't really matter; it's still significantly cheaper, so the overall spend for the $500 billion Stargate or the $65 billion Meta mega farm cluster is wayyy overblown.

Plus, the key part is that it's open sourced, and that future fancy models will simply be cloned/distilled by DeepSeek and made public. So the "commoditization" of LLMs below the very top-end models really degrades the justification for the super mega farm builds.
 
Ehh, this is kinda mixing up two different sets of numbers. Those GPUs don't explode once the model is built; they still exist and can be used to build another model. The $6 million number was how much compute / power it took to build just that program. Building another one would be another $6 million and so forth. The capital hardware has already been purchased, so you are now just paying for the compute / power. Most models at places like Google / Amazon / OpenAI cost tens of millions worth of compute to build, and that isn't counting the billions in hardware costs.
 
The fact that the hardware requirements to actually run the model are so much lower than current Western models was always the aspect that was most impressive from my perspective, and likely the most important one for China as well, given the restrictions on acquiring GPUs they have to work with.

Being that much more efficient opens up the option for them to license their model directly to companies to use on their own hardware, rather than selling usage time on their own servers, which has the potential to be quite attractive, particularly for those keen on keeping their data and the specifics of their AI model usage as private as possible. And once they invest in running their own hardware, they are likely to be reluctant to waste that investment by going back to a third-party access seller.

I guess it mostly depends on whether they can demonstrate that they can continue to churn out more advanced models in pace with Western companies, especially with the difficulties in acquiring newer generation hardware to build them with; their current model is certainly impressive, but it feels more like it was intended as a way to plant their flag and make themselves known, a demonstration of what can be expected of them in the future, rather than a core product.

So, I guess we'll see whether they can repeat the success they've demonstrated - that would be the point where Western AI developers should start soiling their trousers.

Either way, ever-growing GPU power will continue to be necessary to actually build/train models, so Nvidia should keep rolling without too much issue (and maybe finally start seeing a proper jump in valuation again), and hopefully the market will once again recognize AMD's importance as well. Ideally, AMD's AI systems will finally be able to offer Nvidia some proper competition, since they have really let themselves go in the absence of a proper competitor - but with the advent of lighter-weight, more efficient models, and the status quo of many corporations just automatically going Intel for their servers finally slowly breaking down, AMD really needs to see a more fitting valuation.
 
Ehh, this is kinda mixing up two different sets of numbers. Those GPUs don't explode once the model is built; they still exist and can be used to build another model. The $6 million number was how much compute / power it took to build just that program. Building another one would be another $6 million and so forth. The capital hardware has already been purchased, so you are now just paying for the compute / power. Most models at places like Google / Amazon / OpenAI cost tens of millions worth of compute to build, and that isn't counting the billions in hardware costs.
Well said.

The $6 million is the "variable" cost, whereas the $1.6 billion is the "fixed cost."

One thing to note: it took 50,000 Hoppers (older H20s, H800s) to make DeepSeek, whereas xAI needs 100,000 H100s to make Grok, and Meta needs 100,000 H100s to make Llama 3. So even if you compare fixed costs, DeepSeek needs 50% of the fixed costs (and less efficient GPUs) for 10-20% better performance in their models, which is a hugely impressive feat.

So even if you account for the higher fixed cost, DeepSeek is still cheaper in overall direct costs (variable AND fixed).
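
To put rough numbers on the fixed vs. variable split, here is a minimal Python sketch. It only uses the two figures quoted in this thread ($1.6 billion hardware, $6 million per training run), which are reported claims rather than verified facts. The function name `cost_per_model` and the run counts (1, 10, 100) are made up for illustration, and it ignores things like depreciation, staff, and inference costs.

```python
# Figures as quoted in this thread; reported claims, not verified facts.
FIXED_COST_USD = 1.6e9     # hardware buildout (paid once)
VARIABLE_COST_USD = 6e6    # compute / power per training run


def cost_per_model(num_models: int) -> float:
    """Average cost per model when the same hardware trains num_models models."""
    if num_models < 1:
        raise ValueError("num_models must be at least 1")
    # Fixed cost is spread over all models; variable cost is paid every run.
    return FIXED_COST_USD / num_models + VARIABLE_COST_USD


if __name__ == "__main__":
    for n in (1, 10, 100):  # hypothetical run counts, not from the article
        print(f"{n:>3} models: ${cost_per_model(n) / 1e6:,.0f}M each")
```

The more models get trained on the same hardware, the closer the per-model cost gets to the $6 million variable figure, which is why the two numbers shouldn't be compared directly.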

One thing that people don't understand is, no matter what model OpenAI publishes, DeepSeek will distill the output and make it free/publicly available (v3 is distilled 4o, r1 is distilled o1, and they are going to clone o3 etc...). So 90% of the LLM market will be "commoditized", with the remainder occupied by very top-end models, which inevitably will be distilled as well. OpenAI's only "hail mary" to justify enormous spend is trying to reach "AGI", but can it be an enduring moat if DeepSeek can also reach AGI, and make it open source?
 
Look, I'm no genius nor do I understand all the implications.. but when I saw these facts - 1) claims of a hilariously paltry budget + 2) AI performance conveniently similar to that of ChatGPT's o1 + 3) from a rando Chinese financial company turned AI company - the LAST thing I thought was woowww major breakthrough. Are there innovations? Yes. More like innovations on how to copy & build off others' work, potentially illegally. Oh, and this just so happens to be what the Chinese are historically good at.

I saw the reactions of ppl losing their sht and thought.. damn, ppl are really not as smart/informed as I assume them to be. Then you notice the CCP bots in droves all over.. so obvious. Also a red flag.

I'm Chinese, raised in North America. My mom LOVES China (and the CCP lol) but damn guys, you gotta see things clearly through non-Western eyes. Get it through your heads - how do you know when China's lying - when they're saying gddamnn anything. It's just the facts and how they operate.