News DeepSeek might not be as disruptive as claimed, firm reportedly has 50,000 Nvidia GPUs and spent $1.6 billion on buildouts

How did they train their model? Did they leverage OpenAI models? China never developed anything from scratch. No R&D.

The training code is open source, that is why it's called OpenAI.

You build a model by feeding it massive amounts of data, which uses equally massive amounts of computational power. The Chinese company alleges it was able to spend only ~$6 million on this phase, which is incredibly low, as most Western companies have to spend $40-60 million. Assuming this is true and the method works "well enough", it means we can build AI models for a fraction of the previously anticipated cost. This is very bad news for any company that invested billions into building new datacenters and purchasing hardware: their expected profitability just tanked.

As for the "what data did they use", I would imagine they used the same data OpenAI did.


 
The training code is open source, that is why it's called OpenAI.
No, it's not. It's called OpenAI due to their original mission as a non-profit.

According to Wikipedia, this was their initial position:

The organization stated it would "freely collaborate" with other institutions and researchers by making its patents and research open to the public.

https://en.wikipedia.org/wiki/OpenAI#2015–2018:_Non-profit_beginnings

As for the "what data did they use", I would imagine they used the same data OpenAI did.
OpenAI alleges that DeepSeek trained it (partly?) by feeding queries to OpenAI's model and learning from the sort of output it produced.

I think OpenAI probably would be able to tell if someone were doing that, because I'm sure it would make them one of the highest-volume users. They probably went back through the activity logs and discovered this, after DeepSeek revealed its incredible training claims.
 
Actually, DeepSeek's developers have published some papers on the "distillation" method of training AI. It's basically exactly that: feeding queries into other AI models and copying/learning from the output they produce. This time around they "distilled" the ChatGPT model, so it makes perfect sense that it follows ChatGPT so closely in the benchmarks. And since the difficulty of training from raw data vs. distilling from another model is night and day, it is reasonable that the difference in cost is also night and day.
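
To make the idea concrete, here is a toy sketch of output-based distillation. It is not DeepSeek's method or anything from their papers. The "teacher" is a made-up function standing in for a model you can only query, and the "student" is a simple linear fit; all names here are hypothetical. The only point is that the student never sees the teacher's internals or original data, just its answers to queries.

```python
# Toy illustration of output-based ("black-box") distillation.
# The teacher is a stand-in function, NOT a real model or API (assumption).

def teacher(x: float) -> float:
    """Hypothetical black-box teacher: we can only query it."""
    return 3.0 * x + 2.0


def collect_pairs(queries):
    """Query the teacher and record (input, output) pairs."""
    return [(q, teacher(q)) for q in queries]


def fit_student(pairs):
    """Fit a linear student y = w*x + b to the teacher's outputs (least squares)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


# The student learns only from what the teacher answered.
pairs = collect_pairs([0.0, 1.0, 2.0, 3.0, 4.0])
w, b = fit_student(pairs)


def student(x: float) -> float:
    """Student model that imitates the teacher on new inputs."""
    return w * x + b
```

A real LLM setup would replace the linear fit with fine-tuning a smaller model on (prompt, response) pairs, which is far more involved than this sketch.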
 
Yep. No doubt DeepSeek has some interesting aspects and possibly game-changing approaches. However, this was clearly a Chinese psyop to scare foolish investors who have no technical knowledge regarding AI, ML, LLMs, etc. Much of the media in this country, which loves to hate on America and its businesses, added fuel to the fire and didn't mention potentially farcical elements of the report. I was asked about it by friends and family and I said there might be some really cool breakthroughs, but let's wait for a full analysis because it sounded too good to be true, and you know what that means: it probably is.
 
China never developed anything from scratch. No R&D.
That's a straight-up stupid comment.

Paper, gunpowder, the compass, and printing were created in China ages ago (all of which changed the world).
They developed paper money... heck, most explosive devices were developed by China & copied by the rest of the world.

They were leaders in quantum computing a few years ago.
Discovered a drug for treating malaria.
Invented the modern e-cig and the passenger drone, & QUESS (2016) was the 1st quantum space device.

China straight up is the 2nd country in the world in R&D expenditure (only beaten by the USA, and both of them are massively ahead of the rest).


Yes, in the past China was a copycat, as it effectively rejected the changes of the scientific revolution (because of tradition and beliefs), but it has changed and is spending massive amounts of $ on R&D.
 
First rule of tech when dealing with Chinese companies. They are part of the state and the state has a vested interest in making the USA and Europe look bad. Triple check their numbers. Do the same for Elon.
Most of the paragraphs in the report start with "We believe". The report itself mentions that DeepSeek is a subproject funded by a company that has a budget of $1.6b.
Did DeepSeek cost $6M to create? No, that's the investment needed to train it. Did it cost $1.6b to create? No. Does it achieve parity with OpenAI's best? Looks like it. Does it need as many resources to run once trained? Many tests say that it requires 2 to 5 times less processing grunt.
Did it cost Nvidia et al. a trillion dollars of speculative funds? Yes.
Headlines galore.
 
It's kind of funny watching this "cost-efficient" development model get so much debate right now.

In fact, it's not just DeepSeek: almost all Chinese AI companies have been chasing "cheap inference", and that started years ago.

For example, China's Qwen model (developed by Alibaba) was the best-performing open-source model a year ago (I mentioned it in some old posts here), and Alibaba set its API price to only 1% of OpenAI's API price. I'm Chinese and my boss is an American; we worked with the OpenAI platform (to develop some applications with AI capabilities), and back then, when I told him about Alibaba's API pricing, my boss gave me a "very American" reply:

"It must be government compensation."

He had been living in China for like 10 years and was still having trouble perceiving China's capability, so no wonder we are witnessing such a hot debate around the "is DeepSeek's cost real" topic here.

My personal experience with DeepSeek started around June 2024. Back then they released their DeepSeek Coder V2 model, which provided performance close to GPT-4 Turbo (especially in programming), at a time when OpenAI actively blocked any user access from China (or any "hostile nations" from the US's view, due to US sanctions). It was already handy back then, and I watched DeepSeek release the v2.5, R1-lite, v3, and R1 models in just half a year. So no matter what the real cost number is, they just keep leapfrogging. With that agility, I would argue it has already shown that they are super lightweight.

And you know what? After the release of DeepSeek-R1, Alibaba (China) released Qwen 2.5-Max, which is the best-performing (non-reasoning) LLM, and Kimi (China) also released their K-1.5 multi-modal model that surpasses ChatGPT-o1.

DeepSeek is good, but it's just one of many Chinese AI companies/labs. R1's performance and cost-efficiency are the result of tight competition inside China's domestic AI market. It has to be this good to survive, just like China's EVs or solar panels or wind turbines. It's not "the beginning of the competition", but the "result of the competition".

And I see that most discussions here have not realized this at all.
 
You can never do an apples-to-apples comparison between new and existing AI. All new AI has been based on the R&D done by existing companies. Yeah, they're often cheaper, but that's almost entirely because of the money and effort put in by existing entities. If these companies had to start from scratch like everyone before them, they'd have the same problems.
 
I'm not shocked but didn't have enough confidence to buy more NVIDIA stock when I should have. Now Monday morning will be a race to sell airline stocks and buy some big green before everyone else does.

The stock package I got from Amazon has almost doubled in the 15+ months since I took the job... from $128 to $238 a share.

For some reason I have no fear at all with this stock. Nowhere to go but up... which is pretty much all it's done.

Nvidia... Tesla... never had enough confidence to buy either... but I feel really good about the big A.
 
I just wonder why you chose to add e-cigs to the list of good stuff China made. It's literally just a new way to destroy people's lungs.
 
How many "protect Nvidia's stock price" articles are we going to get? They're EVERYWHERE.

Don't own any Nvidia stock... therefore the 600B or whatever it was dip they took on launch day meant diddly to me.

As long as people continue buying from Amazon that's an investment I'm much more comfortable with. :)
 
Oh wait, the U.S. news media not only got the original details wrong but helped the story spread like wildfire?

Journalism is still f'ed up in this country, even all these years later with "fake news" being a common household term.

I figured Pat G. was correct to some degree, though that didn't mean buy green stocks for me, lol. Sorry, but that stuff is still overvalued, IMO. This one little incorrect story shows as much: the AI bubble is ready for that pin prick.
 
Journalism is still f'ed up in this country

I agree. One of my favorite parts of YouTube is going back in time to see news reporting back in the good ole days when it wasn't twisted.
 
I never said "good stuff". They stated China never made anything, when it has a history of doing so.

Also, you call out e-cigs as being bad yet ignore the gunpowder/explosives, which have killed many more?
Gunpowder and explosives have tons of non-death-inducing uses. E-cigs don't.
 
The question I have based on this article is just how this Chinese company, which hires almost exclusively from mainland China, was able to buy 50,000 NVIDIA datacenter GPUs given there is a complete ban on exporting them to China. And not just the older, discontinued A100s as previously reported, or even consumer ones, but the latest Hopper GPUs?

I couldn't remember what the penalty is for violating these export controls but knew it had to be extreme, else NVIDIA wouldn't have come out so publicly on it back in 2023. So I looked it up: "Violating export control regulations can result in severe penalties, including substantial fines, loss of export privileges, seizure of goods, and even criminal charges against responsible individuals."
 
If there is something, it can and will be smuggled. Ban, shman: there are plenty of middleman companies that might have unintentional or intentional "leakage" of that sort.
 
A company outside of China and with no bad history buys H100s, sells them to China, profits. If you won’t need the H100s for yourself in the future there’s no real risk to doing it once for a company.