News: Meta to build 2GW data center with over 1.3 million Nvidia AI GPUs — invest $65B in AI in 2025

DeepSeek V3 and R1, trained on a ~$5 million budget with 2,048 H800s and with performance exceeding GPT-4o and o1, have already rendered the massive GPU farm clusters obsolete. Same with the Sky-T1 model out of Berkeley, which trained an o1 peer for $450.
 
Performance according to what metric? I don't see any of those on the Open LLM Leaderboard.
Huggingface only has open-source models on its leaderboard, so no GPT or Claude.

Here is an ELO-style leaderboard with OpenAI, Anthropic, Google, and DeepSeek models included, where users blind-vote on anonymous AI chat responses.

https://lmarena.ai/?leaderboard
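For anyone unfamiliar with how an ELO-style ranking works, here is a rough sketch in Python (my own illustration, not lmarena's actual code; the site reportedly fits a statistical model over all votes rather than updating ratings one vote at a time) of how a single blind head-to-head vote nudges two models' ratings:

```python
# Illustrative Elo update from one blind A-vs-B vote (not lmarena's real pipeline).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one vote."""
    e_a = expected_score(rating_a, rating_b)
    s_a = 1.0 if a_won else 0.0
    return (rating_a + k * (s_a - e_a),
            rating_b + k * ((1.0 - s_a) - (1.0 - e_a)))

# Two models start at 1000; model A's anonymous response wins one vote.
print(update_elo(1000.0, 1000.0, a_won=True))  # -> (1016.0, 984.0)
```

The point is that every vote is between two anonymous responses to the same prompt, so a model only climbs by winning those head-to-head comparisons.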
 
Huggingface only has open-source models on its leaderboard, so no GPT or Claude.
It's not just GPT and Claude missing. I didn't see any of the models you listed!

Here is an ELO-style leaderboard with OpenAI, Anthropic, Google, and DeepSeek models included, where users blind-vote on anonymous AI chat responses.

https://lmarena.ai/?leaderboard
That's just based on votes by the public. They're not ranking it on skills tests, like Huggingface's scores do.

What I'm getting at is that I can believe you can train a specialized model with a lot fewer resources and end up with something competitive. Training one that's as versatile as leading contenders on Huggingface's LLM leaderboard is a lot harder and probably takes disproportionately more resources.
 
I never thought I'd be a fan of Meta/Facebook, but I am a HUGE fan of Llama specifically because it's open-source/open-weight and allows people to add their own fine-tune training on top of it.
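For anyone curious what that looks like in practice, here's a minimal fine-tuning sketch assuming the Hugging Face transformers/peft/datasets libraries; the checkpoint name and the my_posts.jsonl file are placeholders for whatever open-weight Llama and data you actually use:

```python
# Minimal LoRA fine-tuning sketch on top of an open-weight Llama checkpoint.
# Model name and "my_posts.jsonl" are placeholders for your own choices/data.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-3.1-8B"              # any open-weight Llama works here
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token     # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach small trainable LoRA adapters instead of updating all of the base weights.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# Placeholder dataset: one JSON object per line with a "text" field.
data = load_dataset("json", data_files="my_posts.jsonl")["train"]
data = data.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-lora", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
model.save_pretrained("llama-lora")           # saves only the small adapter weights
```

That's the appeal of open weights: the adapter is tiny compared to the base model, and you don't need Meta's training cluster to customize it.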
 
That's just based on votes by the public.
These are ranked by blind votes on which anonymous AI response is better, not by some popularity contest.
They're not ranking it on skills tests, like Huggingface's scores do.
For skills-based tests, those are available in DeepSeek's publication:
[attached: benchmark chart from the DeepSeek V3 paper]


Also, from DeepSeek R1's publication:
[attached: benchmark chart from the DeepSeek R1 paper]


The MMLU Pro and MMLU scores for DeepSeek/GPT/Claude/Google really blow everything out of the water compared to any model on HuggingFace (which peaks at about 70 points for the top model).
What I'm getting at is that I can believe you can train a specialized model with a lot fewer resources and end up with something competitive. Training one that's as versatile as leading contenders on Huggingface's LLM leaderboard is a lot harder and probably takes disproportionately more resources.
Except, as shown by the MMLU Pro scores, DeepSeek/GPT/Claude/Google blow every open-source model on HuggingFace's LLM leaderboard out of the water, and they aren't even listed on that leaderboard. That's likely because submission is voluntary.
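For context on what those "skills tests" actually measure: MMLU-style benchmarks simply check how often the model picks the correct answer to multiple-choice questions. Here is a rough sketch, assuming the transformers library; the checkpoint and the single hard-coded question are placeholders, and real harnesses run thousands of questions per subject:

```python
# Rough illustration of MMLU-style scoring: pick the answer choice the model
# assigns the highest likelihood to. Checkpoint and question are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

question = "Which planet is known as the Red Planet?"
choices = ["Venus", "Mars", "Jupiter", "Saturn"]      # correct answer: "Mars"
prompt = f"Question: {question}\nAnswer: "

def choice_logprob(answer: str) -> float:
    """Sum of the model's log-probabilities for the answer tokens given the prompt."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = torch.log_softmax(model(full_ids).logits[0, :-1], dim=-1)
    # Position i predicts token i+1, so the answer tokens are predicted by
    # positions prompt_len-1 .. full length-2.
    return sum(logprobs[i, full_ids[0, i + 1]].item()
               for i in range(prompt_len - 1, full_ids.shape[1] - 1))

scores = [choice_logprob(c) for c in choices]
print("model picked:", choices[scores.index(max(scores))])
```

Accuracy over a large, diverse question set is what separates a genuinely versatile model from one that just happens to do well on a narrow slice of prompts.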
 
These are ranked by blind votes on which anonymous AI response is better, not by some popularity contest.
But the questions are just a pool submitted by the public and not sorted by skill. So, you can still have one specialized model that's really good at certain prompts winning out over a more general and versatile model.

For skills-based tests, those are available in DeepSeek's publication:
[attached: benchmark chart from the DeepSeek V3 paper]
Why didn't they submit it to Hugging Face's LLM leaderboard, at least in addition to their comparison against GPT and Claude?
 
Why didn't they submit it to Hugging Face's LLM leaderboard, at least in addition to their comparison against GPT and Claude?
Maybe because HuggingFace's LLM leaderboard is not good?

HuggingFace's leaderboards show how truly blind they are, because they are actively hurting the open-source movement by tricking it into creating a bunch of models that are useless for real usage.

That explains why no respectable LLM is listed on there: the benchmarks are useless (except maybe MMLU Pro).
 
Why didn't they submit it to Hugging Face's LLM leaderboard, at least in addition to their comparison against GPT and Claude?

Probably because Hugging Face hasn't completed the tests yet? I just checked the leaderboard, and three DeepSeek-R1 distill models have finished the tests and have scores there:
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
deepseek-ai/DeepSeek-R1-Distill-Llama-70B
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

Anyway, there are other tests done by third parties that aim to be more challenging (since the current tests have become too simple/too familiar for new models), like "Humanity's Last Exam", which actually shows DeepSeek can solve hard questions better than o1 (or any other model).

I personally also compared those two models: I uploaded the code framework and asked them to give opinions and analysis. DeepSeek-R1 does provide WAY BETTER insights than o1.