Intel's latest flagship 128-core Xeon CPU costs $17,800 — Granite Rapids sets a new high watermark

I can never figure out from the press coverage just what is what. Exactly what applications work best on these high-core-count servers? Intel *could* always have matched core counts, but there was no good reason to: too many cores just cause contention, blocking, cache overload, and I/O queueing, not to mention core licensing issues.

But AMD went there more for marketing hype than any real benefit, and now Intel has been dragged into it too. Or at least that's how it looks to me.

Now, in SQL Server you might benefit from a bunch of cores (if you can afford the license, or the equivalent Azure tier), but the way it works is that lots of small queries need only one core each, while some big queries can run a lot faster if they're free to grab 4 or 8 or 32 cores for a few seconds or minutes. So the optimal situation is to have a bunch of cores that sit idle 50-80% of the time and are only used for the occasional big (and mostly sloppy) queries. But Microsoft's licenses used to require paying for them linearly, as if they were going to be used 100% of the time. So Microsoft suppressed demand for high core counts on servers from about Y2K until I'm not sure when - are they still doing that, or have they reintroduced a per-processor license that maxes out at 10 or 16 cores or something?
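To put rough numbers on that licensing mismatch, here's a back-of-the-envelope sketch in C; the per-core price and the utilization figure are purely illustrative assumptions, not actual Microsoft list prices:

```c
#include <stdio.h>

int main(void) {
    /* Hypothetical numbers for illustration, not actual Microsoft pricing. */
    double price_per_core = 7000.0;  /* assumed per-core license cost */
    int    cores          = 32;
    double utilization    = 0.25;    /* cores busy only ~25% of the time */

    double nominal = price_per_core * cores;
    /* Licensed linearly, but only a quarter of the capacity is ever busy,
       so the effective price per utilized core is 4x the nominal one. */
    double effective = price_per_core / utilization;

    printf("License bill for %d cores: $%.0f\n", cores, nominal);
    printf("Effective cost per utilized core: $%.0f\n", effective);
    return 0;
}
```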

SMH
 
Exactly what applications work best on these high core count servers?
I think they're mostly used in the form of smaller virtual machines.

But AMD went there more for marketing hype than any real benefit, and now Intel has been dragged into it too. Or at least that's how it looks to me.
This is absurd. AMD wouldn't get very far ahead of customer demand. Customers like more cores per rack unit, since it's more space-efficient and also more energy-efficient.

in SQL Server you might benefit from a bunch of cores (if you can afford the license, or the equivalent Azure tier), but the way it works is that lots of small queries need only one core each, while some big queries can run a lot faster if they're free to grab 4 or 8 or 32 cores for a few seconds or minutes. So the optimal situation is to have a bunch of cores that sit idle 50-80% of the time and are only used for the occasional big (and mostly sloppy) queries. But Microsoft's licenses used to require paying for them linearly, as if they were going to be used 100% of the time.
I think you've just answered your own question. People will provision smaller VMs. If they're idle most of the time, then they can be oversubscribed, which enables the datacenter operator to reap even more benefit and/or customers to save more money vs. running instances on bare hardware.
 
But Microsoft's licenses used to require paying for them linearly, as if they were going to be used 100% of the time. So Microsoft suppressed demand for high core counts on servers from about Y2K until I'm not sure when...
We do need new regulations on CPU licensing of software. It's asinine to license on a core count basis in the era of ever-increasing core counts.

There should be new laws/rules/regulations with a federal ban on software licensing based on core count or instance limits within a single physical CPU.

Licensing should be on a per-physical-CPU basis only, with no limits on the number of instances per individual physical CPU.
That would simplify things dramatically.

If you want to run the software on more machines/sockets, then you pay for those as needed.
 
At nearly $18,000, Intel's Xeon 6980P 'Granite Rapids' could be the industry's most expensive CPU in modern history.

Intel's latest flagship 128-core Xeon CPU costs $17,800 — Granite Rapids sets a new high watermark
I remember a few years back, when the first 5nm TSMC Apple Bionic processors with 12B transistors appeared, their manufacturing price was shocking: just around $1.50 per core, with a selling price of something like twice that. The whole 6-core processor sold at the unthinkable price of around $20.

At exactly the same time, AMD's 8-core server chiplets, also 5nm and also TSMC, with about half the transistors (around 6B), were selling at $100 ***per core***. Intel's chiplets are much larger, clearly more than 40 cores each, and per core they should cost more than the minuscule 8-core AMD ones due to lower yield. But should both AMD and Intel, for almost the same silicon with minor differences, show a factor-of-50-100 price difference versus Apple?

Of course, Apple's cores are different sizes, so it would be better to compare prices per unit of transistors, say per billion, but the conclusion about a huge, fishy discrepancy would be approximately the same. All of them sell like hot cakes in huge quantities, and there is no argument that any of them serves a substantially smaller market.
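Taking the figures exactly as quoted above at face value (a later reply disputes them), the per-billion-transistor arithmetic works out roughly like this; a quick sanity-check sketch:

```c
#include <stdio.h>

int main(void) {
    /* Figures exactly as quoted in the post above; treat them as
       unverified claims, not established data. */
    double apple_price = 20.0;        /* ~$20 for the whole SoC */
    double apple_btx   = 12.0;        /* 12B transistors */
    double amd_price   = 8 * 100.0;   /* 8 cores at $100 per core */
    double amd_btx     = 6.0;         /* ~6B transistors per chiplet */

    double apple_rate = apple_price / apple_btx;  /* ~$1.7 per 1B */
    double amd_rate   = amd_price / amd_btx;      /* ~$133 per 1B */

    printf("Apple: $%.1f per billion transistors\n", apple_rate);
    printf("AMD:   $%.1f per billion transistors\n", amd_rate);
    printf("Ratio: about %.0fx\n", amd_rate / apple_rate);  /* ~80x */
    return 0;
}
```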
 
Of course multicore processors are needed, and so far they scale up and perform great. And they could do great even with AI; just add native hardware for the missing 8- and 16-bit arithmetic.

My understanding is that Intel, around one or two decades ago, stopped supporting single-precision FP32 natively in favor of 64-bit. Single-precision calculations started to be done in double-precision 64-bit mode, with the final result then truncated to 32 bits. As a result, single- and double-precision tests started to give the same timing results.

Now if they brought all that back, and maybe even added the 4-bit arithmetic some AI needs, AI would also start singing and dancing on classical processors. Every time precision drops, from FP64 to FP32 to FP16 to FP8, the throughput doubles; do the math.
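The doubling is easy to see from vector register widths: a fixed-width register holds twice as many lanes every time the element size halves. A minimal sketch (the 4-bit row is hypothetical, per the suggestion above):

```c
#include <stdio.h>

int main(void) {
    /* A 512-bit vector register (e.g., AVX-512) holds twice as many
       elements every time the element width halves, so peak ops per
       instruction double, assuming native hardware support. */
    const int register_bits = 512;
    const int widths[]      = { 64, 32, 16, 8, 4 };
    const char *names[]     = { "FP64", "FP32", "FP16", "FP8", "4-bit" };

    for (int i = 0; i < 5; i++)
        printf("%-5s: %3d lanes per 512-bit register\n",
               names[i], register_bits / widths[i]);
    return 0;
}
```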
 
Exactly what applications work best on these high core count servers?
There are a few applications in the enterprise space I can think of, but the biggest one is probably databases. Especially those hosted as shared services (think Amazon RDS) benefit greatly from higher core counts. Utilities also typically host massive operational databases for their real-time measurement processes. Measurement points, and the devices producing those points, have been growing rapidly since the ESG revolution started (more renewables online with backup sources, people adding wind and solar to their homes, some even using electric vehicles to feed energy back into the grid at night), and more cores give them headroom to scale without adding new hardware.
 
There are a few applications in the enterprise space I can think of, but the biggest one is probably databases.
But there are a dozen other parameters that all need to be in balance; you can't just drop 128 engines into your Mazda Miata, or 128 wheels, or 128 seats, and think it's all good.

To a first approximation, a very simple database that assigns one core per query can be crudely scaled up by putting 128 cores on a chip. But there are going to be limits: putting 65,000 of them on the chip will not make it super-duper fast or give it immense capacity. Cache, DRAM, and I/O will all bottleneck, along with plain multitasking contention and management overhead.
 
I don't know why this is even an article. Nobody is buying at the listed price.

Our new cluster uses the 8490H (Sapphire Rapids top SKU). Each blade's total cost (2 CPUs, RAM, InfiniBand) is about the list price of a single 8490H on Intel's website.
 
But there are a dozen other parameters that all need to be in balance; you can't just drop 128 engines into your Mazda Miata, or 128 wheels, or 128 seats, and think it's all good.

To a first approximation, a very simple database that assigns one core per query can be crudely scaled up by putting 128 cores on a chip. But there are going to be limits: putting 65,000 of them on the chip will not make it super-duper fast or give it immense capacity. Cache, DRAM, and I/O will all bottleneck, along with plain multitasking contention and management overhead.
Of course. In the history of computers, the limitation has always been whether the application(s) can keep the processor fed, given the other limits of the system. However, for large operational databases, keeping a processor fed is almost never an issue given the large volume of requests. CPUs for real-time operational databases are almost always the limiting factor; that is why so many operational databases are adding the ability to compute on GPUs on top of CPUs (among other reasons, to leverage GPU vectorization). It's also why distributed databases have really accelerated in the market: they allow horizontal compute scaling, but that comes with latency, which can be undesirable in certain use cases.

In general an application or database that is written to scale with more compute cores will benefit from more compute cores. Efficiency might be reduced, but more load can be handled.

https://researchcomputing.princeton.edu/support/knowledge-base/scaling-analysis
 
In general an application or database that is written to scale with more compute cores will benefit from more compute cores. Efficiency might be reduced, but more load can be handled.

https://researchcomputing.princeton.edu/support/knowledge-base/scaling-analysis
Yes and no. There is also Amdahl's law: the serial fraction of the work caps the total speedup at 1/(1 - p) no matter how many cores you add, so each additional core buys less and less, a losing battle for anything over a dozen or two in most cases. IOW "will benefit" becomes smaller and smaller for each additional core, and can indeed go negative once coordination overhead dominates.
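For concreteness, Amdahl's law says the speedup on n cores is S(n) = 1 / ((1 - p) + p/n), where p is the fraction of the work that parallelizes. A quick sketch (p = 0.95 is an assumed value for illustration):

```c
#include <stdio.h>

/* Amdahl's law: speedup on n cores when a fraction p of the work
   parallelizes perfectly. The ceiling is 1/(1-p), regardless of n. */
static double amdahl(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void) {
    double p = 0.95;  /* assumed parallel fraction, for illustration */
    for (int n = 1; n <= 128; n *= 2)
        printf("%3d cores -> %5.1fx speedup\n", n, amdahl(p, n));
    /* With p = 0.95 the ceiling is 20x: 128 cores deliver only ~17.4x,
       and each doubling of cores buys less than the one before. */
    return 0;
}
```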

It must be fun to work on the HPC systems with a million processors, and I've heard from Google developers how they did a temporary workaround by assigning 10,000 more VMs to the problem, and stuff like that. The furthest I've gone was briefly having some 64 and 128 hyperthreads to play with; the idiot bank I was working with spent $$$$$ on those big, honking servers circa 2010, and $ on building the software to run on them, and AFAIK never got anything deployed at all.
 
I can never figure out from the press coverage just what is what. Exactly what applications work best on these high-core-count servers? ...
Are you for real?

Servers are not exclusively for VM and IT Services. There is an actual need for compute in today's reality.
 
Servers are not exclusively for VM and IT Services. There is an actual need for compute in today's reality.
Well first, I didn't say that, so I'm not sure what triggers your comment.

But if you asked me I *would* say it!

Everything I've done since like 2010 has moved to the cloud as fast as it could, and I can't always tell whether it's a VM there or not, but the last bunch of on-prem servers I used were wrapped up in VMs whether I liked it or not.

I can often run stuff faster on a laptop than on a really $$$ cloud configuration, given their pricing and capacity levels, but it is what it is.

What's a common example of people needing heavy compute in today's reality?

Even common laptops are now so powerful that they'll do for virtually everything, that is, everything that wouldn't really prefer to run on a supercomputer with a thousand or a million cores and GPUs and whatever.

My friend runs what used to be five full server loads in VMs on his i9 laptop today. OMG
 
I remember a few years back, when the first 5nm TSMC Apple Bionic processors with 12B transistors appeared, their manufacturing price was shocking: just around $1.50 per core, with a selling price of something like twice that. The whole 6-core processor sold at the unthinkable price of around $20.
I have no idea what you're talking about, but I can assure you that no 5nm SoC Apple ever made cost only $20. Not even the base manufacturing costs.

should both AMD and Intel, for almost the same silicon with minor differences, show a factor-of-50-100 price difference versus Apple?
I really don't see how you can compare them. Since Apple doesn't sell their chips outside of their devices, we don't really know what they would cost.

... but the conclusion about a huge, fishy discrepancy would be approximately the same.
What's fishy is the data on those Apple chips that you're basing all this off of.
 
Of course multicore processors are needed, and so far they scale up and perform great. And they could do great even with AI; just add native hardware for the missing 8- and 16-bit arithmetic.
MMX, SSE2+, AVX, AVX-512, and Intel's AMX instruction set extensions all have special support for low-precision arithmetic, like 8-bit ints and one or more of a few different 16-bit formats. Their AI performance is good, but still not on par with datacenter GPUs and purpose-built AI accelerators.
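As a concrete example, here's a minimal sketch of an int8 dot product using the AVX512-VNNI `_mm512_dpbusd_epi32` instruction; it assumes a CPU with AVX512-VNNI and compiler flags like `gcc -O2 -mavx512f -mavx512vnni`:

```c
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* 64 unsigned-by-signed 8-bit multiplies; VNNI sums each group of 4
       adjacent products into one of 16 32-bit accumulator lanes. */
    uint8_t a[64];
    int8_t  b[64];
    for (int i = 0; i < 64; i++) { a[i] = 2; b[i] = 3; }

    __m512i va  = _mm512_loadu_si512(a);
    __m512i vb  = _mm512_loadu_si512(b);
    __m512i acc = _mm512_dpbusd_epi32(_mm512_setzero_si512(), va, vb);

    /* Horizontal sum of the 16 partial sums: 64 * 2 * 3 = 384. */
    printf("dot = %d\n", _mm512_reduce_add_epi32(acc));
    return 0;
}
```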

My understanding is that Intel, around one or two decades ago, stopped supporting single-precision FP32 natively in favor of 64-bit.
x87 arithmetic is natively 80-bit. Most math done on modern CPUs uses SSE or AVX, instead. Each of these ISA extensions supports scalar arithmetic, not only vector operations. They have separate instructions, depending on whether you want to use fp32 or fp64.
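A small illustration: on x86-64, plain scalar C floating-point math compiles to SSE scalar instructions with distinct opcodes per precision, which you can confirm with `gcc -O2 -S`:

```c
#include <stdio.h>

/* On x86-64 this compiles to the scalar SSE instruction addss ... */
float  add_f32(float a, float b)   { return a + b; }

/* ... and this to addsd: separate fp32 and fp64 instructions, both
   native, neither routed through x87. */
double add_f64(double a, double b) { return a + b; }

int main(void) {
    printf("%f %f\n", add_f32(1.5f, 2.25f), add_f64(1.5, 2.25));
    return 0;
}
```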

Single-precision calculations started to be done in double-precision 64-bit mode, with the final result then truncated to 32 bits.
The x87 configuration actually depends on the OS, where I think Windows defaults to 64 bits and Linux defaults to 80 bits. The 32-bit truncation usually happens when an intermediate (or final result) is written out to a memory location that's only fp32. This behavior can lead to x87 fp arithmetic being very temperamental, and your results can change from one build to the next, with only minor, seemingly unrelated code changes made in between. It wasn't fun.
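The classic way to see that temperamental behavior is a double-rounding test. This is a sketch; the printed result genuinely depends on compiler and flags (e.g., a 32-bit build with `-mfpmath=387` versus the default SSE math on x86-64), which is exactly the problem being described:

```c
#include <stdio.h>

int main(void) {
    /* At x = 1e16 the spacing between adjacent doubles is 2.0, so
       x + 1 rounds back to x in pure 64-bit arithmetic and d == 0.
       If the intermediate stays in an 80-bit x87 register, x + 1 is
       exact and d == 1. Same source, different answers. */
    volatile double x   = 1e16;
    volatile double one = 1.0;
    double d = (x + one) - x;
    printf("d = %g\n", d);  /* 0 with SSE doubles; may be 1 via x87 */
    return 0;
}
```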

Now if they brought all that back, and maybe even added the 4-bit arithmetic some AI needs, AI would also start singing and dancing on classical processors. Every time precision drops, from FP64 to FP32 to FP16 to FP8, the throughput doubles; do the math.
It's potentially even better than you say, since some lower-precision arithmetic operations also have lower latency.
 
I don't know why this is even an article. Nobody is buying at the listed price.

Our new cluster uses the 8490H (Sapphire Rapids top SKU). Each blade's total cost (2 CPUs, RAM, InfiniBand) is about the list price of a single 8490H on Intel's website.
8490H is Sapphire Rapids, which is now 2 generations old. If you got a good deal on them, that might have something to do with it.