News Intel's 1500W TDP for Falcon Shores AI processor confirmed — next-gen AI chip consumes more power than Nvidia's B200

Intel's Falcon Shores hybrid processor combines x86 and Xe GPU cores to offer formidable performance for AI and HPC workloads.

Intel's Falcon Shores will be a multi-tile processor featuring both x86 cores (tiles) for general-purpose processing and Xe cores (tiles) for highly parallel AI and HPC workloads.

Uh?

Didn't Intel later downgrade and redesign the product into a GPU-only design? So it can now take the best of Intel's Gaudi AI accelerators, combined with the next-gen Xe graphics architecture, for HPC and other compute-heavy workloads.

Instead of CPU, GPU, and CPU+GPU configurations, we're just getting a GPU.
 
But anyway, Intel hasn't fully confirmed whether we will see x86 cores implemented in the future. Plans might change, but for now we can only expect a GPU-only design, IMO.

But if we look at the current market trend, most customers are now interested in discrete GPUs and AI chips, which is why the company has also prioritized the Gaudi AI processor. This whole "generative AI" hype has also dried up demand for the original Falcon Shores, which was supposed to integrate both GPU and CPU on a single chip.

Also, the current computing environment is not yet mature enough to achieve the initial goal of mixing CPU and GPU cores in the same Falcon Shores package. The "decoupling" of CPU and GPU could also give customers with different workloads more options.

I mean, the move might allow more customers to use a variety of different CPUs, including AMD's x86 and Nvidia's Arm chips, as well as their GPU designs, rather than limiting them to only Intel's x86 cores.

This is unlike AMD's Instinct MI300 and Nvidia's Grace Hopper offerings, which might tie customers' product designs more closely to the vendor's solution configuration. But those have their own benefits and drawbacks.

Btw, that's some insane power consumption figure, 1500+ watts! Someone needs to regulate and stop this.
 
I think this is the point where environmental agencies and regulatory bodies should intervene and set a power cap so they are all forced to start innovating on power efficiency again.

I know this is for datacenters / HPC, but consumer desktop hardware is edging dangerously toward this sort of environmentally unfriendly power consumption and dissipation. The reason is exactly that it is too costly for AMD / Intel / Nvidia to make two entirely different products for those two market segments -- the HPC market is now negatively impacting the consumer market.
 
Btw, that's some insane power consumption figure, 1500+ watts! Someone needs to regulate and stop this.
Let's hope regulators don't get involved. This is an area that will regulate itself. Generally, the bigger and more power-hungry a single module is, the more energy-efficient the overall system will be. What's better: a 1500 W module in one server, or five servers with 300 W modules? Yes, that simplistic example ignores a lot of nuance, but the general point is sound.
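A rough back-of-the-envelope sketch of that comparison (the per-server overhead figure is an assumption chosen purely for illustration, not a measured number):

```python
# Compare one big accelerator module vs. several smaller ones once the fixed
# per-server overhead (CPUs, fans, PSU losses, NICs) is counted.
SERVER_OVERHEAD_W = 400  # assumed overhead per server, illustrative only

def total_power(servers: int, module_w: int) -> int:
    """Total wall power for a deployment with one accelerator module per server."""
    return servers * (module_w + SERVER_OVERHEAD_W)

consolidated = total_power(servers=1, module_w=1500)  # one 1500 W module
distributed = total_power(servers=5, module_w=300)    # five 300 W modules

print(consolidated, "W vs.", distributed, "W")        # 1900 W vs. 3500 W
```

Same total accelerator wattage in both cases, but the consolidated box comes out well ahead once the fixed overhead of each extra server is included.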
 
Also, the current computing environment is not yet mature enough to achieve the initial goal of mixing CPU and GPU cores in the same Falcon Shores package. The "decoupling" of CPU and GPU could also give customers with different workloads more options.
Well, AMD and Nvidia are both doing it. AMD has the MI300A, while Nvidia has Grace + Hopper "superchips", which involves putting one of each on the same SXM board.

Btw, that's some insane power consumption figure, 1500+ watts! Someone needs to regulate and stop this.
The total power consumption doesn't tell you how efficient it is, and inefficiency is one major problem.

I think a better approach is to increase incentives for datacenters to reduce power consumption. That will encourage them to run power-hungry chips at lower clock speeds, where they tend to be more efficient, and probably to focus on buying more energy-efficient hardware rather than just the cheapest stuff.

Datacenter power consumption and cooling are already on track to become such big problems that governments probably won't even need to impose extra taxes or fines.
 
I know this is for datacenters / HPC, but consumer desktop hardware is edging dangerously toward this sort of environmentally unfriendly power consumption and dissipation. The reason is exactly that it is too costly for AMD / Intel / Nvidia to make two entirely different products for those two market segments -- the HPC market is now negatively impacting the consumer market.
I'm not sure you've thought that through sufficiently. AMD's EPYC Genoa uses like 380 W for 96 cores. That works out to about 4 W per core, which should see the 7950X pulling only 64 W.
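Spelling out that arithmetic with the figures stated above (it deliberately ignores clock speed, which is exactly the objection raised in the replies below):

```python
# Per-core power implied by the Genoa figure above, scaled to a 16-core 7950X.
genoa_package_w = 380
genoa_cores = 96
watts_per_core = genoa_package_w / genoa_cores  # ~3.96 W per core

ryzen_cores = 16
implied_7950x_w = watts_per_core * ryzen_cores  # ~63 W

print(f"{watts_per_core:.2f} W/core -> ~{implied_7950x_w:.0f} W for 16 cores")
# 3.96 W/core -> ~63 W for 16 cores
```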
 
Well, AMD and Nvidia are both doing it. AMD has the MI300A, while Nvidia has Grace + Hopper "superchips", which involves putting one of each on the same SXM board. The total power consumption doesn't tell you how efficient it is, and inefficiency is one major problem.

Of course, we know that. The point was to mention why INTEL is currently putting more effort into a GPU-only solution, despite AMD and Nvidia going the other route (and Intel not jumping on that bandwagon).

That's why the "decoupling" of CPU and GPU might help in specific use cases.

This is unlike AMD's Instinct MI300 and Nvidia's Grace Hopper offerings, which might tie customers' product designs more closely to the vendor's solution configuration. But those have their own benefits and drawbacks.


Btw, I will talk about the power "efficiency" topic in a different comment. I knew efficiency is a factor, but I didn't mention it in my OP.
 
I'm not sure you've thought that through sufficiently. AMD's EPYC Genoa uses like 380 W for 96 cores. That works out to about 4 W per core, which should see the 7950X pulling only 64 W.
It probably does when it runs at the same clocks that the server runs at.
"Overclocking" the 7950x to 5.1Ghz all core, down from the dynamic ~5.3-5.4 that AMD is trying to get them to run at, already gets it down to 95W average.
https://www.techpowerup.com/review/amd-ryzen-9-7950x/24.html
[Chart: power consumption in applications, from the TechPowerUp review linked above]
 
This is an area that will regulate itself.
Please stop with this capitalism myth about greedy corporations regulating themselves -- it has never worked so far and it will never, ever work as long as greed is a factor.
I'm not sure you've thought that through sufficiently. AMD's EPYC Genoa uses like 380 W for 96 cores. That works out to about 4 W per core, which should see the 7950X pulling only 64 W.
I don't see how that's relevant when those cores run at 2.4 GHz (3.55 GHz max all-core turbo), which is like ~0.6x of what desktop cores are clocked at.
 
I think this is the point where environmental agencies and regulatory bodies should intervene and set a power cap so they are all forced to start innovating on power efficiency again.

I know this is for datacenters / HPC, but consumer desktop hardware is edging dangerously toward this sort of environmentally unfriendly power consumption and dissipation. The reason is exactly that it is too costly for AMD / Intel / Nvidia to make two entirely different products for those two market segments -- the HPC market is now negatively impacting the consumer market.
Power draw and efficiency are completely different things.

Consumer desktop stuff, whether we are talking about CPUs or GPUs, is incredibly efficient nowadays. Just don't run them at absurd power limits, and voila. For example, a 4090 at 220 W is faster than a 3090 at 520 W.
 
Bear in mind this is a multi-die package for high-density compute. Depending on the number of dies, 1200 W could be spread amongst four or more full-size GPU dies, each of which would be perfectly happy drawing several hundred watts if placed on an individual board, plus HBM packages, plus I/O dies.
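To put rough numbers on that point (the die counts below are just the "four or more" guess from the comment above, not a confirmed configuration):

```python
# Illustrative split of a 1200 W compute budget across several GPU dies.
compute_budget_w = 1200
for gpu_dies in (4, 6, 8):
    print(f"{gpu_dies} dies -> {compute_budget_w / gpu_dies:.0f} W per die")
# 4 dies -> 300 W per die
# 6 dies -> 200 W per die
# 8 dies -> 150 W per die
```

Seen that way, each individual die lands in the same power range as an ordinary discrete GPU.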
 
Please stop with this capitalism myth about greedy corporations regulating themselves -- it has never worked so far and it will never, ever work as long as greed is a factor.
I mean this will be regulated by the natural laws of physics. I'm not indicating any expectation of corporate behavior. Taken in the context of the rest of my original comment, adding an artificial "regulated" cap of, say, 300 W per module will potentially make the total deployment less efficient (more wasted power/heat), as the deployment would require that many more supporting servers to hit a given performance target.
 
Power draw and efficiency are completely different things.

Consumer desktop stuff, whether we are talking about CPUs or GPUs, is incredibly efficient nowadays. Just don't run them at absurd power limits, and voila. For example, a 4090 at 220 W is faster than a 3090 at 520 W.
Underpowering, ideally a combination of undervolting and underclocking, does wonders. AMD GPUs have been great undervolters for several generations now (also one reason why they were so popular during the cryptomining craze), and the same is pretty much true of many Nvidia GPUs, as you mentioned, @TheHerald. Manufacturers should be optimizing more for power efficiency, but they are optimizing for performance for the sake of marketing and competition.

I think this is what others are trying to say in this thread -- that efficiency still takes too much of a back seat to performance on these high-end chips. The memory bandwidth and capacity are there with these huge allotments of HBM3, HBM4, etc. for AI training -- it's just a matter of clocks coming back down a little, with voltage following and power therefore falling roughly with the square of the voltage drop. Intel in particular is a great offender at sitting too high on the power-voltage-clock curve, as seen in the high TDPs of 14th-gen CPUs, especially the 14900K/KS. This is certainly done to compete with AMD, but it comes at the cost of more consumption on the grid for what really doesn't matter at the end of the day in the real world: that last little squeeze of performance. AMD's X3D processors are an excellent example, as they are purposely held back in clock speed and peak power consumption, but they perform like champs.
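A minimal sketch of that voltage/clock relationship, using the usual dynamic-power approximation P ≈ C·V²·f (the percentages are illustrative assumptions, not measured silicon behavior):

```python
# Dynamic power scales roughly with frequency and the square of voltage:
#   P_dynamic ~ C * V^2 * f
def relative_dynamic_power(v_scale: float, f_scale: float) -> float:
    """Dynamic power relative to stock when voltage and frequency are scaled."""
    return (v_scale ** 2) * f_scale

# Illustrative example: clocks down 10%, voltage follows down 8%.
tuned = relative_dynamic_power(v_scale=0.92, f_scale=0.90)
print(f"~{(1 - tuned) * 100:.0f}% less dynamic power for ~10% lower clocks")
# ~24% less dynamic power for ~10% lower clocks
```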

Back to efficiency vs. raw performance: it's also true, as some have said, that 1500 W in itself isn't an indicator of efficiency; we have to see what the performance level is at that rating. Fortunately, these chips, like most, don't run at peak power 100% of the time, but chips for AI training are probably living harder lives than the rest of their datacenter counterparts.

Lastly, no, I don't see direct regulation being necessary at this time; I agree rather with incentives and disincentives, especially taxes and/or higher electricity rates for the biggest consumers. Microsoft probably already anticipates this, which is why it's worth it to them to go as far as spending huge dollars on self-owned nuclear power generation. And no, megacorps don't self-regulate -- they want and expect governments to set the bars because they want to know what the playing field is and for everyone to have that same playing field... well, they all like their carve-outs and loopholes, but that's for another time and place.
 
Underpowering, ideally a combination of undervolting and underclocking, does wonders. AMD GPUs have been great undervolters for several generations now (also one reason why they were so popular during the cryptomining craze), and the same is pretty much true of many Nvidia GPUs, as you mentioned, @TheHerald. Manufacturers should be optimizing more for power efficiency, but they are optimizing for performance for the sake of marketing and competition.

I think this is what others are trying to say in this thread -- that efficiency still takes too much of a back seat to performance on these high-end chips. The memory bandwidth and capacity are there with these huge allotments of HBM3, HBM4, etc. for AI training -- it's just a matter of clocks coming back down a little, with voltage following and power therefore falling roughly with the square of the voltage drop. Intel in particular is a great offender at sitting too high on the power-voltage-clock curve, as seen in the high TDPs of 14th-gen CPUs, especially the 14900K/KS. This is certainly done to compete with AMD, but it comes at the cost of more consumption on the grid for what really doesn't matter at the end of the day in the real world: that last little squeeze of performance. AMD's X3D processors are an excellent example, as they are purposely held back in clock speed and peak power consumption, but they perform like champs.

Back to efficiency vs. raw performance: it's also true, as some have said, that 1500 W in itself isn't an indicator of efficiency; we have to see what the performance level is at that rating. Fortunately, these chips, like most, don't run at peak power 100% of the time, but chips for AI training are probably living harder lives than the rest of their datacenter counterparts.

Lastly, no, I don't see direct regulation being necessary at this time; I agree rather with incentives and disincentives, especially taxes and/or higher electricity rates for the biggest consumers. Microsoft probably already anticipates this, which is why it's worth it to them to go as far as spending huge dollars on self-owned nuclear power generation. And no, megacorps don't self-regulate -- they want and expect governments to set the bars because they want to know what the playing field is and for everyone to have that same playing field... well, they all like their carve-outs and loopholes, but that's for another time and place.
The 14900KS is a great example: due to binning, it's the most efficient CPU Intel has ever produced. It's now down to the user to utilize that efficiency by running sensible power limits, or to go kamehameha mode at 450 watts. But you can't do the latter and then complain that hardware isn't efficient nowadays.
 
IMO, Intel might be the maker, but they wouldn't make it unless customers demanded it.

If you want to regulate something to change behavior, the only thing needed is a higher utility bill for mega-consumers.
Charge a premium for electricity and water usage. If the giga data center corporations don't like that, they are welcome to build their own. No more free rides piggybacking off infrastructure paid for by taxpayers.

It's not just electricity usage we have to worry about. Water usage plays a critical role in cooling data centers. Look no further than Google's insatiable and growing annual water usage: it was 5.6 billion gallons in 2022, and this number is expected to be even higher today. In fact, some speculate the real figure is higher still, as some governments made deals with Google not to collect its water usage data.
 
IMO, Intel might be the maker, but they wouldn't make it unless customers demanded it.

If you want to regulate something to change behavior, the only thing needed is a higher utility bill for mega-consumers.
Charge a premium for electricity and water usage. If the giga data center corporations don't like that, they are welcome to build their own. No more free rides piggybacking off infrastructure paid for by taxpayers.

It's not just electricity usage we have to worry about. Water usage plays a critical role in cooling data centers. Look no further than Google's insatiable and growing annual water usage: it was 5.6 billion gallons in 2022, and this number is expected to be even higher today. In fact, some speculate the real figure is higher still, as some governments made deals with Google not to collect its water usage data.
Yeah, because the jobs, imports, and taxes the megacorps bring are all make-believe and don't exist... governments aren't THAAAAAT stupid.
And no, megacorps don't self-regulate -- they want and expect governments to set the bars because they want to know what the playing field is and for everyone to have that same playing field... well, they all like their carve-outs and loopholes, but that's for another time and place.
They have to pay for the power they use (not just electricity, but space, cooling, whatnot), so they will want to make the best use of it. In that sense, they do self-regulate, because they can only spend so much money on so much performance before it doesn't make sense anymore.
 
Yeah, because the jobs, imports, and taxes the megacorps bring are all make-believe and don't exist... governments aren't THAAAAAT stupid.

They have to pay for the power they use (not just electricity, but space, cooling, whatnot), so they will want to make the best use of it. In that sense, they do self-regulate, because they can only spend so much money on so much performance before it doesn't make sense anymore.
Regulation implies outside imposed limits, no?
 
Of course, we know that. The point was to mention why INTEL is currently putting more effort into a GPU-only solution, despite AMD and Nvidia going the other route (and Intel not jumping on that bandwagon).
My guess is they decided to "pull out all the stops" to simplify the product and get their development team back on track. They can't afford another disaster like Ponte Vecchio.

Btw, I will talk about the power "efficiency" topic in a different comment. I knew efficiency is a factor, but I didn't mention it in my OP.
Efficiency is one of the main problems, though. If we have a hypothetical Intel GPU at 1.5 kW, and it's 4x as fast as an AMD GPU that burns 0.75 kW, then you'd be better off running that one Intel chip instead of four of the AMD chips, assuming you need the performance.
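Putting numbers on that hypothetical (only the wattages and the 4x figure stated above are used):

```python
# A 1.5 kW GPU that is 4x as fast as a 0.75 kW GPU: total power to hit the
# same throughput target.
fast_w, fast_perf = 1500, 4.0  # hypothetical relative performance units
slow_w, slow_perf = 750, 1.0

target = 4.0                                 # in units of the slower chip
fast_total = (target / fast_perf) * fast_w   # 1 chip  -> 1500 W
slow_total = (target / slow_perf) * slow_w   # 4 chips -> 3000 W

print(fast_total, "W vs.", slow_total, "W")  # 1500.0 W vs. 3000.0 W
```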
 
On its own, no. For example, you are regulating the water flow of your taps all the time without any imposed limits; you are the outside limit-imposer on the tap.
Also, that's why we call it self-regulation, to avoid any confusion.
The context of the statement raises the question of corporate regulation, and self-regulating companies are uncommon when such regulations hamper either their bottom line or the perceived competitiveness of their products. Intel is financially obligated to make its products perform at the highest level possible, for several reasons. This is completely at odds with energy efficiency, because to get that last 5% of performance you need to double the power consumption.

To be completely fair, I understand your usage of the word, but because "regulate" carries so much implicit baggage, like government oversight, there may be a better word choice available.
 