News: IBM boosts mainframes with 50% more AI performance: z17 features Telum II chip with AI accelerators

If you thought Nvidia's DGX systems were expensive...

I think the real story on the AI acceleration is that it's for people who need to use mainframes for regulatory reasons, or maybe they're just deep-pocketed and afraid to break with tradition. For the latter set, AI was probably the one thing that would lure them out of the mainframe world and into the cloud. IBM was probably worried that once they started using the cloud for AI, they might decide they could migrate their other computing tasks there, as well.
 
I imagine the on-die AI capabilities are mostly just standard gen-on-gen improvement, but the add-in board certainly isn't. It would be interesting to see what the practical application ends up being, given how these systems are typically used. Ever since I first saw the improvements IBM makes with each z-series generation, they've been fascinating to watch. They have so much design innovation it's easy to forget that at their core they're still about allowing native execution of code that goes back decades.
 
I imagine the on-die AI capabilities are mostly just standard gen-on-gen improvement, but the add-in board certainly isn't.
They said it uses the same AI cores as the CPUs. They don't say how many AI cores the CPU has, but if we assume it's only 1, then the Spyre should do up to 768 TOPS, which is still peanuts compared to Nvidia.
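For what it's worth, that estimate can be reproduced with a quick back-of-the-envelope calculation. The per-core rating and core count below are my assumptions, not confirmed figures:

```python
# Rough Spyre throughput estimate, assuming (not confirmed) that the
# on-die Telum II AI accelerator counts as one "AI core" rated at about
# 24 TOPS, and that a Spyre card packs 32 of the same cores.
telum_ii_ai_tops = 24   # assumed rating of one AI core
spyre_core_count = 32   # assumed cores per Spyre card
spyre_tops = telum_ii_ai_tops * spyre_core_count
print(spyre_tops)  # 768
```

If either assumption is off, the 768 figure scales accordingly.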

There's some more detail about non-AI parts of Telum II, here:

They have so much design innovation it's easy to forget that at their core they're still about allowing native execution of code that goes back decades.
Each CPU has only 8 cores, which is sorta shocking in this day and age, considering how much they cost. That legacy code probably should've been rewritten ages ago. Leaving that aside, I wonder how fast it'd run in a JIT emulator on a modern x86 or ARM server CPU.

The part I find most interesting is that they still manage to pull enough revenue to fund the development of a custom microarchitecture and ...well, everything else.
 
How does the performance of the Spyre AI accelerator cards compare to AI processors from other companies?
If we take the above figure of 768 TOPS and consider it to mean dense int8 tensor throughput, here are comparable scores for a few recent Nvidia products:

| Model    | TOPS (int8, dense) | Max Power (W) |
|----------|--------------------|---------------|
| L4       | 485                | 72            |
| L40      | 1448               | 300           |
| RTX 5090 | 1676               | 575           |
| H200     | 1979               | 700           |
| B200     | 4500               | 1200          |

I'm guessing the 768 figure is an overestimate, because they apparently use only 75 W each, according to this:


Based on that, Nvidia's L4 is a rather useful point of comparison. I think they're also made on a similar process node and use similar memory (the Spyre appears not to use HBM-class DRAM). At 75 W, the Spyre's cores are probably clocked a fair bit lower than the AI unit incorporated into each Telum II processor. It could also have fewer AI cores than I assumed, since my estimate attributed one unit to each Telum II. So my guess of 768 might even be something like 3x the real figure. Whatever the case, I think there's no way it's faster than an L4.
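One way to see why 768 is probably too high: at 75 W it would imply an int8 efficiency well above the L4's, even though they're on comparable process nodes. A quick sanity check, using the Nvidia numbers from the table above plus my guessed Spyre figure:

```python
# Dense int8 TOPS and max board power (W). Nvidia numbers are from the
# table above; the Spyre entry is my 768 TOPS guess at its quoted 75 W.
cards = {
    "Spyre (guess)": (768, 75),
    "L4": (485, 72),
    "L40": (1448, 300),
    "RTX 5090": (1676, 575),
    "H200": (1979, 700),
    "B200": (4500, 1200),
}
for name, (tops, watts) in cards.items():
    print(f"{name:14s} {tops / watts:5.1f} TOPS/W")
```

That would put the Spyre at roughly 10 TOPS/W versus about 6.7 for the L4, which seems implausible on a similar node, so the real figure is likely lower.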

BTW, I just took another look at the die shot and it's quite clear the AI unit contains at least two units - maybe 4 or even more?

https://substack-post-media.s3.amazonaws.com/public/images/cde2001f-57f5-4d61-bb2f-e8c311271b25_1915x1000.jpeg

Source: https://chipsandcheese.com/p/telum-ii-at-hot-chips-2024-mainframe-with-a-unique-caching-strategy

However many you think it is, divide the 768 TOPS figure by that. Then, maybe tweak it further for power/clock speed.
 