News Tachyum releases a 1,600-page performance optimization manual despite continued tape-out delays and no actual silicon

Any thoughts on this processor?
It's a new ISA because... why?? Their original announcement sounded much more like VLIW or EPIC, but then they pivoted towards something more like a traditional RISC.

As far as I can tell, what supposedly makes it good for AI and HPC is that each core has vector extensions and a matrix-multiply engine. You could do that with RISC-V, so why again is it a custom ISA? Even ARM now has SME (Scalable Matrix Extension), though I'm not aware of any cores that have implemented it yet.

It feels like every few years I hear about a promising HPC processor, but its makers seem to chronically underestimate the time it takes to bring something like that to market, and by the time they do, the mainstream guys have caught up and usually even passed them.

The main things they seem to have going for them are being an indigenous European project and having backing from the Slovakian (?) government. However, the competition they're up against includes the European Processor Initiative, which is currently using ARM and has already planned to switch over to RISC-V.

Also, for those who don't mind buying a non-European ARM CPU, Fujitsu's Monaka should provide stiff competition in the HPC sector:

Lastly, with power consumption of up to 950 W, I'm skeptical how appealing they're going to be for general-purpose cloud workloads, which they also claim to be targeting:

Don't get me wrong, though. I'm not wishing to see them fail. I wouldn't even mind being proven wrong, because I think having a greater range of computing platforms is generally a good thing. I'm just trying to be realistic about their prospects. I appreciate the monumental amount of work they've put in thus far, and it saddens me to think it might all be for nought.

Anyway, what's much more intriguing to me is NextSilicon's Maverick-2:


The idea of dynamically building dataflow pipelines sounds like it has a lot more potential to maximize silicon utilization. That should ultimately lead to better perf/W and better perf/$ than Tachyum's approach, which is basically just following what ARM, Intel, and AMD are all doing, but with slightly wider vector pipelines and more general matrix extensions than AMX currently has.
 