News Jim Keller slams Nvidia's CUDA, x86 — 'Cuda’s a swamp, not a moat. x86 was a swamp too

Admin · Feb 19, 2024

Legendary processor architecture designer criticizes CUDA and compares it to x86.

Jim Keller slams Nvidia's CUDA, x86 — 'Cuda’s a swamp, not a moat. x86 was a swamp too : Read more

NinoPino · Feb 19, 2024

There is a typo, it is MIPS not MISC.

ezst036 · Feb 19, 2024

Maybe Jim should stop flapping his lips and see to it that Tenstorrent produces some motherboards.

Without some ATX boards, we can't use his RISC-V designs.

With that said, we're stuck in the swamp and he isn't supplying any rope so we can try to stop drowning in it. So what do I care what he says? He isn't producing any solutions to my problem that's for sure. Talk is cheap.

This is just another installment in the daily drama.

Argolith · Feb 19, 2024

ezst036 said:
Maybe Jim should stop flapping his lips and see to it that Tenstorrent produces some motherboards.

Without some ATX boards, we can't use his RISC-V designs.

With that said, we're stuck in the swamp and he isn't supplying any rope so we can try to stop drowning in it. So what do I care what he says? He isn't producing any solutions to my problem that's for sure. Talk is cheap.

This is just another installment in the daily drama.

Sorry if a single guy doesn't break up corporate monopolies on his own.

The Hardcard · Feb 19, 2024

Tenstorrent does have silicon out. I don’t think it’s high volume manufacturing yet.

Also by “we” you probably need to mean corporate. I think all of these AI startups are chasing data center dollars, so be surprised if there anything selling for less than $25K for a complete running system. And that’s for efficiency plays. Any system that they can show is faster than an H100 system and already has a useable software stack will cost far more.

Findecanor · Feb 19, 2024

People rooting for RISC-V have been rooting for Tenstorrent's announced 8-wide Ascalon cores.

But apparently, those are not actually what Tenstorrent is prioritising. They are mostly intended to be "companion processors" to TPUs.
Ascalon has been announced to support only RV64GCV, whereas to reach feature parity with x86 and ARM and for RISC-V software to kick off, you'd need RVA23 compliance.

So, IMHO, Tenstorrent is also wading around in the fragmented RISC-V swamp instead of helping build solid software foundations.

Pierce2623 · Feb 19, 2024

Findecanor said:
People rooting for RISC-V have been rooting for Tenstorrent's announced 8-wide Ascalon cores.

But apparently, those are not actually what Tenstorrent is prioritising. They are mostly intended to be "companion processors" to TPUs.
Ascalon has been announced to support only RV64GCV, whereas to reach feature parity with x86 and ARM and for RISC-V software to kick off, you'd need RVA23 compliance.

So, IMHO, Tenstorrent is also wading around in the fragmented RISC-V swamp instead of helping build solid software foundations.

Fully agreed. At least CISC computing is mostly unified behind one ISA. RISC V is useless until it has a software stack that actually does things. The tenstorrent CPU that people were excited about doesn’t even comply well enough with RISC V to run much of a what is a very thin software stack anyways. It basically turned out to be a driver for their AI solution and nothing else.

hsv-compass · Feb 19, 2024

Pierce2623 said:
Fully agreed. At least CISC computing is mostly unified behind one ISA. RISC V is useless until it has a software stack that actually does things. The tenstorrent CPU that people were excited about doesn’t even comply well enough with RISC V to run much of a what is a very thin software stack anyways. It basically turned out to be a driver for their AI solution and nothing else.

x86, Arm, MIPS all have unified ISA and mature software stack. Never understand the need for just another poorly supported ISA (YARI - yet another RISC ISA). Is MIPS ISA really royalty free (MIPS open)?

bit_user · Feb 19, 2024

NinoPino said:
There is a typo, it is MIPS not MISC.

When did he work on MIPS? He co-architected a DEC Alpha, but that was its own (RISC) ISA, and not MIPS-based.

I also thought this was a typo, but I'm not sure what was intended. Anyway, I looked it up and MISC is apparently a thing. I'm still unconvinced that was intended, as it doesn't seem very useful to mention in such a brief summary.

https://en.wikipedia.org/wiki/Minimal_instruction_set_computer

Honestly, it would be easier just to list the places Jim worked (and delivered!) as a chip architect:

DEC
AMD (twice)
Apple
Tesla

He also briefly oversaw developments at Intel, but that was higher-level and more forward looking than a chip architect role.

we might not see his name on the Nvidia roster any time soon.

I think that's a safe bet. He surely doesn't need another salaried position. With Tenstorrent, he's clearly decided to throw his hat into the startup game.

Plus, at this point, I'm sure Nvidia has quite a deep roster. They don't really need him like his other recent employers did.

bit_user · Feb 19, 2024

ezst036 said:
Maybe Jim should stop flapping his lips and see to it that Tenstorrent produces some motherboards.

Without some ATX boards, we can't use his RISC-V designs.

First, they have built boards, such as PCIe cards. However, that's for AI chips where you just need to add DRAM, a VRM, and then you have everything you need to plug it into a PC and start using it.

For RISC-V, Tenstorrent has focused on just the IP and building CPU tiles. Jim has given talks (you can find them on youtube), where he's extolled the virtues of chiplets. He's said a small company like his shouldn't have to deal with all of the various enablement IP, like memory controllers, PCIe, storage controllers, etc.

So, not only are they not building motherboards for their RISC-V cores, they're not even making complete SoCs. I think it totally makes sense. If he's right, you'll be able to use their cores in CPUs and SoCs from others. However, the lead time on such developments is usually a couple years from the point at which the IP is completed and validated. So, I wouldn't expect to see their RISC-V cores in any devices, just yet.

ezst036 said:
With that said, we're stuck in the swamp and he isn't supplying any rope so we can try to stop drowning in it.

Sure, he is! Tenstorrent makes AI accelerators backed by an open API & toolchain and supported in major AI frameworks.

I think you need to read between the lines, a little more. His statement is probably meant to counter the narrative Tenstorrent might be facing from potential customers. Maybe they keep hearing the refrain "...but CUDA!", in their sales calls? Just a guess.

bit_user · Feb 19, 2024

The Hardcard said:
I think all of these AI startups are chasing data center dollars, so be surprised if there anything selling for less than $25K

Last year, Tenstorrent inked a deal with a LG, where their AI chiplets will be used in future TVs for AI postprocessing and upscaling.

Tenstorrent Partners with LG to Build AI and RISC-V Chiplets for Smart TVs of the Future

TORONTO, CA – May 30, 2023 Tenstorrent and LG Electronics Inc. (LG) are pleased to announce they are collaborating to build a new generation of RISC-V, AI and Video Codec Chiplets to potentially power…

tenstorrent.com

While looking that up, I also found a partnership announcement in the automotive industry:

BOS and Tenstorrent Partner to Develop Automotive Semiconductor

SANTA CLARA, CA – On October 10th, BOS Semiconductors, a South Korean automotive semiconductor startup, announced that it is partnering with Tenstorrent, an AI semiconductor company based in North…

tenstorrent.com

bit_user · Feb 19, 2024

hsv-compass said:
x86, Arm, MIPS all have unified ISA and mature software stack. Never understand the need for just another poorly supported ISA (YARI - yet another RISC ISA). Is MIPS ISA really royalty free (MIPS open)?

Why no love for POWER? It's way more relevant than MIPS, these days. Still, that's not saying much...

https://en.wikipedia.org/wiki/OpenPOWER_Foundation

hsv-compass · Feb 19, 2024

bit_user said:
Why no love for POWER? It's way more relevant than MIPS, these days. Still, that's not saying much...

https://en.wikipedia.org/wiki/OpenPOWER_Foundation

ack!

yeyibi · Feb 19, 2024

CUDA is at reach of anybody with a nvidia GPU, because apart from AI, it is useful for gaming, video streams, graphic works, and other stuff.

So teenagers can start playing with CUDA, and developers know that their software requiring CUDA would had a large user base from day zero.

Tenstorrent doesn't has that advantage.

Findecanor · Feb 19, 2024

hsv-compass said:
Is MIPS ISA really royalty free (MIPS open)?

Within only a couple years, MIPS was opened, then it was closed again and then the company that owned MIPS (now itself also named "MIPS") announced the discontinuation of the ISA and that all their future processors would be RISC-V.

bit_user said:
When did [Jim Keller] work on MIPS?

He had been hired by SiByte in 1999 to work on a MIPS-based processor. SiByte was acquired by Broadcom in '00, and he left to found PA Semi in 2004. It is mentioned on the Wikipedia page about him.

bit_user · Feb 19, 2024

yeyibi said:
CUDA is at reach of anybody with a nvidia GPU, because apart from AI, it is useful for gaming, video streams, graphic works, and other stuff.

No, it's not. CUDA is not used by games and isn't supported on all Nvidia GPUs! IIRC, the GT 1030 didn't support it.

CUDA is a compute API. It's a completely separate software stack than what they use for interactive graphics!

yeyibi said:
So teenagers can start playing with CUDA, and developers know that their software requiring CUDA would had a large user base from day zero.

Tenstorrent doesn't has that advantage.

The advantage Tenstorrent has is that most AI developers never write a single line of CUDA. They use AI frameworks, like PyTorch. If that's your experience with AI, then you don't know or care what's underneath, as long as it works and it's fast.

So, in other words, teenagers can start playing with AI using their Nvidia (also AMD or Intel) dGPUs, but then migrate their AI models to the cloud, where it's processed by Amazon, Google, or other backends.

domih · Feb 19, 2024

All several decades old software stack (and ISA) accumulate layers of new stuff throughout the years. It just shows that they are being updated and improved according to new needs from the industry. If they were not, they would become obsolete and die. Yes, during these updates, one can try minimizing the "bazaar" factor, but it is not that easy. New features can also become obsolete much quickly than expected and become dead weight in the stack. It's like Mother Nature keeping adding to DNA or keeping adding new parts to the brain of mammals (the human brain is a mess of new parts added on top on old parts),

"Tabula rasa" is not possible if billions of systems and users expect compatibility.

One of the best examples is the Web (HTTP protocol, HTML/CSS/JS/WASM/... stack). Even its creator (Sir Berners-Lee) said that if he had known what it would become, he would have designed it differently to prevent the bazaar we see today. Redesigning the Web into a new incompatible stack would be an utopia. People would only accept to make a drastic switch if the new protocol would bring several order of magnitudes more possibilities, performance, power savings, etc.

There is a reason why "Wintel" has survived and leads all over the years: backward compatibility guarantying large adoption.

It's only with the advent of new devices, e.g. smart phones, that x86 got bypassed by ARM because ARM became powerful enough for these devices while consuming much less, thus making these devices possible.

bit_user · Feb 19, 2024

Findecanor said:
He had been hired by SiByte in 1999 to work on a MIPS-based processor. SiByte was acquired by Broadcom in '00, and he left to found PA Semi in 2004. It is mentioned on the Wikipedia page about him.

Thanks. I forgot about that.

bit_user · Feb 19, 2024

domih said:
"Tabula rasa" is not possible if billions of systems and users expect compatibility.

One of the best examples is the Web (HTTP protocol, HTML/CSS/JS/WASM/... stack).

Disagree. These standards go through major revisions, where cruft is deprecated. For example HTML 5 cleaned up a lot of legacy from before.

Yes, maybe you still have to support the old versions of the standard, but those can be segregated off from implementation of support for the newer stuff, at which point the maintenance burden is considerably lessened.

domih said:
There is a reason why "Wintel" has survived and leads all over the years: backward compatibility guarantying large adoption.

Well, the Windows part of "Wintel" sure isn't gaining market share.

As for the x86 part, neither is that. ARM is ascendant in the cloud. China is moving towards ARM and LoongArch, with RISC-V possibly set to displace their embrace of ARM. I'd say the main reason x86 has staying power isn't really backwards-compatibility, per se, but simply risk-aversion. It's a known quantity, so people feel some safety in that.

To overcome this inertia, you need a markedly better value proposition, but AMD and Intel have continued to improve x86 enough to keep it viable and narrow the gap for anything else to step in. Even so, in the cloud market, ARM has got its foot in the door and is prying it open.

CmdrShepard · Feb 19, 2024

He said "It was built by piling on one thing at a time" -- I dare you give an example of anything that wasn't made by iterating the design as the capabilities and demands changed.

Such a dumb comment should be ridiculed, not promoted.

Alas, journalism doesn't exist anymore. This isn't journalism, it's regurgitating crap from social media. Parrots can write better news.

bit_user · Feb 19, 2024

CmdrShepard said:
He said "It was built by piling on one thing at a time" -- I dare you give an example of anything that wasn't made by iterating the design as the capabilities and demands changed.

You're confusing different things. What you're talking about is iteration on a design where you're not burdened with the legacy of backward compatibility. In this case, you can remove or streamline things that don't work or are needlessly complex. You can find better ways to do things and optimize the design to emphasize them, etc.

What CUDA has to deal with is supporting 17+ years worth of just about every idea they ever had, whether it was good or bad and regardless of how it interacted with the other features of the language/API. In that sense, it's like the C++ of GPU Compute APIs (in the sense that C++ was heavily-evolved and needlessly complex for what it does).

CUDA is basically like a dirty snowball, rolling down a hill, accumulating random cruft as it rolls along. You can't pull anything out of it, for fear of breaking some codebase or another, so it all just accumulates.

I did a little CUDA programming around 2010 or so, before I picked up a book on OpenCL and started dabbling with it. OpenCL immediately seemed so much cleaner and more self-consistent. It's like they took all of the core ideas that had been proven in CUDA and other GPGPU frameworks and re-implemented them with a blank slate. That's one thing I like about OpenCL and SYCL.

CmdrShepard said:
Such a dumb comment should be ridiculed, not promoted.

And what have you done to elevate yourself to such a vaunted status where we should take your word over that of such an industry luminary?

CmdrShepard · Feb 19, 2024

bit_user said:
No, it's not. CUDA is not used by games and isn't supported on all Nvidia GPUs!

CUDA is used by games indirectly -- PhysX engine uses CUDA.

bit_user said:
CUDA is a compute API. It's a completely separate software stack than what they use for interactive graphics!

Not really, CUDA can be used for whatever you want. For example, OptiX SDK uses CUDA, Iray 3D ray-tracing plugin uses OptiX SDK, Daz Studio and some other 3D rendering and animation apps use Iray plugin. NVIDIA Video Codec SDK uses CUDA, NVIDIA Performance Primitives are built on CUDA and are used by a lot of apps. Heck, even JPEG encoding / decoding can be done fully on GPU with NVJPEG library (again CUDA).

bit_user said:
The advantage Tenstorrent has is that most AI developers never write a single line of CUDA.

That's true, but NVIDIA also supports other AI / ML frameworks and SDKs through CUDA. Not everyone wants to deal with Python you know.

TL;DR -- CUDA has been the first, it has the widest adoption, massive ecosystem of apps using it directly or indirectly, mature features, and superior support both in software and hardware. When it comes to GPGPU everything else is "also-ran".

Anyone dissing CUDA is therefore not being honest.

bit_user · Feb 19, 2024

CmdrShepard said:
Not really, CUDA can be used for whatever you want.

Sure, in the same sense that you can write interactive renderers using OpenCL or compute apps using OpenGL or Direct3D.

The point is that it's not designed for interactive rendering, as @yeyibi incorrectly stated. That's what Direct3D and Vulkan are for.

You can probably find some corner cases where people did use it for some sort of interactive demos or the like, but highlighting them would amount to a misrepresentation and muddy waters that really should be quite clear, just to satisfy your apparent contrarian urges.

CmdrShepard said:
NVIDIA also supports other AI / ML frameworks and SDKs through CUDA. Not everyone wants to deal with Python you know.

Even then, you still don't have to drop to the level of writing CUDA code. I integrated the Torch framework into a product by using its C++ API, and that didn't involve any CUDA, at all. That's because CUDA is used to support these frameworks as a back-end.

CmdrShepard said:
Anyone dissing CUDA is therefore not being honest.

This is a ridiculous assertion.

CUDA has its strengths and weaknesses. Just being first and dominant doesn't mean it's beyond reproach. We can find countless examples of dominant standards, technologies, APIs, etc. that have plenty of weaknesses and flaws.

CmdrShepard · Feb 19, 2024

bit_user said:
You can probably find some corner cases where people did use it for some sort of interactive demos or the like, but highlighting them would amount to a misrepresentation and muddy waters that really should be quite clear, just to satisfy your apparent contrarian urges.

I gave you an example of Iray which can perform interactive ray-tracing on RTX hardware. You can download a free copy of Daz Studio and try it out yourself -- on my RTX 4090 I can manipulate scene viewport with ray tracing on and a couple of professional RTX cards with NVLink can do even better than my single cards. Backend for Iray? CUDA.

bit_user said:
Even then, you still don't have to drop to the level of writing CUDA code. I integrated the Torch framework into a product by using its C++ API, and that didn't involve any CUDA, at all. That's because CUDA is used to support these frameworks as a back-end.

I never said you have to, just that there are people who don't want Python at all.

bit_user said:
CUDA has its strengths and weaknesses. Just being first and dominant doesn't mean it's beyond reproach. We can find countless examples of dominant standards, technologies, APIs, etc. that have plenty of weaknesses and flaws.

I never said it is beyond reproach or criticism -- what I said is that just dissing it without providing any context or examples is disingenious at best.

What you said above just proves my original point -- nothing is perfect in the first iteration, everything is iterative and has to deal with legacy cruft at some point.

If you are basing your opinion on CUDA on experience from 2010 you might be surprised if you revisited it.

CmdrShepard · Feb 19, 2024

Oh and by the way, he said "There is a good reason there is Triton, Tensor RT, Neon, and Mojo" -- too bad there isn't some search engine that crawls the Internet which he could have used to find Tensor RT GitHub repo and see that you need CUDA to build it.

He also said "If you do write CUDA, it is probably not fast" -- I wonder how those other frameworks that rely on CUDA work so well then?

IMO he should stick to designing CPUs.

News Jim Keller slams Nvidia's CUDA, x86 — 'Cuda’s a swamp, not a moat. x86 was a swamp too

Administrator

Reputable

Honorable

Prominent

Distinguished

Distinguished

Commendable

Prominent

Titan

Titan

Titan

Titan

Prominent

Great

Distinguished

Titan

Honorable

Titan

Titan

Prominent

Titan

Prominent

Titan

Prominent

Prominent

Share this page