News: Jim Keller slams Nvidia's CUDA, x86 — 'CUDA's a swamp, not a moat. x86 was a swamp too'


CmdrShepard
They have a software stack they call TT-Buda
If I am reading that right, they side-stepped adding support for their hardware in those frameworks (which would've been an open-source-y thing to do) by converting existing frameworks and models to their proprietary format in order to run them on their hardware?

And the second option is C++ access to some "kernel" (supposedly well documented) in a roll-your-own fashion without the benefit of open-source collaboration?

Provided I am not mistaken that's:

1. Trying to get a vendor lock-in on customers
2. Being open-source in name but not in spirit (leeching from other frameworks, not giving anything back)

Thanks for posting that, now I dislike him even more.
 

bit_user
If I am reading that right, they side-stepped adding support for their hardware in those frameworks
I had some similar thoughts. It's probably worth actually diving into the details, because some of their comments suggest otherwise.

Anyway, to the extent someone cares, they can investigate further. Since I have no stake in the matter, I'm done with this.

And the second option is C++ access to some "kernel" (supposedly well documented) in a roll-your-own fashion without the benefit of open-source collaboration?
You're referring to Metalium? I believe that's how you program the hardware directly, since it doesn't support CUDA.

"The figure below shows the software layers that can be built on top of the TT-Metalium platform. With TT-Metalium, developers can write host and kernel programs that can implement a specific math operation (e.g., matrix multiplication, image resizing etc.), which are then packaged into libraries. Using the libraries as building blocks, various frameworks provide the user with a flexible high-level environment in which they can develop a variety of HPC and ML applications."

[Attached image: diagram of the software layers built on top of the TT-Metalium platform]


Source: https://tenstorrent.com/software/tt-metalium/
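
To picture the layering they describe, here's a tiny, self-contained C++ sketch of the idea: a "kernel" implementing one math op, a library function wrapping it, and a "framework"-level function composed from library ops. To be clear, none of these names are real TT-Metalium APIs; it's plain C++ with std::vector standing in for device buffers, purely to illustrate the structure.

C++:
#include <cstdio>
#include <cstddef>
#include <vector>

// Purely illustrative: std::vector<float> stands in for a device-resident buffer.
// None of these names come from TT-Metalium.
using Tensor = std::vector<float>;

// Layer 1: a "kernel" implementing one math operation (naive MxK * KxN matmul).
void matmul_kernel(const Tensor& a, const Tensor& b, Tensor& c,
                   std::size_t M, std::size_t K, std::size_t N) {
    for (std::size_t m = 0; m < M; ++m)
        for (std::size_t n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k)
                acc += a[m * K + k] * b[k * N + n];
            c[m * N + n] = acc;
        }
}

// Layer 2: a library op that packages the kernel behind a convenient function.
Tensor matmul(const Tensor& a, const Tensor& b,
              std::size_t M, std::size_t K, std::size_t N) {
    Tensor c(M * N);
    matmul_kernel(a, b, c, M, K, N);
    return c;
}

// Layer 3: a "framework" composes library ops into model building blocks.
Tensor linear(const Tensor& x, const Tensor& w,
              std::size_t M, std::size_t K, std::size_t N) {
    return matmul(x, w, M, K, N);  // bias and activation omitted for brevity
}

int main() {
    Tensor x = {1, 2, 3, 4};   // 2x2 input
    Tensor w = {1, 0, 0, 1};   // 2x2 identity weight matrix
    Tensor y = linear(x, w, 2, 2, 2);
    std::printf("%g %g %g %g\n", y[0], y[1], y[2], y[3]);   // prints: 1 2 3 4
}

Swap the std::vector layer for real device buffers and hardware kernels and you get the picture the diagram is painting.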

As for the "open-source" part, Jim had previously touted their intention to open source this stuff. I don't know if that's still the plan, and maybe time-to-market concerns just de-prioritized that aspect? Or, maybe they made a strategic decision to do otherwise. Would be interesting to know.

Provided I am not mistaken that's:

1. Trying to get a vendor lock-in on customers
2. Being open-source in name but not in spirit (leeching from other frameworks, not giving anything back)

Thanks for posting that, now I dislike him even more.
Okay, but what if you are mistaken? Why rush to a conclusion without lining up the facts?

Also, let's say it's no longer planned to be open-sourced. How would that be worse than CUDA?

Finally, let's not forget the context, here. Jim is concerned about AI accelerators, not HPC or other domains CUDA is intended to serve. CUDA is much more general than what you need for an AI accelerator, and that generality doesn't come without costs (i.e. in terms of performance, efficiency, and literal cost). I think that's another way to see the "swamp" analogy - that Nvidia is bogging down its AI accelerators with the generality needed to support CUDA.
 
That doesn't guarantee that what he says will make sense.

He is supposedly an expert in hardware architectures, not software architectures.

For APIs like CUDA (or even Win32) it is crucial to have backward compatibility because applications and services depend on it.

For CPUs, you can use RISC or any other architectural design internally as long as what you expose on the outside can still execute x86 code, so you have more freedom there.


Good point, thanks for reminding me to bring x64 up.

x64 was "piled up on x86" in much the same way (by adding an instruction prefix and widening the register file) as what Intel did when transitioning from 16-bit to 32-bit. Yet I don't hear anybody dissing Keller's own work on that particular dirty snowball.
True, Keller isn't infallible. But mostly, we're probably reading too much into him complaining about legacy compatibility.
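
For anyone curious what "an instruction prefix" means in practice, here's the gist at the machine-code level. The snippet is just a compilable C++ holder for the raw bytes; the encodings themselves are the standard x86 / x86-64 ones.

C++:
#include <cstdio>

// 32-bit form, essentially unchanged since the 386:
//   01 D8            add eax, ebx
const unsigned char add_eax_ebx[] = { 0x01, 0xD8 };

// x86-64 form: same opcode (01) and same ModRM byte (D8), with a one-byte
// REX.W prefix (0x48) in front to widen the operation to 64-bit registers.
// Other REX bits (R/X/B) are what expose the extra registers r8-r15.
//   48 01 D8         add rax, rbx
const unsigned char add_rax_rbx[] = { 0x48, 0x01, 0xD8 };

int main() {
    std::printf("add eax, ebx is %zu bytes; add rax, rbx is %zu bytes\n",
                sizeof(add_eax_ebx), sizeof(add_rax_rbx));
    return 0;
}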
 

CmdrShepard
I had some similar thoughts. It's probably worth actually diving into the details, because some of their comments suggest otherwise.
Talk is cheap.
As for the "open-source" part, Jim had previously touted their intention to open source this stuff. I don't know if that's still the plan, and maybe time-to-market concerns just de-prioritized that aspect? Or, maybe they made a strategic decision to do otherwise. Would be interesting to know.
Or it is what I suggested. A cash grab and posturing in front of investors.
Okay, but what if you are mistaken? Why rush to a conclusion without lining up the facts?
Because he presented no facts about CUDA and wanted me to take his word at face value? Why is that OK for him and not for me?
Also, let's say it's no longer planned to be open-sourced. How would that be worse than CUDA?
Ever heard of hypocrisy? "Those who are without sin should throw the first stone" and all that?
Finally, let's not forget the context, here. Jim is concerned about AI accelerators, not HPC or other domains CUDA is intended to serve. CUDA is much more general than what you need for an AI accelerator, and that generality doesn't come without costs (i.e. in terms of performance, efficiency, and literal cost). I think that's another way to see the "swamp" analogy - that Nvidia is bogging down its AI accelerators with the generality needed to support CUDA.
There's nothing wrong with CUDA being more general. I think it is an advantage. You were able to process AI workloads with CUDA cores even before NVIDIA added Tensor cores in hardware. You were also able to process raytracing workloads with CUDA cores even before NVIDIA added RT cores. If anything, having a unified, general architecture lets them figure out which parts of the workflow would benefit the most from hardware acceleration, and how. So far they are doing just fine by dedicating small bits of expensive silicon where it gets the biggest possible gains.
 

bit_user
Talk is cheap.
So are forum posts!
; )

Or it is what I suggested. A cash grab and posturing in front of investors.
If the investors care about their open source status & plans, I'm sure they can ask. I think it's probably customers who have more of a stake in the matter.

Because he presented no facts about CUDA and wanted me to take his word at face value? Why is that OK for him and not for me?
That feels like whataboutism.

Your concern about vendor lock-in via their proprietary API is certainly valid, for customers who need to access the hardware at that level. IMO, the main benefit of the tools being open-sourced is just to help avoid the hardware turning into a brick, if Tenstorrent ceases operations or undergoes a strategic shift that results in them prematurely dropping support for existing products.

There's nothing wrong with CUDA being more general.
Nothing morally wrong, but the main concern is whether it forces the hardware to be less optimized, cost-effective, or efficient for its intended use case.

I think it is an advantage.
It increases total addressable market size, but that only benefits the manufacturer and customers looking for a general solution. For those who have a very specific purpose and RoI case, the price of unnecessary generality might be significant.

You were able to process AI workloads with CUDA cores even before NVIDIA added Tensor cores in hardware. You were also able to process raytracing workloads with CUDA cores even before NVIDIA added RT cores.
CPUs could do these things and a whole lot more. See, generality has tradeoffs!

As for ray tracing, the only real value it had on GPUs without RT cores is as a development vehicle. Performance was generally too low to be playable, making it almost irrelevant for gamers.

If anything, having a unified, general architecture lets them figure out which parts of the workflow would benefit the most from hardware acceleration, and how. So far they are doing just fine by dedicating small bits of expensive silicon where it gets the biggest possible gains.
Generality is a win, if you're either doing a variety of stuff with the hardware or you aren't initially sure what your needs will ultimately be.

What matters for Tenstorrent's customers and investors is whether their products are successful and competitive in their ability to solve real customer problems. If vague concerns about CUDA are getting in the way of that, I think Jim's comments are justified.
 

CmdrShepard
CPUs could do these things and a whole lot more. See, generality has tradeoffs!
Could, but not at GPU speed.
As for ray tracing, the only real value it had on GPUs without RT cores is as a development vehicle. Performance was generally too low to be playable, making it almost irrelevant for gamers.
I feel like I should have clarified what I meant by ray tracing.

I was talking about professional 3D applications and renderers. Even with just CUDA cores they were at least an order of magnitude ahead of a CPU, if not much more.
Generality is a win, if you're either doing a variety of stuff with the hardware or you aren't initially sure what your needs will ultimately be.
Yeah, and if you decide you need some other form of number crunching in addition to tensors, with Tenstorrent hardware you wake up to the painful realisation that you have bought into an expensive single-purpose brick.
If vague concerns about CUDA are getting in the way of that, I think Jim's comments are justified.
His comment isn't justified because he didn't provide any justification to back up his claim.

I guess what I am trying to say, but it doesn't seem to be getting through to you, is this -- he doesn't have to push others into <Mod Edit> and stand on their shoulders in order to look cleaner.
 

bit_user
Could, but not at GPU speed.
Right. Hence, my point: the generality vs. efficiency tradeoff.

Yeah, and if you decide you need some other form of number crunching in addition to tensors, with Tenstorrent hardware you wake up to the painful realisation that you have bought into an expensive single-purpose brick.
It's sold as AI hardware, though. You don't buy a sports car and then throw a tantrum when you discover it has no way to mount a snow plow on it!

His comment isn't justified because he didn't provide any justification to back up his claim.
Failure to provide supporting evidence doesn't make a claim wrong; it just makes the argument a bad one.

I happen to agree with him. It's fine if you don't. I really don't care, either way.

I know two people who work at Nvidia, both of whom I really like and respect. I also think they have the best-in-class hardware and software, as well as some leading AI researchers. I can believe all of those things and still take issue with some of their business practices, including their strategy around CUDA. As a matter of fact, I have fewer issues with CUDA, itself, than I do with Nvidia's foot-dragging on OpenCL support, which is why my next dGPU will probably be Intel.

BTW, I could even imagine Jim complaining that OpenCL is a swamp, and I could accept that. Just because I prefer it as my GPGPU standard doesn't mean it's the most appropriate way to program every AI accelerator.
 

CmdrShepard
It's sold as AI hardware, though.
So are NVIDIA's CUDA-capable cards. Your point?

Regarding your car analogy, it is more like buying a sports car that can only be driven on one race track, not others, and absolutely can't be driven on regular roads.
Failure to provide supporting evidence doesn't make a claim wrong; it just makes the argument a bad one.
It also suggests that the argument has been made in bad faith. To me that's more important.
I can believe all of those things and still take issue with some of their business practices, including their strategy around CUDA.
You are shifting the goal posts.

First you had issue with CUDA, now it's about NVIDIA's business practices. What's next, you'll take issue with Jensen's leather jacket or kitchen fetish?
As a matter of fact, I have fewer issues with CUDA, itself, than I do with Nvidia's foot-dragging on OpenCL support, which is why my next dGPU will probably be Intel.
Apple has all but abandoned OpenCL. Khronos Group released the 3.0 specification back in 2020, I think? What exactly is your complaint about NVIDIA and OpenCL? Is there a feature their implementation is missing, or what? I hate vague, unsubstantiated complaints like Keller's about CUDA and x86, and yours, hence this discussion.

Maybe if he had explained it, I would even have agreed with him? Too bad I don't know what I am supposed to be agreeing with. As you have seen with "big name in X says Y is bad, trust him", I vehemently disagree because I wasn't raised to trust people on the basis of who they are (i.e. authority).
 

bit_user
You are shifting the goal posts.

First you had issue with CUDA, now it's about NVIDIA's business practices. What's next, you'll take issue with Jensen's leather jacket or kitchen fetish?
No, it's their business practices which I see underlying some of my key concerns surrounding CUDA - the fact that they've kept it closed source (unlike AMD's HIP and Intel's oneAPI) and the fact that (unlike Intel) they've dragged their feet on OpenCL support.

Apple has all but abandoned OpenCL.
That's been true for more than a decade. I'm not an Apple fan either, BTW.

Khronos Group released the 3.0 specification back in 2020, I think? What exactly is your complaint about NVIDIA and OpenCL?
Until OpenCL 3.0, they remained stuck at 1.2, in spite of having a beta implementation of 2.x that they never pushed over the line into "general release" status. The supposed reason for them remaining stuck at 1.2 was OpenCL 2.0's requirement of SVM, yet CUDA had been implementing the same sorts of features and someone even demonstrated an adapter that could run OpenCL 2.x code atop CUDA.

Furthermore, they didn't support OpenCL on their SoCs, which is even more shady.

Is there a feature their implementation is missing, or what? I hate vague, unsubstantiated complaints like Keller's about CUDA and x86, and yours, hence this discussion.
Even their 3.0 support is highly questionable, since what it does is essentially make a lot of previously-mandatory features optional.
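
To make the SVM point concrete, here's a minimal, untested sketch (assuming an OpenCL 2.0+/3.0 SDK and the standard CL/cl.h C API) of what shared virtual memory looks like from the host side, and of the kind of capability query that OpenCL 3.0 turns from "guaranteed" into "ask first":

C++:
// Minimal sketch: query SVM support, then allocate a buffer the host and
// device can address through the same pointer. Untested; error handling trimmed.
#define CL_TARGET_OPENCL_VERSION 300
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    if (clGetPlatformIDs(1, &platform, nullptr) != CL_SUCCESS ||
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr) != CL_SUCCESS) {
        std::printf("no OpenCL GPU found\n");
        return 1;
    }

    // Under OpenCL 3.0, SVM is optional, so a portable app has to ask.
    cl_device_svm_capabilities svm = 0;
    clGetDeviceInfo(device, CL_DEVICE_SVM_CAPABILITIES, sizeof(svm), &svm, nullptr);
    if (!(svm & CL_DEVICE_SVM_COARSE_GRAIN_BUFFER)) {
        std::printf("device reports no SVM support\n");
        return 0;
    }

    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue queue =
        clCreateCommandQueueWithProperties(ctx, device, nullptr, nullptr);

    // Shared Virtual Memory: one pointer, usable by the host here and by a
    // kernel via clSetKernelArgSVMPointer() (kernel omitted for brevity).
    float* buf = static_cast<float*>(
        clSVMAlloc(ctx, CL_MEM_READ_WRITE, 1024 * sizeof(float), 0));
    if (buf) {
        clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, buf, 1024 * sizeof(float),
                        0, nullptr, nullptr);
        buf[0] = 42.0f;                       // plain pointer write from the host
        clEnqueueSVMUnmap(queue, buf, 0, nullptr, nullptr);
        clFinish(queue);
        std::printf("SVM allocation works\n");
        clSVMFree(ctx, buf);
    }

    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}

If that capability query comes back empty on hardware advertising OpenCL 3.0, that's exactly the kind of "optional" I'm talking about.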

It's kinda funny that you attack me for even mentioning Nvidia's business practices, but then you leap at OpenCL like catnip. Well, if we're going to stay on topic, then let's stay on topic.

"big name in X says Y is bad, trust him"
He said it's a "swamp", which has negative connotations but it's a richer analogy than you suggest. And I never said "believe it because it's Jim K.". I always said two things:
  1. I agree that it's a swamp, as far as I understand the analogy (which I've explained).
  2. (in response to attacks on him) I defend his legitimacy to even make such statements.

I vehemently disagree because I wasn't raised to trust people on the basis of who they are (i.e. authority).
Well, you've made it abundantly clear that you don't consider him qualified to voice an opinion on CUDA, in spite of his experience leading self-driving chip development at Tesla and now running an AI startup. I accept you feel that way, so can we move on?
 

CmdrShepard
No, it's their business practices which I see underlying some of my key concerns surrounding CUDA - the fact that they've kept it closed source (unlike AMD's HIP and Intel's oneAPI) and the fact that (unlike Intel) they've dragged their feet on OpenCL support.
Parts of CUDA toolkit (PTX in particular) are closely tied to their GPU architecture. You don't really expect them to open-source that and reveal all their trade secrets?

I agree they could've open-sourced, say, the NPP libraries built on CUDA, but then again Intel didn't open-source the IPP libraries either, so... meh.
Even their 3.0 support is highly questionable, since what it does is essentially make a lot of previously-mandatory features optional.
You still didn't name a single feature you need supported in their OpenCL implementation which isn't supported.
It's kinda funny that you attack me for even mentioning Nvidia's business practices, but then you leap at OpenCL like catnip.
I am not attacking you at all.

I am just asking for citations, the same as I expected to see from Jim Keller.
Well, you've made it abundantly clear that you don't consider him qualified to voice an opinion on CUDA
I never said that so please stop putting words in my mouth.

All I said is he never offered any PROOF that he is qualified to voice an opinion on CUDA.

You know, by giving us some actual examples of what he considers a swamp instead of relying on his authority in the AI field to * on two extremely popular general architectures.
 

bit_user
Parts of CUDA toolkit (PTX in particular) are closely tied to their GPU architecture. You don't really expect them to open-source that and reveal all their trade secrets?
AMD and Intel open-sourced their entire GPU software stacks. AMD even publishes ISA documents for their GPUs, openly. I'm sure there's some non-public information that's only shared under NDA, but probably a lot can be gleaned by closely inspecting their drivers and toolchains.

I agree they could've open-sourced, say, the NPP libraries built on CUDA, but then again Intel didn't open-source the IPP libraries either, so... meh.
That's a CPU library, so not directly equivalent. x86 is an open architecture, so anyone can program for it without using IPP. The same is not true of GPUs, in general, and especially Nvidia GPUs.

All I said is he never offered any PROOF that he is qualified to voice an opinion on CUDA.

You know, by giving us some actual examples of what he considers a swamp instead of relying on his authority in the AI field to * on two extremely popular general architectures.
Okay, there it is, in your words.
 