News Tiny Corp is 70% confident that AMD will make at least some of its GPU firmware open source

Status
Not open for further replies.

bit_user

Titan
Ambassador
Tiny Corp. (or perhaps its founder George Hotz) made these issues public on its X account yesterday and said, "It upsets me that the MES isn't open source" and said AMD "should immediately stop development of high end ML libraries and fix their basic sh*t [compiler and drivers]."
...
Tiny Corp. also opened a poll asking X users whether they believed AMD would make the firmware open source; at the time of writing, 52.8% of votes were cast for "no."
This guy is a real piece of work. With whiny, narcissistic partners like that, who needs competitors? I'll bet the best realistic case of his business plan would amount to less than 0.1% of AMD's 2024 GPU revenue, yet he acts like he's their savior or something. Maybe if this had happened a year earlier, there could be something to that notion.

Should AMD's decision land in that 30% zone, then at a minimum, it could cause Tiny Corp. to abandon the RX 7900 XTX for the TinyBox and choose another vendor, most likely Intel, as Nvidia's GPUs are expensive and even less open source.
Good luck with that. Intel GPUs don't have near as much performance and he knows it.

there could be wider repercussions for AMD beyond just losing a single company. Open source is a big selling point of AMD's AI hardware-software ecosystem. If AMD doesn't find a happy medium, it might discourage other companies (especially companies prioritizing open source) from using AMD's platform.
Please don't spread this FUD. These are gaming GPUs. If this was Nvidia we were talking about, they would send a Cease and Desist letter for violating the license terms of their CUDA EULA, which stipulates that you can't use their gaming GPUs in data centers. The box he's building is essentially a datacenter-oriented AI appliance, as far as I understand.

It seems like it would be nice if AMD did open source these firmware components (which seem to have nothing to do with their display controller, before someone raises that specter), but in no way does AMD's GPU business hinge on this decision.
 

endocine

Honorable
Aug 27, 2018
103
109
10,760
This guy is a real piece of work. With whiny, narcissistic partners like that, who needs competitors? I'll bet the best realistic case of his business plan would amount to less than 0.1% of AMD's 2024 GPU revenue, yet he acts like he's their savior or something. Maybe if this had happened a year earlier, there could be something to that notion.

So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic, do you know what the latter word means? It's also not Hotz and his company having issues with these GPUs, AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware. AMD absolutely pushes QA onto their users, and now that they are making $, it's time for them to fix that situation.
 
So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic, do you know what the latter word means? It’s also not Hotz and his company having issues with these GPUs, AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware. AMD absolutely pushes QA onto their users, and now that they are making $, it’s time for them to fix that situation.
refutable, Tiny Corp is lucky AMD is even entertaining their idea of CDNA workarounds using consumer cards.
 

parkerthon

Distinguished
Jan 3, 2011
109
125
18,760
So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic, do you know what the latter word means? It's also not Hotz and his company having issues with these GPUs, AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware. AMD absolutely pushes QA onto their users, and now that they are making $, it's time for them to fix that situation.
He’s not a user, he’s a reseller targeting businesses by repackaging consumer GPUs to make a cheap solution. It’s a risky, flawed idea, frankly. It’s not that I look down on the idea either. Backblaze did this way back when there was a major hard drive supply shortage from the tsunami. They sourced consumer HDDs from individuals, even shucking them from external USB drive enclosures, to add capacity to their backup solutions and still do to this day. I just don’t think high-end gaming GPUs are a good fit here.

I personally don’t think it’s narcissism. I think it’s free publicity for an otherwise unknown startup at this point. Being provocative helps. This could be guerrilla marketing as much as it is a whiny call to action.
 
This guy is a real piece of work. With whiny, narcissistic partners like that, who needs competitors? I'll bet the best realistic case of his business plan would amount to less than 0.1% of AMD's 2024 GPU revenue, yet he acts like he's their savior or something. Maybe if this had happened a year earlier, there could be something to that notion.


Good luck with that. Intel GPUs don't have near as much performance and he knows it.


Please don't spread this FUD. These are gaming GPUs. If this was Nvidia we were talking about, they would send a Cease and Desist letter for violating the license terms of their CUDA EULA, which stipulates that you can't use their gaming GPUs in data centers. The box he's building is essentially a datacenter-oriented AI appliance, as far as I understand.

It seems like it would be nice if AMD did open source these firmware components (which seem to have nothing to do with their display controller, before someone raises that specter), but in no way does AMD's GPU business hinge on this decision.
This.

Dude, I just love your posts sometimes.
 
So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic, do you know what the latter word means? It's also not Hotz and his company having issues with these GPUs, AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware. AMD absolutely pushes QA onto their users, and now that they are making $, it's time for them to fix that situation.
It's about ROI. If you have a limited number of resources and 80% of your profit comes from data center AI, then you shift 80% of resources to it.

Sorry, a 7900 XTX isn't a data center product. It's a consumer product. And this is the same reason they sell FireGL and Quadro processors. And those are still only meant for desktop use, not the data center.
 
He’s not a user, he’s a reseller targeting businesses by repackaging consumer GPUs to make a cheap solution. It’s a risky, flawed idea, frankly. It’s not that I look down on the idea either. Backblaze did this way back when there was a major hard drive supply shortage from the tsunami. They sourced consumer HDDs from individuals, even shucking them from external USB drive enclosures, to add capacity to their backup solutions and still do to this day. I just don’t think high-end gaming GPUs are a good fit here.

I personally don’t think it’s narcissism. I think it’s free publicity for an otherwise unknown startup at this point. Being provocative helps. This could be guerrilla marketing as much as it is a whiny call to action.
Certainly not a good look. I'm not sure I would want to do business with a CEO who is incendiary about getting his way through petulant demands and threat of bad publicity.

Dealing with him during a conflict over support would be a nightmare.
 

bit_user

Titan
Ambassador
So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic,
The way he publicly trash-talks AMD and issues ultimatums is incredibly unprofessional. If you actually read the quotes in the article and think that behavior is okay, then it seems we have a fundamental difference of opinion.

I'd be wary of encouraging this sort of behavior, if I were AMD.

AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware.
You should acquaint yourself with their MI300 series products. It seems they're actually competitive, for once!

AMD absolutely pushes QA onto their users, and now that they are making $, its time for them to fix that situation.
Be that as it may, there are normal and appropriate ways to raise these issues with them. Partners will be able to file priority bugs and even arrange meetings to discuss larger concerns and issues they're having. In the event he reaches an impasse with AMD and feels the need to make that public, it could be done much more diplomatically.
 

MatheusNRei

Great
Jan 15, 2024
55
38
60
Be that as it may, there are normal and appropriate ways to raise these issues with them. Partners will be able to file priority bugs and even arrange meetings to discuss larger concerns and issues they're having. In the event he reaches an impasse with AMD and feels the need to make that public, it could be done much more diplomatically.
Quite.

People don't like dealing with adults who throw tantrums when things don't get done their way and on their time frame.

Considering TinyBox's entire business model rests on denying AMD money by repurposing cheaper gaming hardware for data-center-focused AI applications instead of purchasing more expensive enterprise-grade hardware, they really shouldn't be biting the hand that feeds.
 

edzieba

Distinguished
Jul 13, 2016
589
594
19,760
For all the whinging about "it's a consumer card, so you should just shut up and take it when there are bugs": the buggy scheduler at issue here applies across both consumer and datacentre cards, anything running on the RDNA 1, 2, or 3 architectures.
 

bit_user

Titan
Ambassador
For all the whinging about "it's a consumer card, so you should just shut up and take it when there are bugs"
Literally nobody said that. Willfully or not, you're mischaracterizing our remarks.

I will only speak for myself, but what I actually said along those lines is that the level of support they should expect from AMD, on this issue, should be weighted by the fact that they're trying to use consumer cards. I did not say they should expect no support, especially since AMD now officially advertises support for ROCm on RX 7900 XTX cards.

However, if AMD didn't even officially support ROCm on that card, then this would be a different discussion. Fortunately, they do.

: the buggy scheduler at issue here applies across both consumer and datacentre cards, anything running on the RDNA 1, 2, or 3 architectures.
How do you know the scheduler and its firmware are the same across all of those architectures? And are you saying it's also shared by the CDNA products, too?

Also, what he said is that its behavior needs to be finely tuned for his workload. That calls into question whether his patches would even be applicable upstream, since it seems likely they'd adversely affect gaming performance.

I'm not against the idea of AMD opening parts of its GPU firmware. I can see why they might not want to, but I see how their doing it aligns with my interests as a potential user. What I'm really saying is that Hotz is asking for quite a lot, doing it from quite a weak position, and the icing on the cake is really the over-the-top way he's going about trying to get what he wants.

After caving to this jerk, what do you think will happen the next time an AMD partner has an issue? Do you think it'll be more or less likely they badmouth AMD and make threats in social media (i.e. as opposed to if they hadn't publicly caved to Hotz)? This sets a really bad precedent. If I were on AMD's PR team, I'd be pretty annoyed by the way this was handled.
 

edzieba

Distinguished
Jul 13, 2016
589
594
19,760
Literally nobody said that. Willfully or not, you're mischaracterizing our remarks.

I will only speak for myself, but what I actually said along those lines is that the level of support they should expect from AMD, on this issue, should be weighted by the fact that they're trying to use consumer cards. I did not say they should expect no support, especially since AMD now officially advertises support for ROCm on RX 7900 XTX cards.

However, if AMD didn't even officially support ROCm on that card, then this would be a different discussion. Fortunately, they do.
They're not using ROCm, that's a complete red herring. They're interacting with the job scheduler, which is part of the firmware blob that runs on the cards themselves.
Can you add reference to what you are saying here?
e.g.
while they are paying for a bunch of USD 999 bucks GPUs and expecting data center quality treatment. They should be fortunate the CEO even bothered about them.
They should thank their lucky stars anyone at AMD even acknowledged them
Tiny Corp is lucky AMD is even entertaining their idea of CDNA workarounds using consumer cards.
etc.


Let's be clear here: The problem TinyCorp is having is not "wanting datacentre support" or a "workaround" or "hack" or anything else. The problem is a bug in the part of the card firmware that handles job scheduling (the MES) not behaving as AMD's documentation says it should, that was reported to AMD, that AMD did not fix, and that nobody other than AMD can fix because AMD do not make that part of the firmware available to fix. This is a standard and exposed function (across both RDNA and CDNA) being used as intended but not working as described.

If you tell a piece of hardware to add 1+1 and get 3, that's a bug, regardless of whether you are doing the addition in a box in a rack or under your desk.
 

bit_user

Titan
Ambassador
They're not using ROCm, that's a complete red herring.
I guarantee they're using ROCm. There's no way a "tiny" startup could write everything you'd need, in order to avoid going through it.

They're interacting with the job scheduler, which is part of the firmware blob that runs on the cards themselves.
That's neither mutually exclusive with using ROCm, nor does it follow that because MES is encountering errors during their jobs that they're interacting with it, directly.

Even the desire to tune MES doesn't mean that their software is directly talking to it. They just need it to behave properly, when faced with a workload like theirs.

"We aren't open source purists... But we need the scheduler and the memory hierarchy management to be open. This is what it takes to push the performance of neural networks"

Let's be clear here: The problem TinyCorp is having is not "wanting datacentre support" or a "workaround" or "hack" or anything else. The problem is a bug in the part of the card firmware that handles job scheduling (the MES) not behaving as AMD's documentation says it should,
Hardware and firmware have bugs all the time. The way to deal with these is by using AMD's bug-tracking system and filing tickets on the issues you encounter. AMD has lots of partners, but you almost never hear about them employing social media pressure campaigns to try and get their issue prioritized.

that was reported to AMD, that AMD did not fix,
When? And what did AMD tell him, in the time between when he filed the ticket and when he started whining on Twitter?

and that nobody other than AMD can fix because AMD do not make that part of the firmware available to fix.
At least 99% of the firmware out there is closed source. AMD is not a special case, here.

This is a standard and exposed function (across both RDNA and CDNA) being used as intended but not working as described.
What is the function? How was it described? What I read was simply that "AI training runs crashing with MES errors", which doesn't say they're interacting directly with it or exactly how it's misbehaving. If you have more details, please share.
 

edzieba

Distinguished
Jul 13, 2016
589
594
19,760
I guarantee they're using ROCm. There's no way a "tiny" startup could write everything you'd need, in order to avoid going through it.
They're not using ROCm:
View: https://twitter.com/__tinygrad__/status/1764737741693342128

Hardware and firmware have bugs all the time. The way to deal with these is by using AMD's bug-tracking system and filing tickets on the issues you encounter.
Which they did. Bug went unfixed:
View: https://twitter.com/__tinygrad__/status/1765085827946942923
 

bit_user

Titan
Ambassador
But they're still using AMD userspace components, even if they're not using the whole of ROCm.

Really? All I see is him publicly complaining about a crash after two training runs! No reference to any bug filed.

And at this point, it looks like he was still blaming the toolchain, rather than the firmware.

Again, AMD has many partners. They're not all working at such a low level as Tiny, but if they engaged in this sort of trash-talking, then Tiny's comments wouldn't even stand out. The only reason these comments are notable is due to how far out-of-line they are!

Furthermore, I don't know why you seem to have decided to defend Hotz. For all the legitimate complaints about AMD's GPU compute strategy, missteps, and general under-resourcing, he is not the poster boy I'd imagine you really want. The only thing particularly interesting about them is that they're really trying to push the envelope of what RDNA3 can do. However, that easily gets overshadowed by all this negativity and toxicity.

I really wish AMD would call their bluff. If they did follow through and try to use Intel or Nvidia, their product would be non-competitive and they'd go out of business and we wouldn't have to hear from them any more. Intel's GPUs lack the performance and Nvidia would force them to use far more expensive workstation GPUs that would defeat the value proposition they're shooting for.
 

edzieba

Distinguished
Jul 13, 2016
589
594
19,760
I really wish AMD would call their bluff. If they did follow through and try to use Intel or Nvidia, their product would be non-competitive and they'd go out of business and we wouldn't have to hear from them any more.
You realise "just give up on trying to get AMD to work and use Nvidia" is the default state of the industry, right?
 

bit_user

Titan
Ambassador
You realise "just give up on trying to get AMD to work and use Nvidia" is the default state of the industry, right?
Hotz didn't pick AMD for any altruistic reasons. He did it because it served his interests. I already explained why I think he's bluffing with his threats to switch.

AMD can be a good partner. Just ask Valve, Sony, or Microsoft. On this point, what do you think would've happened if Valve had publicly lashed out in such an abusive and unprofessional manner, when they hit the first few problems? I can't imagine their relationship would be as healthy as it is now.

Understand that I'm not objecting to the notion of what Tiny is doing, nor that they should expect a certain level of support from AMD. Furthermore, I'm not defending how far behind AMD is on the GPU compute front. My concerns are entirely around the way that Hotz has chosen to approach things with AMD. This isn't the first time, either.

If you really want to advance the cause of trying to get AMD to improve its GPU compute support, I think you really shouldn't hitch that wagon to Hotz. He's really not a good champion for this cause.

On that note, there are others enjoying a fair bit of success with AMD's GPU Compute stack:
 

Pierce2623

Commendable
Dec 3, 2023
503
386
1,260
So users aren't allowed to complain about flaws in products, since when, and how is that whiny or even narcissistic, do you know what the latter word means? It's also not Hotz and his company having issues with these GPUs, AMD needs to up their game and compete with nVidia so there isn't a monopoly in AI hardware. AMD absolutely pushes QA onto their users, and now that they are making $, it's time for them to fix that situation.
I’m just interested to know why you consider a dude selling a rig with 6 consumer gaming GPUs as a commercial AI development platform to be a user? He’s a reselling huckster ripping people off (including AMD realistically). He’s able to sell $8000 worth of hardware for $15k because it’s still cheaper than buying hardware with commercial support. But then he expects commercial support. AMD needs to just stick to their game plan instead of dealing with all the crooks cashing in on the AI boom.
 

bit_user

Titan
Ambassador
I’m just interested to know why you consider a dude selling a rig with 6 consumer gaming GPUs as a commercial AI development platform to be a user?
Let's be fair: they are doing a fair amount of software development that you don't usually see from someone considered a mere "reseller". Generically, they should be a "partner", but I think the class of partner would be as much a development partner as an integrator.

He’s a reselling huckster ripping people off (including AMD realistically). He’s able to sell $8000 worth of hardware for $15k because it’s still cheaper than buying hardware with commercial support.
Hmmm... Based on the specs from their website, here's the cheapest system I could price out via Newegg (sold by):

| Category | Model | Qty. | Price | Total |
|---|---|---|---|---|
| GPUs | XFX Speedster MERC310 | 6 | $930 | $5580 |
| Boot drive | Crucial MX500 1TB | 1 | $70 | $70 |
| Data drives | Samsung 990 Pro 1TB | 4 | $120 | $480 |
| CPU | EPYC 7502 | 1 | $1876 | $1876 |
| DIMMs | Kingston 32GB ECC DDR4-3200 | 8 | $83 | $664 |
| PSU | Super Flower 1600W | 2 | $300 | $600 |
| Motherboard | ASRock Rack ROMED8-2T | 1 | $624 | $624 |
| Server Chassis | Custom | 1 | $500 | $500 |
| Grand Total | | | | $10394 |

Not sure about the case, but it has to be something with risers or riser cables, in order to support 6x GPUs. I didn't find anything like that, so I put a placeholder of $500, which might be low. Also, I just picked the cheapest 1500 W PSU, but perhaps it's worth going with a higher-grade model to support 24/7 loads.

Anyway, if we say it's about $10.5k worth of parts + a ~3 year support contract for $15k, that's actually pretty decent. Just try pricing out anything remotely comparable, from any of the bespoke workstation or server vendors!

However, if he had to use Nvidia RTX 6000 cards, his parts cost would go up by at least another $8k.
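For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. The quantities and unit prices are copied from the parts list, the $15k system price and the "at least another $8k" Nvidia figure are the numbers quoted in this thread, and the names (`PARTS`, `parts_total`, `margin`) are made up for illustration. It says nothing about assembly, support, or warranty costs, which the $15k also has to cover.

```python
# Bill of materials from the parts list above: (item, quantity, unit price in USD).
PARTS = [
    ("GPUs: XFX Speedster MERC310", 6, 930),
    ("Boot drive: Crucial MX500 1TB", 1, 70),
    ("Data drives: Samsung 990 Pro 1TB", 4, 120),
    ("CPU: EPYC 7502", 1, 1876),
    ("DIMMs: Kingston 32GB ECC DDR4-3200", 8, 83),
    ("PSU: Super Flower 1600W", 2, 300),
    ("Motherboard: ASRock Rack ROMED8-2T", 1, 624),
    ("Server chassis: custom (placeholder price)", 1, 500),
]

SYSTEM_PRICE = 15_000       # system price quoted in the thread
NVIDIA_EXTRA_PARTS = 8_000  # "at least another $8k" estimate from the post


def parts_total(parts):
    """Sum quantity * unit price over every line item."""
    return sum(qty * unit_price for _, qty, unit_price in parts)


def margin(price, cost):
    """Money left over from the sale price after paying for parts."""
    return price - cost


total = parts_total(PARTS)
print(f"Parts total:                ${total}")
print(f"Left over at $15k:          ${margin(SYSTEM_PRICE, total)}")
print(f"Left over with Nvidia GPUs: ${margin(SYSTEM_PRICE, total + NVIDIA_EXTRA_PARTS)}")
```

With the lower bound of $8k added to the parts cost, the last figure goes negative, which is the point being made about the value proposition.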

But then he expects commercial support.
I assume he has a partner agreement in place. I believe the level of support he's expecting is well above & beyond what a normal system integrator or even software developer would get. He wants to be supported like a big OEM (e.g. Dell, HP).

AMD needs to just stick to their game plan instead of dealing with all the crooks cashing in on the AI boom.
If AMD can benefit from the work he's doing to tune their Navi 31 GPUs for deep learning, it might be worth some time & trouble on their part. As I've mentioned, I believe AMD is probably way backordered on their MI300 GPUs, so this would be a way they could cash in more on the AI boom, themselves.

I'm currently seeing their RX 7900 XTX selling as low as 91% of list price (after rebate), so I'll bet they'd like to move some more of that inventory. Of course, I'm sure AMD will encourage people to use their $4k Radeon Pro W7900, instead!
 