News Nvidia engineer breaks and then quickly fixes AMD GPU performance in Linux

Admin · Apr 6, 2025

An Nvidia engineer pushed a fix to the Linux kernel, improving performance on Radeon GPUs by correcting the same bug he had introduced.

Rob1C · Apr 6, 2025

You'd need an 8-way CPU to practically have enough slots for 64TiB.

qxp · Apr 6, 2025

Rob1C said:
You'd need an 8-way CPU to practically have enough slots for 64TiB.

Not necessarily. On newer systems you can expand RAM by using PCIe 5.0 slots. You could also memory map SSDs; it takes only eight 8TB SSDs to need more than 64TB of addressing space.
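
For a rough sense of the numbers, here is a minimal Python sketch of the capacity arithmetic. It assumes the "8TB" on a drive label means decimal terabytes (10^12 bytes), while TiB means binary tebibytes (2^40 bytes). The variable names are illustrative only.

```python
# Capacity arithmetic for eight 8 TB SSDs.
# Assumption: a drive's "8TB" is decimal (10**12 bytes); TiB is binary (2**40 bytes).
TB = 10**12
TIB = 2**40

drive_count = 8
drive_bytes = 8 * TB

total_bytes = drive_count * drive_bytes
total_tb = total_bytes / TB    # decimal terabytes
total_tib = total_bytes / TIB  # binary tebibytes

print(f"{total_tb:.0f} TB = {total_tib:.1f} TiB")  # prints "64 TB = 58.2 TiB"
```

Under that assumption the drives alone come to 64 TB in decimal units, which is a little under 64 TiB in binary units.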

USAFRet · Apr 6, 2025

So he didn't "improve performance".
Rather, he undid the performance limiter he pushed last week.

It was fine, before he screwed with it.

bit_user · Apr 6, 2025

The article said:
In the open-source paradigm, it's an unwritten rule to fix what you break. The Linux kernel is open-source and accepts contributions from everyone, which are then reviewed. Responsible contributors are expected to help fix issues that arise from their changes. So, despite their rivalry in the GPU market, FOSS (Free Open Source Software) is an avenue that bridges the chasm between AMD and Nvidia.

It's not just about good manners. If a contributor is found to behave in a malicious or excessively reckless manner, they could face a ban. I'm not aware of a case where this has happened, but I think the potential is real.

bkuhl · Apr 6, 2025

bit_user said:
It's not just about good manners. If a contributor is found to behave in a malicious or excessively reckless manner, they could face a ban. I'm not aware of a case where this has happened, but I think the potential is real.

It's happened:

https://www.tomshardware.com/news/university-researchers-apologize-linux-community

bit_user · Apr 6, 2025

bkuhl said:
It's happened:

https://www.tomshardware.com/news/university-researchers-apologize-linux-community

Yeah, I knew of that incident. What I meant was one vendor acting maliciously towards another, in a strictly anti-competitive fashion.

Rob1C · Apr 6, 2025

Rob1C said:
You'd need an 8-way CPU to **practically** have enough slots for 64TiB.

qxp said:
Not necessarily. On newer systems you can expand RAM by using PCIe 5.0 slots. You could also memory map SSDs; it takes only eight 8TB SSDs to need more than 64TB of addressing space.

I did say to be practical.

By your reasoning, we wouldn't need 8-way CPUs to get enough DIMM slots, because we could use 8 SSDs. With modern SSDs you'd only need one. Access and execution speed would be slow.

Similarly we could use fewer DIMM slots by simply using larger DIMMs:
https://www.tomshardware.com/news/samsung-talks-1tb-ddr5-modules-ddr5-7200

Large DIMMs like that are reserved for preferred customers, and available at eye-watering prices.

So going to either extreme isn't practical; we'd need to land somewhere near the middle ground.

More to the point of the comment, which you completely missed, they submitted a change to support a configuration which is as unlikely for Intel systems as it is impossible for AMD systems.

nogaard777 · Apr 7, 2025

USAFRet said:
So he didn't "improve performance".
Rather, he undid the performance limiter he pushed last week.

It was fine, before he screwed with it.

And it was better after he fixed it. Don't let your hatred of a corporation lead you to make brainless assumptions about the individuals that work there. A large portion of Linux exists because of Nvidia engineers' contributions, and I'd wager far more than from AMD's much smaller team.

bit_user · Apr 7, 2025

nogaard777 said:
A large portion of Linux exists because of Nvidia engineers' contributions, and I'd wager far more than from AMD's much smaller team.

Why wager? If you know, you know. If you don't, well...

The latest data I found was from 2022:

By changesets

Employer	Number of Changesets	Percentage of total
Huawei Technologies	1281	9.2%
Intel	1254	9.0%
(Unknown)	1097	7.9%
Google	917	6.6%
Linaro	837	6.0%
AMD	750	5.4%
Red Hat	672	4.8%
(None)	564	4.0%
Meta	414	3.0%
NVIDIA	389	2.8%

By lines changed

Employer	Number of Lines	Percentage of total
Oracle	91852	12.0%
AMD	89761	11.7%
Google	56504	7.4%
Intel	44062	5.8%
(Unknown)	33765	4.4%
Realtek	33277	4.3%
Linaro	31234	4.1%
Huawei Technologies	27856	3.6%
NVIDIA	25441	3.3%
Red Hat	24073	3.1%

Source: https://lwn.net/Articles/915435/

So, AMD changed about 3.53 times as many lines as Nvidia, in 1.93 times as many changesets.
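
Those two ratios can be reproduced from the table rows above. This is a minimal Python sketch; the dictionary names are illustrative, and the figures are copied from the AMD and NVIDIA rows.

```python
# Figures copied from the AMD and NVIDIA rows of the two tables above.
changesets = {"AMD": 750, "NVIDIA": 389}
lines_changed = {"AMD": 89761, "NVIDIA": 25441}

# Ratio of AMD's contribution to Nvidia's, by each measure.
lines_ratio = lines_changed["AMD"] / lines_changed["NVIDIA"]
changeset_ratio = changesets["AMD"] / changesets["NVIDIA"]

print(f"lines: {lines_ratio:.2f}x, changesets: {changeset_ratio:.2f}x")
# prints "lines: 3.53x, changesets: 1.93x"
```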

qxp · Apr 7, 2025

Rob1C said:
I did say to be practical.

By your reasoning, we wouldn't need 8-way CPUs to get enough DIMM slots, because we could use 8 SSDs. With modern SSDs you'd only need one. Access and execution speed would be slow.

If you use eight 8TB PCIe 4.0 SSDs, you get read bandwidth of at least 56 GB/s - not stellar, but definitely usable. With Sabrent Rocket 8TB drives, this will set you back less than $10K.
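
A minimal Python sketch of where those figures come from. The 7 GB/s per-drive sequential read rate is an assumption (it is simply 56 GB/s divided by eight drives), and the sketch assumes reads scale linearly across drives, which real systems may not achieve. Variable names are illustrative.

```python
# Back-of-the-envelope bandwidth and cost for an eight-drive array.
drive_count = 8
per_drive_read_gb_s = 7.0  # assumed per-drive sequential read rate, in GB/s
budget_usd = 10_000        # the "less than $10K" ceiling from the post

# Assumes ideal linear scaling across drives.
aggregate_read_gb_s = drive_count * per_drive_read_gb_s
max_price_per_drive = budget_usd / drive_count

# Time to read the full 64 TB (64,000 GB) once at the aggregate rate.
full_scan_seconds = (drive_count * 8_000) / aggregate_read_gb_s

print(f"{aggregate_read_gb_s:.0f} GB/s aggregate")
print(f"under ${max_price_per_drive:.0f} per drive")
print(f"about {full_scan_seconds / 60:.0f} minutes to read everything once")
```

Under these assumptions the array reaches 56 GB/s, each drive has to cost under $1,250, and one full pass over the data takes about 19 minutes.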

Rob1C said:
Similarly we could use fewer DIMM slots by simply using larger DIMMs:
https://www.tomshardware.com/news/samsung-talks-1tb-ddr5-modules-ddr5-7200

Large DIMMs like that are reserved for preferred customers, and available at eye-watering prices.

So going to either extreme isn't practical; we'd need to land somewhere near the middle ground.

More to the point of the comment, which you completely missed, they submitted a change to support a configuration which is as unlikely for Intel systems as it is impossible for AMD systems.

The change was likely made in response to a customer request, as there are plenty of people who need systems with lots of RAM. And, of course, we will see these in wider use as prices drop, and by that time the issue will have been worked out.

Jaack18 · Apr 7, 2025

qxp said:
Not necessarily. On newer systems you can expand RAM by using PCIe 5.0 slots. You could also memory map SSDs; it takes only eight 8TB SSDs to need more than 64TB of addressing space.

Not necessarily; you need PCIe slots that support CXL to expand RAM.

qxp · Apr 7, 2025

Jaack18 said:
Not necessarily; you need PCIe slots that support CXL to expand RAM.

Indeed, that is what I meant by "newer systems".

USAFRet · Apr 7, 2025

nogaard777 said:
Don't let your hatred of a corporation lead you to make brainless assumptions about the individuals that work there.

'hatred'?
Of NVidia?

Interesting, because there is an NVidia GPU in my PC about 18" away from my keyboard.

Don't make brainless assumptions on things you know nothing about.

snemarch · Apr 8, 2025

Rob1C said:
You'd need an 8-way CPU to practically have enough slots for 64TiB.

This isn't about physically installed memory – it's about extending where in the **linear** memory map you can map **physical** memory regions.

And the KASLR security feature shuffles the **linear** map locations on each boot, and you want this address space to be large.
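
To make the address-space point concrete, here is a toy Python sketch. It is not the kernel's actual randomization algorithm: the window sizes, region size, and 1 GiB alignment are made-up numbers, and both function names are hypothetical. It only shows that 64 TiB corresponds to 46 address bits, and that a larger window leaves more aligned positions in which a region can be placed.

```python
TIB = 2**40
GIB = 2**30

def address_bits(size_bytes: int) -> int:
    """Bits needed to address every byte in a region of size_bytes."""
    return (size_bytes - 1).bit_length()

def aligned_slots(window_bytes: int, region_bytes: int, align_bytes: int) -> int:
    """Toy model: count aligned start positions for a region inside a window."""
    if region_bytes > window_bytes:
        return 0
    return (window_bytes - region_bytes) // align_bytes + 1

print(address_bits(64 * TIB))  # prints 46

# Hypothetical example: a 16 TiB region placed at 1 GiB alignment.
slots_small_window = aligned_slots(64 * TIB, 16 * TIB, GIB)
slots_large_window = aligned_slots(128 * TIB, 16 * TIB, GIB)
print(slots_small_window, slots_large_window)  # prints 49153 114689
```

In this toy model, doubling the window more than doubles the number of candidate positions, which is the sense in which a larger address space gives randomization more room.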

Rob1C · Apr 10, 2025

Rob1C said:
You'd need an 8-way CPU to practically have enough slots for 64TiB.

snemarch said:
This isn't about physically installed memory – it's about extending where in the **linear** memory map you can map **physical** memory regions.

That's not what that means, and bit_user knows not to chime in if they're ESL.

bit_user · Apr 10, 2025

Rob1C said:
That's not what that means, and bit_user knows not to chime in if they're ESL.

Please don't drag me into this and don't infer anything from my lack of involvement.

Thank you.

Rob1C · Apr 10, 2025

qxp said:
If you use eight 8TB PCIe 4.0 SSDs, you get read bandwidth of at least 56 GB/s - not stellar, but definitely usable. With Sabrent Rocket 8TB drives, this will set you back less than $10K.

You could use a Graid card and get 80-260 GB/s:
https://www.tomshardware.com/pc-com...raid-card-enables-mind-bending-storage-speeds

Rob1C · Apr 10, 2025

bit_user said:
Please don't drag me into this and don't infer anything from my lack of involvement.
Thank you.

You involved yourself, I see your thumbs up on their post.

USAFRet · Apr 10, 2025

OK.....enough back and forth sniping.
Deal?

bit_user · Apr 10, 2025

Rob1C said:
You involved yourself, I see your thumbs up on their post.

Okay, fine. @snemarch's point made sense to me, but I haven't followed the issue closely enough to say more definitively whether I think that's what this was primarily about. That's all I'm going to say.

snemarch · Apr 11, 2025

Rob1C said:
That's not what that means, and bit_user knows not to chime in if they're ESL.

It looks like I bumped into past history, which I have no interest in – I only meant to comment on the physical memory vs logical address space difference.

nogaard777 · Apr 11, 2025

USAFRet said:
'hatred'?
Of NVidia?

Interesting, because there is an NVidia GPU in my PC about 18" away from my keyboard.

Don't make brainless assumptions on things you know nothing about.

If that's all you read then my assumptions are correct.

bit_user said:
Why wager? If you know, you know. If you don't, well...

The latest data I found was from 2022:

By changesets

Employer	Number of Changesets	Percentage of total
Huawei Technologies	1281	9.2%
Intel	1254	9.0%
(Unknown)	1097	7.9%
Google	917	6.6%
Linaro	837	6.0%
AMD	750	5.4%
Red Hat	672	4.8%
(None)	564	4.0%
Meta	414	3.0%
NVIDIA	389	2.8%

By lines changed

Employer	Number of Lines	Percentage of total
Oracle	91852	12.0%
AMD	89761	11.7%
Google	56504	7.4%
Intel	44062	5.8%
(Unknown)	33765	4.4%
Realtek	33277	4.3%
Linaro	31234	4.1%
Huawei Technologies	27856	3.6%
NVIDIA	25441	3.3%
Red Hat	24073	3.1%

Source: https://lwn.net/Articles/915435/

So, AMD changed about 3.53 times as many lines as Nvidia, in 1.93 times as many changesets.

I stand corrected. But that's odd how many Huawei have put in.

USAFRet · Apr 11, 2025

nogaard777 said:
If that's all you read then my assumptions are correct.

No, I read your whole comment.

Just wondering how you figured the 'hatred' aspect.

But, whatever.

bit_user · Apr 11, 2025

nogaard777 said:
But that's odd how many Huawei have put in.

I assume it's because they have their own server CPUs, AI chips, and phones (which have their own SoC). For their phones, at least the first version of their "Harmony OS" was based on Android/Linux.
