News AMD patents configurable multi-chiplet GPU — illustration shows three dies

Even if they mitigate the additional latency, there's still the issue of how to make sure workloads get evenly distributed across the chiplets. The last thing a gamer wants is micro-stuttering like in multi-GPU setups. Considering AMD seems to want to avoid having software do this, that means a more sophisticated coordinator chip, which to me means they're going to lose a lot in the efficiency game here.
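To make that micro-stutter concern concrete, here's a minimal sketch with made-up numbers (nothing from the patent): a frame only completes when the slowest chiplet finishes, so any imbalance in how a coordinator splits the work stretches the frame time even though the total work is unchanged; if the imbalance varies frame to frame, that's exactly what stutter looks like.

```python
# Minimal sketch (made-up numbers, not from the patent): the frame is done only
# when the slowest chiplet is done, so frame time is the max of the per-chiplet
# times, not the average.

def frame_time_ms(work_units, chiplet_shares, ms_per_unit=0.004):
    """chiplet_shares: fraction of the frame's work handed to each chiplet."""
    per_chiplet = [work_units * share * ms_per_unit for share in chiplet_shares]
    return max(per_chiplet)  # the laggard sets the pace

WORK = 1000  # arbitrary work units in one frame

balanced = frame_time_ms(WORK, [1 / 3, 1 / 3, 1 / 3])  # ~1.33 ms
skewed = frame_time_ms(WORK, [0.50, 0.30, 0.20])       # ~2.00 ms

print(f"balanced split: {balanced:.2f} ms")
print(f"skewed split:   {skewed:.2f} ms  (same total work, ~50% longer frame)")
```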
 

hannibal

Distinguished
This will introduce latency between chiplets; interesting to see how it will affect performance.

Yes it does, but this should be OK for datacenter AI and computational usage!
For gaming… most likely not, unless they develop a really fast interconnect between the different dies.
But for computational usage? Sure! These could be real monsters without becoming extremely expensive!
 

vanadiel007

Distinguished
Oct 21, 2015
This is the way to go, as you can control die size, TDP, and processing capability with it. You are already seeing gigantic chips from Nvidia that require a ton of power and cooling.

Making it from small subsections that can operate independently will decrease power consumption compared to a single gigantic die.

I am thinking Nvidia will run into a roadblock in the coming years with their design.
 
vanadiel007 said:
This is the way to go, as you can control die size, TDP, and processing capability with it. You are already seeing gigantic chips from Nvidia that require a ton of power and cooling.
The trade-off, however, is that you need to create a more robust and complex communication system between the chiplets, and communication buses can eat significantly into the power budget.

There's also the question of how much granularity the controller chip has over resources within the GPU dies themselves. Otherwise AMD is just stitching Crossfire together on a single card, and that will come with its own problems for gaming. Real-time rendering does not tolerate extra latency well.
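To put very rough numbers on the bus-power point, here's a sketch assuming nothing more than "link power is roughly energy-per-bit times traffic"; the pJ/bit and bandwidth values are illustrative placeholders, not figures for any real AMD interconnect.

```python
# Back-of-the-envelope sketch of "buses eat into the power budget":
# link power ~= energy per bit x bits moved per second. The pJ/bit and
# bandwidth figures are assumptions for illustration, not AMD numbers.

def link_power_watts(bandwidth_gb_per_s, picojoules_per_bit):
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return bits_per_second * picojoules_per_bit * 1e-12

TRAFFIC_GBPS = 1000  # 1 TB/s of die-to-die traffic, purely illustrative

for label, pj in [("short on-package link (~0.5 pJ/bit, assumed)", 0.5),
                  ("board-level link (~5 pJ/bit, assumed)", 5.0)]:
    print(f"{label}: {link_power_watts(TRAFFIC_GBPS, pj):.0f} W")
# ~4 W vs ~40 W for the same 1 TB/s -- distance and PHY choice matter a lot
```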

vanadiel007 said:
Making it from small subsections that can operate independently will decrease power consumption compared to a single gigantic die.
You can power gate certain sections of the die as well. The main reason Maxwell was so much more efficient than Kepler, despite being built on the same transistor process, was that the SM grouping was more granular: less of each SM went to waste, more work got done because there were more available places for it to go, and any SMs that weren't in use could be power gated.
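A toy model of that granularity argument, with made-up group sizes and workload rather than the real Kepler/Maxwell SM layouts:

```python
import math

# Toy model of gating granularity (numbers are illustrative, not the real
# Kepler/Maxwell SM layouts): if you can only power gate whole groups of
# execution units, coarser groups leave more units powered but idle.

def powered_units(work_units, group_size):
    """Units that must stay powered when gating is only possible per group."""
    groups_needed = math.ceil(work_units / group_size)
    return groups_needed * group_size

WORK = 520  # units of work in flight this frame, arbitrary

for group_size in (64, 256):  # fine vs coarse grouping
    powered = powered_units(WORK, group_size)
    print(f"group size {group_size:3d}: {powered} units powered, "
          f"{powered - WORK} idle but still burning power")
# finer groups -> fewer idle-but-powered units -> more of the die can be gated off
```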
 

salgado18

Distinguished
Feb 12, 2007
GPUs are highly parallel by design, so my guess is there will be little to no latency issue. It can cut costs by using a single chiplet design, producing it in very large quantities, and joining a variable number of them for each tier of GPU. Individual chiplets can be switched on or off based on a smart distribution of tasks to save even more power. And it can scale from entry-level products (like an RX 8400) to something not even Nvidia can reach with a single chip. And, up next (my guess), ray tracing will have its own chiplet someday, with the same benefits and even more configuration options.

Downside: it has to work, because they are way behind in the race by now.
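The cost half of that argument can be sketched with a textbook Poisson yield model; the defect density and die areas below are assumptions for illustration, not foundry or AMD data.

```python
import math

# Simple Poisson yield model: yield ~ exp(-defect_density * die_area).
# Defect density and die sizes are assumptions for illustration only.

DEFECTS_PER_MM2 = 0.001  # i.e. 0.1 defects/cm^2, assumed

def die_yield(area_mm2):
    return math.exp(-DEFECTS_PER_MM2 * area_mm2)

MONO_AREA = 600      # one big monolithic die, mm^2 (assumed)
CHIPLET_AREA = 150   # one chiplet, mm^2 (assumed); four make up one GPU

y_mono = die_yield(MONO_AREA)
y_chip = die_yield(CHIPLET_AREA)

# Silicon that must be fabricated per *working* GPU (packaging cost ignored,
# known-good-die testing assumed so only good chiplets get combined):
mm2_per_gpu_mono = MONO_AREA / y_mono
mm2_per_gpu_chip = 4 * CHIPLET_AREA / y_chip

print(f"monolithic yield {y_mono:.1%}, chiplet yield {y_chip:.1%}")
print(f"silicon per working GPU: {mm2_per_gpu_mono:.0f} mm^2 (mono) "
      f"vs {mm2_per_gpu_chip:.0f} mm^2 (4 chiplets)")
```

Under those assumed numbers the four-chiplet build needs roughly a third less fabricated silicon per working GPU; packaging and the extra interconnect eat into that, but it shows why the approach scales better as dies get huge.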
 
Let's hope this is the Zen moment for GPUs that it was for their CPUs... we really need a rival to Nvidia.

Also, I don't see latency being much of an issue, given the experience they have with it from many years of Zen.
 

jp7189

Distinguished
Feb 21, 2012
salgado18 said:
GPUs are highly parallel by design, so my guess is there will be little to no latency issue. It can cut costs by using a single chiplet design, producing it in very large quantities, and joining a variable number of them for each tier of GPU. Individual chiplets can be switched on or off based on a smart distribution of tasks to save even more power. And it can scale from entry-level products (like an RX 8400) to something not even Nvidia can reach with a single chip. And, up next (my guess), ray tracing will have its own chiplet someday, with the same benefits and even more configuration options.

Downside: it has to work, because they are way behind in the race by now.
The problem is: how do you split up the work of real-time rendering? And after each chunk of work is done, how do you put it back together into a single output (the monitor) in a highly consistent, low-latency manner?

SLI and Crossfire had various strategies, but never got anywhere close to perfect scaling and had lots of latency issues. If you're not darn close to perfect scaling, you end up with an efficiency nightmare on high-end cards.

Datacenter workloads are designed from the ground up to tolerate being distributed across cards/nodes/racks, and they have mechanisms to deal with timing/latency/coherency. Games historically have not been, and really cannot be, that forgiving.
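One way to see how quickly "not darn close to perfect scaling" hurts is to apply Amdahl's law per frame, treating the split/sync/composite work as a serial fraction; the 10% figure below is an assumption for illustration only.

```python
# Amdahl's-law sketch of the scaling problem: the fraction of each frame spent
# splitting work, synchronizing, and compositing doesn't shrink as chiplets are
# added. The 10% serial fraction is an assumption for illustration only.

def frame_speedup(n_chiplets, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_chiplets)

SERIAL = 0.10  # assumed per-frame split/sync/composite share

for n in (1, 2, 4, 8):
    print(f"{n} chiplet(s): {frame_speedup(n, SERIAL):.2f}x")
# 1.00x, 1.82x, 3.08x, 4.71x -- a long way from the ideal 8x at eight chiplets
```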
 
salgado18 said:
GPUs are highly parallel by design, so my guess is there will be little to no latency issue.
How does being highly parallel have anything to do with meeting real-time latency requirements?

Also, I don't see latency being much of an issue, given the experience they have with it from many years of Zen.
And yet Ryzen 9 processors still have problems with cross-CCD latency.
 

KnightShadey

Reputable
Sep 16, 2020
jp7189 said:
SLI and Crossfire had various strategies, but never got anywhere close to perfect scaling and had lots of latency issues.

The chiplets have nowhere near the distance to cover that SLI/CF had; even compared to multi-GPU cards like the 7950GX2 or R3870X2, we're talking millimetres, not centimetres.

Also, SLI & CF latencies were measured in multiple milliseconds, while UCIe latency is about 2 ns chiplet to chiplet, with far higher data throughput.

So, this definitely isn't your father's Xfire.

One wouldn't expect perfect scaling, but they could very likely achieve a net benefit versus the yield problems of large monolithic solutions, resulting in better performance per W/$.
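A rough sanity check on those orders of magnitude, as a sketch; the old-style sync figure and the per-frame hop count are assumptions, not measurements.

```python
# Rough budget check on the ms-vs-ns comparison above. Frame budgets are exact;
# the 2 ms sync figure and the per-frame hop count are illustrative assumptions
# (real traffic is far more complex than a flat count of round trips).

OLD_SYNC_MS = 2.0        # order-of-milliseconds multi-GPU sync, assumed
HOP_NS = 2.0             # ~2 ns die-to-die hop, as quoted above
HOPS_PER_FRAME = 10_000  # assumed cross-chiplet round trips per frame

cross_die_ms = HOPS_PER_FRAME * HOP_NS * 1e-6  # total ns -> ms

for fps in (60, 144, 240):
    budget_ms = 1000 / fps
    print(f"{fps:3d} fps: budget {budget_ms:5.2f} ms | "
          f"old-style sync = {OLD_SYNC_MS / budget_ms:5.1%} of the frame | "
          f"{HOPS_PER_FRAME} on-package hops = {cross_die_ms / budget_ms:4.2%}")
```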
 

KnightShadey

Reputable
Sep 16, 2020
For those wondering where multi-GPUs & their X-fire support went, well, now you know...;)
Ironically, it faded just after Mantle's multi-GPU optimizations were rolled into DX12 and, via Khronos, became the foundation of Vulkan, with better pipeline & resource management.

Of course, in the case of CPUs & VPUs... timing is everything. 🤪
 

JayNor

Honorable
May 31, 2019
This is also a patent application:

Flexible partitioning of GPU resources



US20230297440A1 • David Cowperthwaite • Intel Corporation
Priority 2022-03-18 • Filed 2022-05-27 • Published 2023-09-21