News AMD patents configurable multi-chiplet GPU — illustration shows three dies

Even if they mitigate the additional latency, there's still the issue of how to make sure workloads get evenly distributed across the chiplets. The last thing a gamer wants is micro-stuttering like in multi-GPU setups. Considering AMD seems to want to avoid having software do this, that means a more sophisticated coordinator chip, which to me means they're going to lose a lot in the efficiency game here.
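To make that micro-stutter concern concrete, here's a minimal sketch with made-up numbers (nothing from the patent): a frame only completes when the slowest chiplet finishes, so any imbalance in how a coordinator splits the work stretches the frame time even though the total work is unchanged; if the imbalance varies frame to frame, that's exactly what stutter looks like.

```python
# Minimal sketch (made-up numbers, not from the patent): the frame is done only
# when the slowest chiplet is done, so frame time is the max of the per-chiplet
# times, not the average.

def frame_time_ms(work_units, chiplet_shares, ms_per_unit=0.004):
    """chiplet_shares: fraction of the frame's work handed to each chiplet."""
    per_chiplet = [work_units * share * ms_per_unit for share in chiplet_shares]
    return max(per_chiplet)  # the laggard sets the pace

WORK = 1000  # arbitrary work units in one frame

balanced = frame_time_ms(WORK, [1 / 3, 1 / 3, 1 / 3])  # ~1.33 ms
skewed = frame_time_ms(WORK, [0.50, 0.30, 0.20])       # ~2.00 ms

print(f"balanced split: {balanced:.2f} ms")
print(f"skewed split:   {skewed:.2f} ms  (same total work, ~50% longer frame)")
```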
 

hannibal

Distinguished
This will introduce latency between chiplets; interesting to see how it will affect performance.

Yes it does, but this should be OK for datacenter AI and computational usage!
For gaming… most likely not, unless they develop a really fast interconnect between the different dies.
But for computational usage? Sure! These could be real monsters without becoming extremely expensive!
 

vanadiel007

Distinguished
Oct 21, 2015
This is the way to go, as you can control die size, TDP, and processing capability with it. You are already seeing gigantic chips from Nvidia that require a ton of power and cooling.

Making it from small subsections that can operate independently will decrease power consumption compared to a single gigantic die.

I am thinking Nvidia will run into a roadblock in the coming years with their design.
 
vanadiel007 said:
This is the way to go, as you can control die size, TDP, and processing capability with it. You are already seeing gigantic chips from Nvidia that require a ton of power and cooling.
The trade-off, however, is that you need to create a more robust and complex communication system between the chiplets, and communication buses can eat significantly into the power budget.

There's also the question of how much granularity the controller chip has over resources within the GPU dies themselves. Otherwise AMD is just stitching Crossfire together on a single card, and that will come with its own problems for gaming. Real-time rendering does not tolerate extra latency well.
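To put very rough numbers on the bus-power point, here's a sketch assuming nothing more than "link power is roughly energy-per-bit times traffic"; the pJ/bit and bandwidth values are illustrative placeholders, not figures for any real AMD interconnect.

```python
# Back-of-the-envelope sketch of "buses eat into the power budget":
# link power ~= energy per bit x bits moved per second. The pJ/bit and
# bandwidth figures are assumptions for illustration, not AMD numbers.

def link_power_watts(bandwidth_gb_per_s, picojoules_per_bit):
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return bits_per_second * picojoules_per_bit * 1e-12

TRAFFIC_GBPS = 1000  # 1 TB/s of die-to-die traffic, purely illustrative

for label, pj in [("short on-package link (~0.5 pJ/bit, assumed)", 0.5),
                  ("board-level link (~5 pJ/bit, assumed)", 5.0)]:
    print(f"{label}: {link_power_watts(TRAFFIC_GBPS, pj):.0f} W")
# ~4 W vs ~40 W for the same 1 TB/s -- distance and PHY choice matter a lot
```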

vanadiel007 said:
Making it from small subsections that can operate independently will decrease power consumption compared to a single gigantic die.
You can power gate certain sections of the die as well. The main reason Maxwell was so much more efficient than Kepler, despite being built on the same transistor process, was that the SM grouping was more granular: less of each SM went to waste, more work got done because there were more available places for it to go, and any SMs that weren't in use could be power gated.
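A toy model of that granularity argument, with made-up group sizes and workload rather than the real Kepler/Maxwell SM layouts:

```python
import math

# Toy model of gating granularity (numbers are illustrative, not the real
# Kepler/Maxwell SM layouts): if you can only power gate whole groups of
# execution units, coarser groups leave more units powered but idle.

def powered_units(work_units, group_size):
    """Units that must stay powered when gating is only possible per group."""
    groups_needed = math.ceil(work_units / group_size)
    return groups_needed * group_size

WORK = 520  # units of work in flight this frame, arbitrary

for group_size in (64, 256):  # fine vs coarse grouping
    powered = powered_units(WORK, group_size)
    print(f"group size {group_size:3d}: {powered} units powered, "
          f"{powered - WORK} idle but still burning power")
# finer groups -> fewer idle-but-powered units -> more of the die can be gated off
```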
 

salgado18

Distinguished
Feb 12, 2007
GPUs are highly parallel by design, so my guess is there will be little to no latency issue. It can cut costs by using a single chiplet design, producing it in very large quantities, and joining a variable number of them for each tier of GPU. Individual chiplets can be switched on or off based on a smart distribution of tasks to save even more power. And it can scale from entry-level products (like an RX 8400) to something not even Nvidia can reach with a single chip. And, up next (my guess), ray tracing will have its own chiplet someday, with the same benefits and even more configuration options.

Downside: it has to work, because they are way behind in the race by now.
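The cost half of that argument can be sketched with a textbook Poisson yield model; the defect density and die areas below are assumptions for illustration, not foundry or AMD data.

```python
import math

# Simple Poisson yield model: yield ~ exp(-defect_density * die_area).
# Defect density and die sizes are assumptions for illustration only.

DEFECTS_PER_MM2 = 0.001  # i.e. 0.1 defects/cm^2, assumed

def die_yield(area_mm2):
    return math.exp(-DEFECTS_PER_MM2 * area_mm2)

MONO_AREA = 600      # one big monolithic die, mm^2 (assumed)
CHIPLET_AREA = 150   # one chiplet, mm^2 (assumed); four make up one GPU

y_mono = die_yield(MONO_AREA)
y_chip = die_yield(CHIPLET_AREA)

# Silicon that must be fabricated per *working* GPU (packaging cost ignored,
# known-good-die testing assumed so only good chiplets get combined):
mm2_per_gpu_mono = MONO_AREA / y_mono
mm2_per_gpu_chip = 4 * CHIPLET_AREA / y_chip

print(f"monolithic yield {y_mono:.1%}, chiplet yield {y_chip:.1%}")
print(f"silicon per working GPU: {mm2_per_gpu_mono:.0f} mm^2 (mono) "
      f"vs {mm2_per_gpu_chip:.0f} mm^2 (4 chiplets)")
```

Under those assumed numbers the four-chiplet build needs roughly a third less fabricated silicon per working GPU; packaging and the extra interconnect eat into that, but it shows why the approach scales better as dies get huge.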
 
Let's hope this is the Zen moment for GPUs that it was for their CPUs... we really need a rival to Nvidia.

Also, I don't see latency being much of an issue, given the experience they have with it from many years of Zen.
 

jp7189

Distinguished
Feb 21, 2012
salgado18 said:
GPUs are highly parallel by design, so my guess is there will be little to no latency issue. It can cut costs by using a single chiplet design, producing it in very large quantities, and joining a variable number of them for each tier of GPU. Individual chiplets can be switched on or off based on a smart distribution of tasks to save even more power. And it can scale from entry-level products (like an RX 8400) to something not even Nvidia can reach with a single chip. And, up next (my guess), ray tracing will have its own chiplet someday, with the same benefits and even more configuration options.

Downside: it has to work, because they are way behind in the race by now.
The problem is: how do you split up the work of real-time rendering? And after each chunk of work is done, how do you put it back together into a single output (the monitor) in a highly consistent, low-latency manner?

SLI and Crossfire had various strategies, but never got anywhere close to perfect scaling and had lots of latency issues. If you're not darn close to perfect scaling, you end up with an efficiency nightmare on high-end cards.

Datacenter workloads are designed from the ground up to tolerate being distributed across cards/nodes/racks, and they have mechanisms to deal with timing/latency/coherency. Games historically have not been, and really cannot be, that forgiving.
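One way to see how quickly "not darn close to perfect scaling" hurts is to apply Amdahl's law per frame, treating the split/sync/composite work as a serial fraction; the 10% figure below is an assumption for illustration only.

```python
# Amdahl's-law sketch of the scaling problem: the fraction of each frame spent
# splitting work, synchronizing, and compositing doesn't shrink as chiplets are
# added. The 10% serial fraction is an assumption for illustration only.

def frame_speedup(n_chiplets, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_chiplets)

SERIAL = 0.10  # assumed per-frame split/sync/composite share

for n in (1, 2, 4, 8):
    print(f"{n} chiplet(s): {frame_speedup(n, SERIAL):.2f}x")
# 1.00x, 1.82x, 3.08x, 4.71x -- a long way from the ideal 8x at eight chiplets
```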
 
salgado18 said:
GPUs are highly parallel by design, so my guess is there will be little to no latency issue.
How does being highly parallel have anything to do with meeting real-time latency requirements?

Also, I don't see latency being much of an issue, given the experience they have with it from many years of Zen.
And yet Ryzen 9 processors still have problems with cross-CCD latency.
 

KnightShadey

Reputable
Sep 16, 2020
jp7189 said:
SLI and Crossfire had various strategies, but never got anywhere close to perfect scaling and had lots of latency issues.

The chiplets have nowhere near the distance to cover that SLI/CF had; even compared to multi-GPU cards like the 7950GX2 or R3870X2, we're talking millimetres, not centimetres.

Also, SLI & CF latencies were measured in multiple milliseconds, while UCIe latency is about 2 ns chiplet to chiplet, with far higher data throughput.

So, this definitely isn't your father's Xfire.

One wouldn't expect perfect scaling, but they could very likely achieve a net benefit versus the yield problems of large monolithic solutions, resulting in better performance per W/$.
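A rough sanity check on those orders of magnitude, as a sketch; the old-style sync figure and the per-frame hop count are assumptions, not measurements.

```python
# Rough budget check on the ms-vs-ns comparison above. Frame budgets are exact;
# the 2 ms sync figure and the per-frame hop count are illustrative assumptions
# (real traffic is far more complex than a flat count of round trips).

OLD_SYNC_MS = 2.0        # order-of-milliseconds multi-GPU sync, assumed
HOP_NS = 2.0             # ~2 ns die-to-die hop, as quoted above
HOPS_PER_FRAME = 10_000  # assumed cross-chiplet round trips per frame

cross_die_ms = HOPS_PER_FRAME * HOP_NS * 1e-6  # total ns -> ms

for fps in (60, 144, 240):
    budget_ms = 1000 / fps
    print(f"{fps:3d} fps: budget {budget_ms:5.2f} ms | "
          f"old-style sync = {OLD_SYNC_MS / budget_ms:5.1%} of the frame | "
          f"{HOPS_PER_FRAME} on-package hops = {cross_die_ms / budget_ms:4.2%}")
```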
 

KnightShadey

Reputable
Sep 16, 2020
For those wondering where multi-GPUs & their X-fire support went, well, now you know...;)
Ironically, it faded just after Mantle's multi-GPU optimizations were rolled into DX12 and, via Khronos, became the foundation of Vulkan, with better pipeline & resource management.

Of course, in the case of CPUs & VPUs... timing is everything. 🤪
 

JayNor

Honorable
May 31, 2019
This is also a patent application:

Flexible partitioning of GPU resources



US20230297440A1 • David Cowperthwaite • Intel Corporation
Priority 2022-03-18 • Filed 2022-05-27 • Published 2023-09-21