Could AMD use these connectors to have two layers of processing chiplets there?
Instead of V-Cache, two full processing cores, one on top of the other?
Thermals would seem to prevent doing something like that. If I'm right, then stacking logic dies is only something you'd do in embedded processors that run at low clock speeds for the sake of power efficiency, but need extremely large amounts of compute (e.g. vision & AI for robotics).
It would be cool to see processing-in-memory applied here.
So much of what you describe sounds exactly like what Samsung and SK Hynix have been doing with their processing-enhanced memory products.
You can find details of SK Hynix's GDDR6-AiM at a link I left in the comments of this article:
You can see a rundown of Samsung's efforts in this area, here:
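To make the processing-in-memory idea concrete, here's a rough sketch (in C) of the difference between the two data paths. The pim_exec_mac() call and its command descriptor are hypothetical stand-ins for a driver interface, not the actual SK Hynix or Samsung API; the point is just that the host ships a small command instead of streaming both operand arrays across the memory bus.

```c
#include <stdint.h>
#include <stddef.h>

/* Conventional path: every element crosses the memory bus to the host. */
int64_t dot_host(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)a[i] * b[i];   /* 2 reads per element over the bus */
    return acc;
}

/* PIM-style path: the host sends one command descriptor; MAC units
 * sitting beside the DRAM banks do the work and return only the
 * scalar result. pim_exec_mac() is a hypothetical driver call, not
 * a real SK Hynix or Samsung API. */
typedef struct {
    uint64_t addr_a, addr_b;  /* operand locations inside the memory die */
    size_t   count;           /* element count */
} pim_mac_cmd;

extern int64_t pim_exec_mac(const pim_mac_cmd *cmd);  /* hypothetical */

int64_t dot_pim(uint64_t addr_a, uint64_t addr_b, size_t n)
{
    pim_mac_cmd cmd = { addr_a, addr_b, n };
    return pim_exec_mac(&cmd);  /* ~16 bytes out, 8 bytes back */
}
```

As I understand it, GDDR6-AiM puts a multiply-accumulate unit next to each bank, so in that second path the operand traffic never leaves the memory die.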
AMD can leverage FPGAs from its Xilinx acquisition to accelerate certain edge functions.
Indeed, it would be an interesting application of FPGAs, though AMD would probably have to partner with a memory maker, because it's most efficient to put the processing in the memory dies themselves. Putting it in the MCDs wouldn't save you much over simply integrating it directly into the GCD, since the expensive hop is crossing the DRAM pins, not the short MCD-to-GCD link.
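To put rough numbers on that, here's a back-of-envelope sketch. The energy-per-bit figures are purely illustrative assumptions, not measured values for any real product; what matters is their ordering: crossing the DRAM pins costs far more than the short MCD-to-GCD hop.

```c
#include <stdio.h>

/* Illustrative energy-per-bit figures (assumed, order-of-magnitude only). */
#define PJ_BIT_BANK_LOCAL   1.0   /* bank -> in-die logic            */
#define PJ_BIT_TO_MCD       4.0   /* DRAM pins -> MCD (off-die hop)  */
#define PJ_BIT_MCD_TO_GCD   1.5   /* MCD -> GCD over the fanout link */

int main(void)
{
    double bits = 1e9;  /* move 1 Gbit of operand data */

    double in_dram = bits * PJ_BIT_BANK_LOCAL;
    double in_mcd  = bits * (PJ_BIT_BANK_LOCAL + PJ_BIT_TO_MCD);
    double in_gcd  = bits * (PJ_BIT_BANK_LOCAL + PJ_BIT_TO_MCD + PJ_BIT_MCD_TO_GCD);

    printf("compute in DRAM die: %.1f mJ\n", in_dram / 1e9);
    printf("compute in MCD:      %.1f mJ\n", in_mcd  / 1e9);
    printf("compute in GCD:      %.1f mJ\n", in_gcd  / 1e9);

    /* The MCD only skips the last, cheapest hop relative to the GCD,
     * while in-die processing skips the expensive pin crossing. */
    return 0;
}
```

With these (assumed) numbers, MCD placement saves only about a quarter of the energy versus the GCD, while in-die processing saves the large majority.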
As memory accelerators, FPGAs can learn common patterns in memory accesses and accelerate them, just as in networking.
Exactly where are FPGAs being used to learn memory access patterns in networking? I'm skeptical of this claim for several reasons, not least of which is that FPGAs take a long time to reconfigure:
many orders of magnitude longer than the timescale on which memory access patterns can change. Also, a high-performance memory controller is something you'd typically want to hard-wire, in order to keep its power, thermals, and area under control.
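For contrast, here's roughly what a hard-wired stride prefetcher looks like: the kind of pattern detector memory subsystems actually use. The table size and confidence threshold are illustrative. It retrains on a new pattern within a handful of accesses, whereas reprogramming an FPGA bitstream, even partially, is orders of magnitude slower.

```c
#include <stdint.h>
#include <stdbool.h>

/* Minimal stride-prefetcher sketch: logic like this is baked into
 * fixed-function hardware because it must react on every access.
 * Sizes and thresholds here are illustrative, not from any real design. */

#define PF_ENTRIES 64         /* tracked instruction streams (assumed) */

typedef struct {
    uint64_t last_addr;       /* previous address from this PC  */
    int64_t  stride;          /* last observed delta            */
    uint8_t  confidence;      /* saturating counter, 0..3       */
    bool     valid;
} pf_entry;

static pf_entry table[PF_ENTRIES];

/* Called on every demand access; returns a prefetch address or 0. */
uint64_t prefetcher_observe(uint64_t pc, uint64_t addr)
{
    pf_entry *e = &table[pc % PF_ENTRIES];

    if (!e->valid) {
        *e = (pf_entry){ .last_addr = addr, .valid = true };
        return 0;
    }

    int64_t delta = (int64_t)(addr - e->last_addr);
    if (delta == e->stride && delta != 0) {
        if (e->confidence < 3) e->confidence++;
    } else {
        e->stride = delta;
        e->confidence = 0;    /* retrain instantly on a new pattern */
    }
    e->last_addr = addr;

    /* Issue a prefetch only once the stride has repeated. */
    return (e->confidence >= 2) ? addr + (uint64_t)e->stride : 0;
}
```

Nothing here needs to "learn" in the FPGA sense; it's a few adders, comparators, and a small table, which is exactly why you'd hard-wire it.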