One nitpick with this way of phrasing: "That means the big Infinity Cache gave AMD a 50% boost to effective bandwidth".
The cache on the GPU doesn't give the card more memory bandwidth, much like AMD's 3D V-Cache doesn't magically give DDR4 more bandwidth. I know what the implied point is, but I don't think it should be explained that way at all. Avoiding trips over the GDDR/DDR bus to fetch data is not the same as increasing that bus's effective bandwidth. Saturate that cache and you're back to using the slow lane. On initial load, you still use the slow lane. Etc...
Other than that, thanks for the information. I do not look forward to 600W GPUs. Ugh.
Regards.
In simple terms that's true, but it's more complicated than that.
It's like how an SSD cache works in a NAS. The HDD pools can't reach high sustained rates, maybe 400-600 MB/s depending on the NAS design, but an SSD cache can boost the NAS to over 1 GB/s. That only holds for as much data as fits in the cache (or as the NAS config allows), so it can absorb tens or hundreds of gigabytes before the cache fills and the NAS falls back to the HDD pool. In the background, the data in the cache gets written out to the HDD pool.
Any NAS builder can build HDD pools that are large and fast enough, but that comes at a cost (HDDs + infrastructure + space + energy). That's why an SSD cache is a good option here.
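To put rough numbers on the NAS analogy, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `transfer_seconds`, the 500 GB cache size, and the simple model where the cache drains to the HDD pool at full pool speed while it is being filled. Real NAS caching policies will differ.

```python
def transfer_seconds(total_gb, cache_gb, ingest_mb_s, hdd_mb_s):
    """Estimate how long a sustained write of total_gb takes on a cached NAS.

    Assumed model (not any vendor's actual policy): while the cache has room,
    the client writes at ingest_mb_s and the NAS drains the cache to the HDD
    pool at hdd_mb_s in the background. Once the cache is full, the client
    is throttled to hdd_mb_s.
    """
    total_mb = total_gb * 1000
    cache_mb = cache_gb * 1000
    net_fill = ingest_mb_s - hdd_mb_s  # MB/s piling up in the cache
    if net_fill <= 0:
        # The HDD pool keeps up on its own, so the cache never fills.
        return total_mb / ingest_mb_s
    fast_seconds = cache_mb / net_fill        # time until the cache is full
    fast_mb = fast_seconds * ingest_mb_s      # data accepted at cache speed
    if total_mb <= fast_mb:
        return total_mb / ingest_mb_s
    # The remainder goes at HDD pool speed.
    return fast_seconds + (total_mb - fast_mb) / hdd_mb_s


# Hypothetical numbers: 500 GB cache, 1000 MB/s ingest, 500 MB/s HDD pool.
for size_gb in (100, 2000):
    secs = transfer_seconds(size_gb, cache_gb=500, ingest_mb_s=1000, hdd_mb_s=500)
    print(f"{size_gb} GB: {secs:.0f} s, average {size_gb * 1000 / secs:.0f} MB/s")
```

With these made-up numbers, the 100 GB transfer runs at the full 1000 MB/s, while the 2000 GB transfer averages about 667 MB/s because half of it ends up waiting on the HDD pool.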
The same goes for the IC. GPU designers usually lean on VRAM bandwidth to save on cache, which is expensive in silicon. VRAM buses are fine up to 256-bit; beyond that, PCB cost and the engineering for all the traces become a problem. 384-bit is doable now, but anything wider is questionable, which is why it's still rare, and we've only seen 512-bit VRAM buses a few times.
There's a point where having enough cache in the GPU reduces the need for higher-bandwidth VRAM. Only a small percentage of the total VRAM traffic actually needs that much bandwidth; the rest doesn't. So if you serve that small percentage from a large enough cache, you can get away with lower-bandwidth VRAM. That's also why increasing the cache more and more will not give you a linear performance increase.
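A simple way to see both sides of this argument is a hit-rate model, sketched below in Python. This is a textbook-style approximation, not AMD's own math: it assumes a fraction `hit_rate` of memory traffic is served by the cache and only the misses go to VRAM. The function names and the 512 GB/s figure are made up for illustration, and how hit rate changes with cache size is workload-dependent and not modeled here.

```python
def effective_bandwidth(vram_gb_s, hit_rate, cache_gb_s=float("inf")):
    """Total request rate (GB/s) the memory system can sustain.

    Assumed model: VRAM only serves misses, so total * (1 - hit_rate) must
    fit in vram_gb_s; the cache serves hits, so total * hit_rate must fit
    in cache_gb_s.
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    vram_limit = vram_gb_s / (1.0 - hit_rate)
    cache_limit = cache_gb_s / hit_rate if hit_rate > 0 else float("inf")
    return min(vram_limit, cache_limit)


def required_vram_bandwidth(demand_gb_s, hit_rate):
    """VRAM bandwidth needed to feed demand_gb_s when the cache absorbs hits."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    return demand_gb_s * (1.0 - hit_rate)


# Hypothetical 512 GB/s VRAM at a few hit rates.
for h in (0.0, 1 / 3, 0.5):
    print(f"hit rate {h:.2f}: {effective_bandwidth(512, h):.0f} GB/s effective")
```

Under this model a 50% boost corresponds to a hit rate of about one third, and a hit rate of 0 (cold cache, or a working set that doesn't fit) drops you straight back to raw VRAM bandwidth, which is the "slow lane" point. Going the other way, `required_vram_bandwidth(768, 0.5)` returns 384.0, i.e. the same demand could be fed by a narrower or slower VRAM setup if the cache catches half the traffic.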