InvalidError:
[HBM will likely get used across the whole mid-to-high-end range of GPUs starting with the 14/16nm parts next year, since it eliminates the expensive 8+ layer PCB needed to run a 256-512-bit-wide GDDR5 bus from the GPU BGA to
Of course they will use it anywhere it can possibly cut costs in the future, but I'm talking about this year, in terms of using it on only one high-end card. Frankly, I think AMD will be totally hosed if they don't get this to market within months, because they will have lost their competitive advantage by 2016.
Maybe I'm misinterpreting where you think redundant memory-controller logic would come into play, but I see no reason for that to happen if you're using only HBM within a system. HBM shouldn't sit between the CPU and DRAM; it eliminates the DRAM entirely. That is why I find the concept interesting: it lets you bake all the memory into the APU package. A DDR/HBM hybrid system would be more complicated than current computers, meaning you lose all the advantages in efficiency and mobo size. That could be fine for the desktop workstation market, where size and efficiency are not issues, but I think using this to build overcomplicated workstations is pretty far from AMD's goals, or at the very least uninspired.
Other than that, and assuming HBM never advances beyond the limit of 4 stacks per processor, that could still mean 8 stacks per APU: 4 stacks on the CPU and 4 on the GPU.
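For scale, a quick back-of-envelope in Python. The per-stack figures are assumptions based on first-generation HBM (roughly 1 GB and ~128 GB/s per stack); later generations should raise both:

```python
# Back-of-envelope HBM math. Per-stack figures are assumptions based on
# first-generation HBM (1 GB and ~128 GB/s per stack).
GB_PER_STACK = 1        # HBM1: 1 GB per stack (assumed)
GBPS_PER_STACK = 128    # HBM1: 1024-bit bus at 1 Gbps/pin ~= 128 GB/s

def apu_memory(cpu_stacks: int = 4, gpu_stacks: int = 4):
    stacks = cpu_stacks + gpu_stacks
    return stacks * GB_PER_STACK, stacks * GBPS_PER_STACK

capacity, bandwidth = apu_memory()
print(f"{capacity} GB total, {bandwidth} GB/s aggregate")  # 8 GB, 1024 GB/s
```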
The way I envision this going, each APU is basically a full-powered system on a chip, minus the bulky interfaces that have to live on the motherboard and the things you would never need more than one of, like a sound card.
My question is: with the memory baked into the APU, what stops you from just adding more APUs when you need more RAM or more GPU power? One of the biggest challenges in multiprocessor systems is the memory interface, and the sheer number of pins each processor needs to talk to that memory; all those interfaces eat a lot of board space, after all. You also need pins for things like PCIe, but that need is lessened considerably if you don't need the bandwidth to talk to a separate GPU. Regardless, I believe that even with a lot of pins, an APU module wouldn't need to be bigger than what is used for DDR.
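To put rough numbers on the pin argument, here's an illustrative comparison; the bus width is real, but the address/command/clock overhead factor is an assumption and this is nowhere near a real pinout:

```python
# Illustrative signal-count comparison (assumed numbers, not a real pinout).
# A GDDR5 bus routes data plus address/command/clock traces across the PCB;
# HBM's much wider bus stays inside the package on the interposer, so the
# board mostly sees power and ordinary I/O.
GDDR5_DATA_BITS = 256        # data lines routed on the PCB
ADDR_CMD_CLK_OVERHEAD = 0.5  # assumed extra signals per data bit

pcb_signals_gddr5 = int(GDDR5_DATA_BITS * (1 + ADDR_CMD_CLK_OVERHEAD))
pcb_signals_hbm = 0          # 4 stacks x 1024 bits, all on the interposer

print(pcb_signals_gddr5, "PCB memory signals for GDDR5 vs",
      pcb_signals_hbm, "for HBM")
```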
So, let's assume your 32GB limit and that you're building a desktop workstation: what can you do with a single computer with 128 GB of memory that couldn't be done equally well with what could essentially be 4 or more computers, each with 32GB, in the same general form factor? I don't think the concept I'm laying out is much different from what is used in supercomputers and data centers; it's just that you may be able to fit many small, inexpensive computing units on a single board with a shared power supply/USB controller/whatever.
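As a toy illustration of the "many small nodes" idea (all names here are hypothetical), splitting a working set that won't fit in one 32GB module across several looks something like:

```python
# Toy sketch: a 128 GB working set won't fit in one 32 GB module, so shard
# it across several. Ceiling division gives the node count; round-robin
# gives the shards. Purely illustrative.
WORKING_SET_GB = 128
MODULE_RAM_GB = 32

modules_needed = -(-WORKING_SET_GB // MODULE_RAM_GB)  # ceil(128/32) = 4

def partition(items, n):
    """Deal items round-robin so each module holds ~1/n of them."""
    return [items[i::n] for i in range(n)]

shards = partition(list(range(16)), modules_needed)
print(modules_needed, "modules ->", shards)
```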
Granted, the actual engineering to make all those modules/resources play together nicely, like one big easy-to-use bucket, may be a big challenge. It might even require a dedicated, centralized processor just to organize things between all the CPUs (I know how silly that sounds). I do know AMD and DirectX 12 are already making progress toward using mixed GPU resources. We aren't at the point where this is possible yet, but AMD may be focusing on long-term success here. I at least hope the delays to their next GPU are more "we are going to get this new concept perfect" and less "this could never possibly be done, quickly come up with a plan B to ship".
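In software terms, that "central organizer" amounts to one scheduler farming independent work out to whatever units are available. A minimal single-machine stand-in (real DirectX 12 explicit multi-adapter code is far more involved; the pool here just plays the role of the APU modules):

```python
# Minimal stand-in for the "central organizer" idea: one process schedules
# independent tasks onto a pool of workers standing in for APU modules.
from concurrent.futures import ProcessPoolExecutor

def render_tile(tile_id: int) -> str:
    # Placeholder for the real work a module would do.
    return f"tile {tile_id} done"

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as pool:  # 4 hypothetical modules
        for result in pool.map(render_tile, range(8)):
            print(result)
```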