Vogner16
palladin9479 :
cdrkf :
The thing is, eventually the performance benefit of a discrete GPU over an integrated solution will be so small that it will no longer make sense to produce them.
Not for at least twenty years, if not longer. And that's assuming some sort of phenomenal never-before-seen increase in processing technology. We're talking borderline "I found it in a secret alien ship" type increase here.
Take the nVidia 980 Ti, which has approximately 8bn transistors, not counting memory. The Intel i7-6700K has somewhere around 1.75bn transistors, which includes its small onboard iGPU. The FX-8350 has around 1.6bn transistors, and the A10-7850K has 2.41bn transistors, including the iGPU.
That is the kind of power discrepancy that exists between dedicated vector processors (dGPUs) and general-purpose central processors (CPUs/APUs). There is almost an order of magnitude difference between the most powerful iGPU and the most powerful dGPU, not to mention that dGPUs get specialized ultra-wide, ultra-fast memory buses dedicated to them, while iGPUs have to share the central memory implementation.
We are decades away, at a minimum, from having "too much" vector processing power. We've barely scratched the surface of real-time ray tracing and physics, and we've only been experimenting with various implementations of 3D. Imagine what the graphics processing requirements will be once holographic displays become a consumer reality. There is just too much that you can do with a powerful vector co-processor available to the system.
For graphics, the PCIe bus presents zero issues. Programs simply upload their data sets to the dGPU's memory prior to execution, and as execution happens the program just keeps putting data into that memory before it's needed. The interesting thing about vector-style processing is that it's incredibly predictable compared to general processing, and graphics memory is so large that there is never a problem of not having your data present prior to execution time. Thus the only issue with the PCIe (or any other) bus is latency, which only matters if you're trying to use the dGPU as an integrated math co-processor instead of a dedicated graphics / physics co-processor. That is the only real advantage to having a local vector co-processor, and while it's a really good advantage, it's not one that replaces the dGPU.
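To make the "keep putting data into that memory before it's needed" part concrete, here's a rough sketch of how that looks with CUDA streams and double buffering. The chunk size, the scale() kernel, and the two-buffer layout are just illustrative assumptions, not anything from a real engine; the point is that the next chunk crosses PCIe while the current one is being processed, so the bus never sits in the critical path:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Stand-in kernel: real code would run whatever vector work the frame needs.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int chunk  = 1 << 20;                    // elements per chunk (assumption)
    const int chunks = 8;                          // number of chunks to stream
    const size_t bytes = chunk * sizeof(float);

    float *h_in;                                   // pinned host memory so the
    cudaMallocHost((void **)&h_in, bytes * chunks);// async copies really overlap
    for (int i = 0; i < chunk * chunks; ++i) h_in[i] = 1.0f;

    float *d_buf[2];                               // double buffer on the dGPU:
    cudaMalloc((void **)&d_buf[0], bytes);         // one being filled while the
    cudaMalloc((void **)&d_buf[1], bytes);         // other is being processed

    cudaStream_t copyStream, computeStream;
    cudaStreamCreate(&copyStream);
    cudaStreamCreate(&computeStream);
    cudaEvent_t uploaded[2], consumed[2];
    for (int i = 0; i < 2; ++i) {
        cudaEventCreate(&uploaded[i]);
        cudaEventCreate(&consumed[i]);
    }

    // Prime the first buffer before any compute starts.
    cudaMemcpyAsync(d_buf[0], h_in, bytes, cudaMemcpyHostToDevice, copyStream);
    cudaEventRecord(uploaded[0], copyStream);

    for (int c = 0; c < chunks; ++c) {
        const int cur = c & 1, nxt = cur ^ 1;

        // The kernel only waits until *its* chunk has landed in dGPU memory.
        cudaStreamWaitEvent(computeStream, uploaded[cur], 0);
        scale<<<(chunk + 255) / 256, 256, 0, computeStream>>>(d_buf[cur], chunk, 2.0f);
        cudaEventRecord(consumed[cur], computeStream);

        // Meanwhile the next chunk streams over PCIe into the other buffer,
        // as soon as the kernel that last used that buffer is done with it.
        if (c + 1 < chunks) {
            if (c >= 1) cudaStreamWaitEvent(copyStream, consumed[nxt], 0);
            cudaMemcpyAsync(d_buf[nxt], h_in + (size_t)(c + 1) * chunk, bytes,
                            cudaMemcpyHostToDevice, copyStream);
            cudaEventRecord(uploaded[nxt], copyStream);
        }
    }

    cudaDeviceSynchronize();                       // wait for everything to finish
    printf("streamed %d chunks through the dGPU\n", chunks);

    cudaFree(d_buf[0]); cudaFree(d_buf[1]);
    cudaFreeHost(h_in);
    return 0;
}
```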
What you're going to see isn't the iGPU replacing the dGPU, but rather the iGPU complementing it. You will have a powerful CPU with a low-to-mid-range iGPU and a powerful dGPU. The CPU handles general computing with the iGPU acting as a co-processor, while the dGPU acts as a graphics processor, or handles physics when it's tied to graphics.
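If you want to see how a program might tell the two apart at runtime, here's a rough sketch using the CUDA runtime's device properties. It's only illustrative: CUDA will only enumerate NVIDIA GPUs, so in a mixed Intel/AMD + NVIDIA box you'd do the same thing through D3D12, Vulkan, or OpenCL instead, but the idea of routing work by the "integrated vs discrete" flag is the same:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);

    int igpu = -1, dgpu = -1;
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("device %d: %s (%s, %d SMs)\n", d, prop.name,
               prop.integrated ? "integrated" : "discrete",
               prop.multiProcessorCount);
        if (prop.integrated) igpu = d;   // shares system RAM with the CPU
        else                 dgpu = d;   // has its own wide, fast memory behind PCIe
    }

    // The split described above: small latency-sensitive co-processor jobs stay
    // next to the CPU, bulk graphics / physics work goes to the big card.
    if (igpu >= 0) printf("iGPU (device %d) -> latency-sensitive co-processor work\n", igpu);
    if (dgpu >= 0) printf("dGPU (device %d) -> bulk graphics / physics work\n", dgpu);
    return 0;
}
```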
"ideal world" will software ever really get the full potential of this setup? your proposing dual graphics where the slave card "dgpu" is much more powerful than the master card like what was proposed to be supported with dx12, but what is gained by this is minimal compared to what is proposed to be gained by this because software is to far behind. the ideal solution is not a dual setup of anything but one single chip with transistors to do every calculation within a near 0 latency distance of each other.
To propose that an APU could fit those 18 billion transistors and meet high-end spec would require a physically massive die, like those seen on IBM POWER CPUs. This physical implementation of the ideal APU would be expensive to make, while also requiring new motherboards and coolers, and educating the public on how to build with this radical all-in-one APU: no RAM slots, proper cooler selection, how to handle mounting heights over the HBM and the die, etc... BUT in theory it could be as fast as the top-end separate CPU + dGPU, or your proposed "slave and master GPU" system, and cost less in the long run to make.
The key component here is "cost less to make". If AMD does a study and finds that a 4000-core APU monster is faster than a 3800-core dGPU + CPU setup, but costs as much as a 5000-core dGPU + CPU, then they will lose money building it and won't waste their time, despite how cool it would be to see that tech become available.
I think it's just too expensive to build at 16nm, 10nm, and maybe even 7nm.