Integrated graphics has really hit the point of diminishing returns. You need a lot more memory bandwidth to feed more shader cores, and with mainstream desktop platforms all on dual-channel DDR4, the best you can reasonably hope for is official DDR4-3200 support. That's 51.2 GBps of theoretical bandwidth, shared between the GPU and the CPU. For comparison, even a relatively low-end dedicated GPU like the RX 560 has more than double that bandwidth (112 GBps), all of it reserved for the GPU.
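If you want to sanity check those numbers, the math is just effective transfer rate times bus width times channel count. Here's a quick sketch in Python (the 128-bit bus and 7 Gbps GDDR5 figures are the RX 560's reference specs):

```python
# Theoretical peak bandwidth = transfer rate (MT/s) * bus width (bytes) * channels
def bandwidth_gbps(transfer_rate_mtps: float, bus_width_bits: int, channels: int = 1) -> float:
    """Peak memory bandwidth in GBps."""
    return transfer_rate_mtps * 1e6 * (bus_width_bits / 8) * channels / 1e9

# Dual-channel DDR4-3200: two 64-bit channels at 3200 MT/s, shared by CPU and GPU.
print(f"DDR4-3200 x2: {bandwidth_gbps(3200, 64, channels=2):.1f} GBps")  # 51.2

# RX 560: a single 128-bit bus of 7 Gbps (7000 MT/s effective) GDDR5, all for the GPU.
print(f"RX 560 GDDR5: {bandwidth_gbps(7000, 128):.1f} GBps")             # 112.0
```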
So on paper, the RX 560 doesn't just have more than twice the bandwidth; its 896 shader cores (14 CUs) at 1175 MHz work out to 2.1 TFLOPS of compute, while the 3400G's Vega 11 packs 11 CUs at 1400 MHz, or 1.97 TFLOPS. Compute is nearly identical, yet gaming performance on the RX 560 is almost double that of the Vega 11, so the majority of the performance difference is thanks to memory bandwidth.
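Those TFLOPS figures fall out of the standard GCN formula: CUs times 64 shaders per CU times 2 FLOPs per clock (fused multiply-add) times clock speed. A minimal sketch, assuming the 14 CU RX 560 variant that the 1175 MHz and 2.1 TFLOPS figures imply:

```python
# GCN theoretical compute: 64 shaders per CU, 2 FLOPs per clock (FMA).
def gcn_tflops(cus: int, clock_mhz: float, shaders_per_cu: int = 64) -> float:
    """Theoretical single-precision TFLOPS for a GCN-style GPU."""
    return cus * shaders_per_cu * 2 * clock_mhz * 1e6 / 1e12

print(f"RX 560 (14 CUs @ 1175 MHz):  {gcn_tflops(14, 1175):.2f} TFLOPS")  # ~2.11
print(f"Vega 11 (11 CUs @ 1400 MHz): {gcn_tflops(11, 1400):.2f} TFLOPS")  # ~1.97
```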
How do you improve integrated graphics, then? You need more cores and more bandwidth. The first is easier than the second. Could you do an HBM2 stack? Sure -- Intel teamed up with AMD for exactly that with Kaby Lake-G and its Radeon RX Vega M graphics. Performance was relatively close to an RX 570 4GB ... and the processor cost more than twice as much as a 3400G.
The next-gen consoles pair massive GPUs with CPU cores on a single chip, plus 16GB of GDDR6 memory. But that's a special case: console hardware has a fixed spec and sells tens of millions of units, so economies of scale come into play. No CPU or GPU upgrades, no platform variation, etc. Meanwhile, no company has even attempted a PC "APU" with anything close to console levels of GPU performance and memory bandwidth.