>Who is this even for?
People who want to localhost LLMs on an iGPU + system RAM. It's slow, but it's at least doable for laptops without a dGPU. Laptops w/ upgradeable SO-DIMMs can go up to 64GB, so 64 * .87 ≈ 55GB, enough to run a 30B model. The laptop needs to be AMD Strix Point or later, or Intel Core Ultra Series 2 or later.
With 55GB of VRAM you can run, say, Qwen3-Coder-30B at 8-bit precision with room to spare. Again, slow, but doable. That's more than what a 5090 can do. (The mobile 5090 has only 24GB of VRAM; even the desktop card tops out at 32GB.)
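If you want to sanity-check the sizing math, here's a rough back-of-envelope sketch in plain Python. The 0.87 ratio comes from the figure above; the 30.5B parameter count is an assumption for a "30B-class" model, and KV cache plus runtime overhead are ignored, so treat the numbers as a lower bound on real memory use.

```python
# Rough sizing check: does a ~30B-parameter model fit in shared iGPU memory?
# Weight memory only; KV cache and runtime overhead are ignored.

def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB: 1B params at 1 byte/param ~= 1 GB."""
    return params_billions * bytes_per_param

total_ram_gb = 64
usable_gb = total_ram_gb * 0.87  # ~55.7 GB allocatable to the iGPU

for label, bpp in [("FP16/BF16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = weights_gb(30.5, bpp)
    verdict = "fits" if size < usable_gb else "does not fit"
    print(f"{label:9}: ~{size:5.1f} GB -> {verdict} in {usable_gb:.1f} GB")
```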
It's less doable on current desktops, as they have a vestigial iGPU and/or NPU (if any). This will change as desktop CPUs incorporate more powerful NPUs.
Speaking of NPUs, LLMs can be configured to run on the NPU exclusively, or on NPU+iGPU (AMD Lemonade), although NPU support is still very much a work in progress. I think this is where hosting LLMs on general consumer PCs/laptops is headed: the NPU taking point rather than the iGPU.
View: https://www.youtube.com/watch?v=mcf7dDybUco
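Once something like Lemonade Server (or llama.cpp's server) is running locally, talking to it looks the same as talking to any hosted API, since these servers expose an OpenAI-compatible endpoint. A minimal sketch below; the port, URL path, and model name are assumptions, so check what your server actually reports.

```python
# Minimal sketch of querying a locally hosted model via an OpenAI-compatible
# endpoint (Lemonade Server, llama.cpp's server, etc. speak this dialect).
# base_url, port, and model name are assumptions -- adjust to your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/api/v1",  # assumed local server address
    api_key="not-needed",                     # local servers generally ignore the key
)

resp = client.chat.completions.create(
    model="Qwen3-Coder-30B",  # placeholder; use the model your server has loaded
    messages=[{"role": "user", "content": "Summarize what an NPU is in one sentence."}],
)
print(resp.choices[0].message.content)
```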
>Windows already allows iGPUs to use up to 50% of your system RAM by default without fussing with BIOS settings
The change is that you can now allocate up to 87% of system RAM (depending on total RAM size), and that you can do it from within Windows rather than needing to reboot into the BIOS settings. You still need to reboot for the change to take effect, so one reboot instead of two.
>No amount of shared RAM is going to make running a large LLM on an iGPU worthwhile
LLMs come in various sizes. 30B models are competent enough to be useful for specific tasks, and even smaller models can be useful. Running LLMs on an iGPU isn't practical for actual productivity, but it's good enough to tinker with and to learn about LLMs without having to spend extra money. To be sure, local LLMs are still very much in the enthusiast sphere.
>and games are held back by the iGPU and RAM speed, not the available shared RAM.
It's not about gaming.