edit: I'm betting the real reason is the use of the SoC tile across desktop and mobile (this is just a guess based on the fact that LPE cores are listed for NVL-S SKUs)
That sounds very reasonable: a minimum number of common parts to create a big range is how AMD got ahead.
The problem is one that AMD already has, which is why their chipsets don't offer the connectivity Intel's do and why high-speed SSDs are more problematic there. It's not really a good idea to have connectivity that allows a single device to completely saturate the DMI link.
When AMD shifted to PCIe 4.0 x4, Intel matched the bandwidth with PCIe 3.0 x8, but since the platform was still PCIe 3.0, a single SSD couldn't saturate the link. Intel carried that forward when moving to a PCIe 4.0 chipset platform, but now they're cutting the link in half when moving to PCIe 5.0. It's hard to see that as anything other than moving backwards. The only fortunate part is that it's unlikely to cause problems very often.
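Just to put numbers on that, a back-of-the-envelope sketch (the per-lane figures are the usual approximations after encoding overhead, nothing official):

```python
# Approximate usable bandwidth per PCIe lane in GB/s, after 128b/130b encoding overhead
LANE_GBPS = {3: 0.985, 4: 1.969, 5: 3.938}

def link_bw(gen, lanes):
    """Rough usable bandwidth of a PCIe link in GB/s."""
    return LANE_GBPS[gen] * lanes

print(f"Gen3 x8 uplink: {link_bw(3, 8):.1f} GB/s")  # ~7.9 GB/s
print(f"Gen4 x8 uplink: {link_bw(4, 8):.1f} GB/s")  # ~15.8 GB/s
print(f"Gen5 x4 uplink: {link_bw(5, 4):.1f} GB/s")  # same ~15.8 GB/s, but half the lanes
print(f"Gen5 x4 SSD:    {link_bw(5, 4):.1f} GB/s")  # a single drive can now match the uplink
```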
Starvation is never a good thing. But even the original PCI bus had learned that lesson from the ISA days and made sure that hardware arbitration prohibited true monopolisation: you only got a limited number of cycles as bus master before another arbitration was forced, with round-robin allocation.
AFAIK PCIe inherited that logic and should be just as resilient here, and the ability to oversubscribe is the raison d'être of switch chips: the ratio is critical, though, and perhaps Intel is overdoing it a bit with NVL. Still, it could match your use case.
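To illustrate why the ratio matters, a toy oversubscription calculation; the downstream lane mix below is a placeholder, not a confirmed Nova Lake chipset configuration:

```python
# Toy oversubscription ratio; the downstream lane mix is an assumption,
# not a confirmed Nova Lake chipset configuration.
LANE_GBPS = {3: 0.985, 4: 1.969, 5: 3.938}

uplink = 4 * LANE_GBPS[5]        # Gen5 x4 uplink, ~15.8 GB/s
downstream = 24 * LANE_GBPS[4]   # e.g. 24 Gen4 lanes hanging off the chipset, ~47.3 GB/s

print(f"Oversubscription ratio: {downstream / uplink:.1f}:1")  # ~3:1 in this example
```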
Intel would say that they offer iso-bandwidth, so it's not a full regression, and that there is too little incentive and too much cost to justify doubling that bandwidth.
Where I can see a regression is when the lowest common denominator of lane speed and count is always chosen, so that a PCIe v3 x8 peripheral which would get near 8 GByte/s of bandwidth on an older chipset now only gets four v3 lanes on a bus that's capable of v5 speeds: PCIe devices and switches contain buffers, so they should be able to translate, but I don't see that happening and don't know if it's just 'lazy' configuration or whether I misunderstand PCIe's capabilities.
If Intel's new chipset really operated by matching bandwidths with variable lanes, that would be a strong selling point, e.g. allowing a v3 x16 GPU to operate at full speed even with the v5 x4 uplink.
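Roughly what I have in mind, as a sketch of the arithmetic only (not a description of how any shipping chipset actually negotiates its links):

```python
import math

# Approximate usable bandwidth per PCIe lane in GB/s
LANE_GBPS = {3: 0.985, 4: 1.969, 5: 3.938}

def lanes_needed(uplink_gen, uplink_lanes, device_gen):
    """How many lanes at the device's generation would carry the same
    bandwidth as the uplink (illustration only)."""
    uplink_bw = LANE_GBPS[uplink_gen] * uplink_lanes
    return math.ceil(uplink_bw / LANE_GBPS[device_gen])

# A v5 x4 uplink carries as much as a v3 x16 device link:
print(lanes_needed(5, 4, 3))            # 16 -> a v3 x16 GPU could in theory run at full speed
# ...but a v3 x8 card wired to only four physical lanes tops out at:
print(f"{4 * LANE_GBPS[3]:.1f} GB/s")   # ~3.9 GB/s instead of ~7.9 GB/s
```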
For the longest time the bandwidth increases from modern SSDs outstripped anything applications and even game designers imagined; it's only now that storage is increasingly seen as a direct data delivery agent for the GPU, which needs to conform to quality-of-service parameters in terms of latency and bandwidth to avoid game stutter.
But I see that mostly being driven by consoles, and thus a generation or so behind what the PC leading edge can provide even when somebody else is also using one of those big fat ports.
We'll see, I guess, but so far I see the risk that NVL will be far too expensive for a long time for me to even worry about it.
When I can get 16 Zen 4 cores (Ryzen 7945HX) with 24 PCIe v5 lanes (no chipset) including a mainboard for €450 from Minisforum, I'm simply not looking at Intel's Nova Lake i9 at probably twice that or more.
I only count P-cores and consider E-cores a crutch for inferior power management, which makes the biggest Nova Lake yet another 16-core in my eyes (yes, I know that's not quite true, but since I run Proxmox on many systems, E-cores are just a complication).
The first thing I did was add a bifurcation adapter to split the x16 slot into x8+x4+x4; that was perhaps €20 to recycle a 10Gbase-T NIC, add 6x SATA and yet another NVMe drive.
There is no oversubscription in that build, because it's mostly a µ-server without a dGPU. But it's also the reason I'd rather do away with x16 slots and have everything use a variable count of small x4 connectors, potentially with a switch for fan-out ports.
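For reference, a rough lane/bandwidth budget for a split like that; the card generations below are assumptions for the sake of illustration, and the figures are the usual approximations:

```python
# Rough lane/bandwidth budget for an x8+x4+x4 bifurcation build;
# card generations are illustrative assumptions, figures approximate.
LANE_GBPS = {3: 0.985, 4: 1.969, 5: 3.938}

slots = {
    "10Gbase-T NIC  (Gen3 x8)": (3, 8),
    "6x SATA HBA    (Gen3 x4)": (3, 4),
    "extra NVMe SSD (Gen4 x4)": (4, 4),
}

total = 0.0
for name, (gen, lanes) in slots.items():
    bw = LANE_GBPS[gen] * lanes
    total += bw
    print(f"{name}: ~{bw:.1f} GB/s")

print(f"Aggregate: ~{total:.1f} GB/s over 16 CPU lanes, no chipset in the path")
```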