Even if the SSD is connected to the chipset, the data would need to travel the SSD-chipset-CPU-GPU path either way, since the GPU is still attached to the CPU. The only case where you'd spare a transfer is if the GPU and SSD are both connected to the chipset.

True. If they're both direct-connected to the CPU, then there are no unnecessary trips over PCIe.
When stuff is cached in the file system cache, the OS receives a file IO, looks up its table of cached data and, if there is a hit, returns the data directly. There is no need to access the SSD, and the IO completes in little more than the 1-2 µs it takes to switch from user-land to kernel-land and back, especially if the OS can just DMA the data directly to the GPU from cache instead of copying it to user-land first.

If the SSD is fast enough to keep up with the game engine, then I think you don't really need to cache the assets in system RAM.
When an IO has to go all the way back to the SSD, the OS has to traverse the file system to locate the LBAs containing the data, then issue however many IO operations are required to gather all of the bits, which can take tens of microseconds.
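To put rough numbers on the hit-versus-miss difference, here is a back-of-the-envelope sketch. The 2 µs hit cost comes from the figure above; the 50 µs miss cost is only an assumed stand-in for "tens of microseconds", and the function name is made up for illustration.

```python
def effective_latency_us(hit_rate, hit_us=2.0, miss_us=50.0):
    """Average per-IO latency for a given cache hit rate.

    hit_us and miss_us are illustrative assumptions, not measurements.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average of the fast (cached) and slow (SSD) paths.
    return hit_rate * hit_us + (1.0 - hit_rate) * miss_us


for rate in (0.0, 0.5, 0.9, 1.0):
    print(f"hit rate {rate:.0%}: {effective_latency_us(rate):.1f} us per IO")
```

With these assumed figures, the average is dominated by the misses until the hit rate gets very high, which is the whole argument for keeping hot assets in RAM.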
Using system memory for caching also means that, as long as you have sufficient RAM, you should still be able to run DirectStorage games from an HDD. The game just has to be written to gracefully handle assets taking too long to load the first time through, while the cache is being populated: either with stutters while it waits for assets, or with placeholders that will 'pop' when the data arrives.
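The placeholder approach could look something like the toy sketch below. This is not how DirectStorage or any real engine does it; `AssetCache`, `PLACEHOLDER` and `service_one_read` are hypothetical names, and a plain dict stands in for the slow drive.

```python
from collections import deque

PLACEHOLDER = "<placeholder>"  # hypothetical stand-in asset


class AssetCache:
    """Toy RAM cache: serve hits immediately, return a placeholder on a miss."""

    def __init__(self, slow_storage):
        self.slow_storage = slow_storage  # dict standing in for the HDD
        self.cache = {}                   # assets already resident in RAM
        self.pending = deque()            # misses waiting on the slow read

    def get(self, name):
        if name in self.cache:
            return self.cache[name]       # warm hit, no drive access
        if name not in self.pending:
            self.pending.append(name)     # queue the slow read only once
        return PLACEHOLDER                # draw a stand-in for now

    def service_one_read(self):
        """Simulate one slow read completing and landing in the cache."""
        if self.pending:
            name = self.pending.popleft()
            self.cache[name] = self.slow_storage[name]


hdd = {"rock_texture": "rock pixels"}
assets = AssetCache(hdd)
first = assets.get("rock_texture")   # cold cache: placeholder
assets.service_one_read()            # the slow read finishes later
second = assets.get("rock_texture")  # warm cache: the real asset "pops" in
print(first, "->", second)
```

The stutter alternative would be the same thing with `get` blocking until the read completes instead of returning the placeholder.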