In-game reloads are much more common, and IMO their performance matters far more than the speedup of a cold load when you first start a game. At that point you're already playing, immersed and invested, not just getting set up. I wonder why nobody benchmarks these times? They also seem much more variable between runs. I bet most of what needs to be in the GPU's memory is already resident at that point, and only bits and pieces have to be added.
What you're referring to would be very hard to test for. That said, it's exactly what the direct-access APIs mentioned are meant to help with: transferring data directly from storage to RAM or the video card, cutting out the round trip through the CPU.
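For a concrete picture, here's a minimal sketch of what that looks like with Microsoft's DirectStorage API (assuming that's one of the APIs in question). The `device`, `gpuBuffer`, and `fence` setup are elided, `asset.bin` is a hypothetical file, and error handling is omitted:

```cpp
// Minimal sketch (not production code): enqueue one read that streams a file
// straight from NVMe storage into a GPU buffer, skipping the usual
// read-into-system-RAM-then-upload path. Assumes a Windows/D3D12 app that
// has already created `device`, `gpuBuffer`, and `fence`.
#include <dstorage.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void LoadAssetDirect(ID3D12Device* device, ID3D12Resource* gpuBuffer,
                     ID3D12Fence* fence, UINT64 fenceValue, UINT32 sizeBytes)
{
    ComPtr<IDStorageFactory> factory;
    DStorageGetFactory(IID_PPV_ARGS(&factory));

    ComPtr<IDStorageFile> file;
    factory->OpenFile(L"asset.bin", IID_PPV_ARGS(&file)); // hypothetical asset file

    // A queue of file-sourced requests targeting this D3D12 device.
    DSTORAGE_QUEUE_DESC queueDesc{};
    queueDesc.Capacity   = DSTORAGE_MAX_QUEUE_CAPACITY;
    queueDesc.Priority   = DSTORAGE_PRIORITY_NORMAL;
    queueDesc.SourceType = DSTORAGE_REQUEST_SOURCE_FILE;
    queueDesc.Device     = device;

    ComPtr<IDStorageQueue> queue;
    factory->CreateQueue(&queueDesc, IID_PPV_ARGS(&queue));

    // Describe the transfer: file bytes -> GPU buffer, no CPU-side copy loop.
    DSTORAGE_REQUEST request{};
    request.Options.SourceType          = DSTORAGE_REQUEST_SOURCE_FILE;
    request.Options.DestinationType     = DSTORAGE_REQUEST_DESTINATION_BUFFER;
    request.Source.File.Source          = file.Get();
    request.Source.File.Offset          = 0;
    request.Source.File.Size            = sizeBytes;
    request.Destination.Buffer.Resource = gpuBuffer;
    request.Destination.Buffer.Offset   = 0;
    request.Destination.Buffer.Size     = sizeBytes;

    queue->EnqueueRequest(&request);
    queue->EnqueueSignal(fence, fenceValue); // wait on the fence to know the data landed
    queue->Submit();
}
```

The whole point of the design is that the app only describes the transfer; the DMA engines (and, in later versions, GPU decompression) do the work the CPU used to do byte by byte.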
The trouble with testing it is that it would really be artificial, since natural resource contention has a lot of variables: the speed and memory of the video card, the CPU, its caches, system RAM, and the storage device. Too many variables to really control.
The perceived performance differences are also going to vary a lot. The drop from a 3 s scene load to under a second is more noticeable than 900 ms to 300 ms. Even if the ratio is the same, the smaller absolute differences are harder to notice, much like going from a 60 Hz display to 120 Hz versus 120 Hz to 240 Hz. It's diminishing returns.
For me, the difference is build times for large projects. My R9 5950X sometimes gets outpaced by my M1 Max MacBook, and I'm pretty sure it's the improved storage access speed.
I don't feel a need for much more CPU, aside from video encodes. But if I could get back half my build time over the course of a day, I'd be pretty happy.