Both of these demos raise some interesting questions. Intel's T-Rex demo notes that the texture pass time (on an RTX 5090, presumably) increases from 0.045 ms to 0.111 ms, but not how much VRAM was being used. Nvidia's demo, meanwhile, lists a compressed texture size that drops from 272MB to 98MB with BTC, and all the way down to 11.37MB with NTC... but then we don't get a pass time.
So what happens if a game uses, say, 2GB of NTC-compressed textures? In terms of VRAM, that should fit just fine on 8GB cards like the RTX 5060 Ti 8GB and RTX 5060, and potentially AMD's 9060 XT 8GB as well. But the T-Rex demo needs 0.111 ms for a paltry amount of texture data (if it's anything like the Flight Helmet demo, likely less than 50MB once compressed), and that's on an RTX 5090. What happens when we shift to 2GB of textures on an RTX 5060?
We can guess. The RTX 5090 offers roughly 5.4X the AI compute of the RTX 5060, so the same T-Rex demo that takes 0.111 ms on the 5090 might need around 0.60 ms on the 5060. And if we guesstimate that a full game uses 40 times as much texture data as these simplistic demos, we're now talking about potentially spending 24 ms just on the texturing pass.
If you can pipeline things so that the whole engine doesn't stall while waiting for texture decompression, that would still mean at best 40-ish FPS. Drop the resolution to 1440p or even 1080p and we could potentially double that performance. But again, these are just rough estimates.
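To make that back-of-envelope math easier to follow, here's a minimal sketch of the arithmetic in Python. The 0.111 ms baseline, 5.4X compute ratio, 40X texture multiplier, and resolution doubling are the same rough guesses used above, nothing more.

```python
# Rough, back-of-the-envelope math only; every input below is a guess
# pulled from the paragraphs above, not a measured result.

trex_pass_ms_5090 = 0.111    # NTC texture pass time from the T-Rex demo (RTX 5090)
ai_compute_ratio = 5.4       # RTX 5090 AI compute relative to the RTX 5060
game_texture_scale = 40      # guess: a full game uses 40X the demo's texture data

# Slower GPU: scale the pass time up by the compute deficit.
pass_ms_5060 = trex_pass_ms_5090 * ai_compute_ratio       # ~0.60 ms

# Full game: scale again for 40X the texture data.
game_pass_ms_5060 = pass_ms_5060 * game_texture_scale     # ~24 ms

# Even with perfect pipelining, that decompression time caps the frame rate.
best_case_fps = 1000 / game_pass_ms_5060                  # ~42 FPS
lower_res_fps = best_case_fps * 2                         # rough doubling at a lower resolution

print(f"Estimated texture pass on RTX 5060 (demo): {pass_ms_5060:.2f} ms")
print(f"Estimated texture pass on RTX 5060 (full game): {game_pass_ms_5060:.1f} ms")
print(f"Best-case FPS cap: {best_case_fps:.0f}")
print(f"Rough FPS cap at lower resolution: {lower_res_fps:.0f}")
```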
I suspect there's a good reason we haven't seen any of this tech in a shipping game yet. It will take a lot of work to create the assets, both the uncompressed and NTC variants, and games will still need to work on GPUs without NTC support. In that sense, it's the same story as ray tracing yet again. Game publishers and developers are waiting for the proverbial chicken to arrive before they start building eggs into their games.