News Nvidia's new tech reduces VRAM usage by up to 96% in beta demo — RTX Neural Texture Compression looks impressive

Hm... This feels like we've gone full circle.

Originally, if memory serves me right, the whole point of textures was to lessen the processing load by adding these "wrappers" to the wireframes. You could always build a mesh and embed the colouring information in it (via shader code or whatever) and let the GPU "paint" it however you want; you can still do that today, but it's hardly something anyone wants to do. Textures were a shortcut around all that extra processing overhead, but it looks like Nvidia wants to go back to the original concept in graphics via some weird "acceleration"?

Oh well, let's see where this all goes.

Regards.
 
So Nvidia is creating a tech that lets them build an RTX 5090 with 4GB of VRAM.

They also created a tech that makes DLSS useless.

And I'm really curious to see how it would run on an RTX 4060, or even lower. And on non-RTX cards.

That drop in 1% lows performance is a bit troubling.

All that said, I really like the tech. A 96% reduction in memory usage with a 10-20% performance hit is actually very good. Meshes, objects and effects will be traded for extremely high resolution textures, or even painted textures instead of tiled images.

Too many questions remain, but I like this.
 
Hm... This feels like we've gone full circle.

Originally, if memory serves me right, the whole point of textures was to lessen the processing load by adding these "wrappers" to the wireframes. You could always build a mesh and embed the colouring information in it (via shader code or whatever) and let the GPU "paint" it however you want; you can still do that today, but it's hardly something anyone wants to do. Textures were a shortcut around all that extra processing overhead, but it looks like Nvidia wants to go back to the original concept in graphics via some weird "acceleration"?

Oh well, let's see where this all goes.

Regards.
Not really. It is just compressing the textures in a different, more efficient way, which does mean they will need to be decompressed when they are used.
 
I like how the performance impact was measured at a ridiculous framerate.
Everything a GPU does takes time. Apparently neural decompression takes less time than the frametime you would have at 800-something fps. That isn't very much time, and I imagine my low-VRAM 3080 could decompress a lot of textures without a noticeable impact at 60 fps.
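A rough back-of-the-envelope to put that in milliseconds. The framerates below are assumptions taken from the demo discussion (roughly 800 fps, and the 10-20% hit mentioned above), not measured numbers:

```python
# Back-of-the-envelope only; the framerates are assumptions from the demo discussion.
fps_without_ntc = 800            # assumed "ridiculous" demo framerate
fps_with_ntc = 800 * 0.85        # assume a ~15% hit, midpoint of the quoted 10-20%

cost_ms = 1000 / fps_with_ntc - 1000 / fps_without_ntc
print(f"extra decode cost: ~{cost_ms:.2f} ms per frame")   # ~0.22 ms
print(f"frame budget at 60 fps: {1000 / 60:.2f} ms")       # ~16.67 ms
# A fraction of a millisecond is a small slice of a 16.7 ms budget, though the real
# cost scales with how many textured pixels are shaded, not with the framerate itself.
```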
 
No matter what Nvidia does, it is always the same. Frame generation? Fake frames. Lossless scaling? AFMF? So amazing and game-changing.

I think it's become apparent that Nvidia owners spend most of their time gaming, while AMD owners care more about sharper text and comment fluidity for their social-media negativity. No, a new AMD GPU will not load the comment section faster.
 
I'm a little unclear on whether there's any sort of overhead not accounted for, in their stats. Like whether there's any sort of model that's not directly represented as the compressed texture, itself. In their paper, they say they don't use a "pre-trained global encoder", which I take to mean that each compressed texture includes the model you use to decompress it.

However, what I can say is fishy...

Nvidia be like: mp3 music, lossless flac music, it's music just the same! 🤓
Yeah, the benchmark is weird in that it seems to compare against uncompressed textures. AFAIK, that's not what games actually do.

Games typically use texture compression with something like 3 to 5 bits per pixel. Compared to a 24-bit uncompressed baseline, that's a ratio of roughly 5x to 8x.
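For reference, the quick arithmetic behind those ratios (assuming a plain 24-bit RGB baseline):

```python
# Rough ratios for conventional block compression vs. a 24-bit RGB baseline.
baseline_bpp = 24
for bpp in (3, 4, 5):
    print(f"{bpp} bits/pixel -> {baseline_bpp / bpp:.1f}x smaller than uncompressed")
# 3 bpp -> 8.0x, 4 bpp -> 6.0x, 5 bpp -> 4.8x
```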

So, this benchmark is really over-hyping the technology. Then again, this is Nvidia and they love to tout big numbers. What else is new?

Edit: I noticed what Nvidia is telling developers about it:

"Use AI to compress textures with up to 8x VRAM improvement at similar visual fidelity to traditional block compression at runtime."

Source: https://developer.nvidia.com/blog/get-started-with-neural-rendering-using-nvidia-rtx-kit/
 
I like how the performance impact was measured at a ridiculous framerate.
Everything a GPU does takes time. Apparently neural decompression takes less time than the frametime you would have at 800-something fps. That isn't very much time
Did you happen to notice how small the object is on screen? The cost of all texture lookups will be proportional to the number of textured pixels drawn per frame. A better test would've been a full-frame scene with lots of objects, some transparency, and edge AA.
 
Yeah, I have a feeling this is going to be one of those cases where "up to 96% reduction in memory usage!" mostly applies to 3D modelling and other such programs, while the typical reduction in games is far lower, and by far not a reason to put less than 16 GB of VRAM on any GPU in 2025, not given the prices of even the entry-level models.
 
Didn't AMD have neural compressed textures before Nvidia even thought of this? They had their own paper about it and everything.
 
Isn't the whole point of using compression to lower VRAM usage to, you know, make the game run better, or to make it look better?

So it's a feature that lowers frame rate and maybe visual fidelity to free up memory... I can then use that empty memory to do what, exactly?
 
This seems like another tech, like framegen, that works well and adds to your experience when you don’t really need it.

A 4090 isn’t short of VRAM and rarely needs DLSS to give playable 4K. The lesser GPUs that are, and do, look like they won’t have the processing power to do both DLSS and neural texture compression, so you still won’t be better off (but at least you can choose between having too little VRAM and not being able to get a decent frame rate at 4K).
 
Originally, if memory serves me right, the whole point of textures was to lessen the processing load by adding these "wrappers" to the wireframes. You could always build a mesh and embed the colouring information in it (via shader code or whatever) and let the GPU "paint" it however you want; you can still do that today, but it's hardly something anyone wants to do. Textures were a shortcut around all that extra processing overhead,
I guess you're talking about like tessellating the mesh down to pixel level (or texel, if we're being more pedantic), and drawing them as flat-shaded triangles? Yes, you could do that but it's not like people were doing that before texture mapping, because pre-texture mapping hardware had nowhere near the geometry processing horsepower or memory to hold such detailed meshes.

Anyway, this neural texture compression seems to work pretty much like other texture compression formats that came before it. They make it clear that it has no assumptions or constraints on the number of channels or their semantics, which means you can use it for opacity maps, bump maps, albedo, BRDF (Bidirectional Reflectance Distribution Function) parameters, etc.

At a conceptual level, you can think of the network used for compression and decompression a little bit like how JPEG uses DCT (Discrete Cosine Transform) + coefficient quantization. One big difference is that they train a separate encoder for each texture. I wonder if you can have multiple textures share the same encoder model and how that might benefit size and rendering performance.
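To make that per-texture idea concrete, here's a minimal sketch, not Nvidia's actual architecture, and it skips the aggressive latent quantization the paper describes: a small latent grid plus a tiny MLP decoder are overfit to one texture, and together they act as the "compressed" representation. All names, sizes, and hyperparameters here are made up for illustration.

```python
# Conceptual sketch only -- NOT Nvidia's actual NTC architecture.
# Each texture gets its own small latent grid plus a tiny MLP decoder,
# trained to reproduce that texture's channels (albedo, opacity, roughness, ...).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerTextureCodec(nn.Module):
    def __init__(self, latent_res=64, latent_ch=8, out_ch=4, hidden=32):
        super().__init__()
        # The "compressed texture" is this low-resolution latent grid
        # plus the (small) decoder weights below.
        self.latents = nn.Parameter(torch.randn(1, latent_ch, latent_res, latent_res) * 0.1)
        self.decoder = nn.Sequential(
            nn.Linear(latent_ch, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_ch),   # channel semantics are up to the texture
        )

    def forward(self, uv):                       # uv: (N, 2) in [0, 1]
        grid = uv.view(1, -1, 1, 2) * 2 - 1      # map to [-1, 1] for grid_sample
        feats = F.grid_sample(self.latents, grid, align_corners=True)  # (1, C, N, 1)
        feats = feats.view(self.latents.shape[1], -1).t()              # (N, latent_ch)
        return self.decoder(feats)

def compress(texture):                           # texture: (H, W, C) float tensor
    h, w, c = texture.shape
    codec = PerTextureCodec(out_ch=c)
    opt = torch.optim.Adam(codec.parameters(), lr=1e-2)
    ys, xs = torch.meshgrid(torch.linspace(0, 1, h), torch.linspace(0, 1, w), indexing="ij")
    uv = torch.stack([xs.flatten(), ys.flatten()], dim=-1)
    target = texture.reshape(-1, c)
    for _ in range(500):                         # deliberately overfit this one texture
        opt.zero_grad()
        loss = F.mse_loss(codec(uv), target)
        loss.backward()
        opt.step()
    return codec                                 # "decompression" = codec(uv) at any uv
```

In this sketch, most of the size is the latent grid plus a small fixed cost for the decoder weights; that fixed cost is the part that sharing one decoder across several textures, as wondered above, would amortize.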

Anyway, here's the paper. I just read a few bits and pieces. It touches on some interesting concepts in neural rendering, as well. I plan to spend a bit more time with it, later.

Supplementary info (including additional sample images):
 
Here is a sample compression ratio highlighted on the RTXNC github page:

| Bundle Compression | Disk Size | PCI-E Traffic | VRAM Size |
|---|---|---|---|
| Raw Image | 32.00 MB | 32.00 MB | 32.00 MB |
| BCn Compressed | 10.00 MB | 10.00 MB | 10.00 MB |
| NTC-on-Load* | 1.52 MB | 1.52 MB | 10.00 MB |
| NTC-on-Sample | 1.52 MB | 1.52 MB | 1.52 MB |

So, 6.58x better compression than the BCn (block compression) they used for comparison. I have no idea how realistic that is, but it's certainly more realistic than the numbers highlighted in this article.
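Spelling out the ratios implied by that table:

```python
# Sizes in MB, taken from the table above (RTXNC GitHub page).
raw, bcn, ntc = 32.00, 10.00, 1.52
print(f"BCn vs. raw: {raw / bcn:.1f}x")   # 3.2x
print(f"NTC vs. BCn: {bcn / ntc:.2f}x")   # ~6.58x, the apples-to-apples comparison
print(f"NTC vs. raw: {raw / ntc:.1f}x")   # ~21x, closer to the headline-style numbers
```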
 
BTW, I'm pretty sure Cooperative Vector just means that it's using packed SIMD arithmetic. What makes it "cooperative" is that it involves cross-lane synchronization, which means it interferes with their conceptual model of SIMD lanes as threads. The github page I mentioned above links to the details, if you want to dig into them.
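As a conceptual toy only (the real Cooperative Vector feature is a shader-level extension, not anything you would call from Python), the gist is that every lane in a wave shares the same small weight matrix, so the hardware can evaluate the whole wave as one packed multiply instead of per-lane scalar math:

```python
# Toy illustration of "per-lane" vs. "packed/cooperative" evaluation; not a real API.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8)).astype(np.float32)        # decoder weights shared by all lanes
latents = rng.standard_normal((32, 8)).astype(np.float32)  # one latent vector per pixel in a 32-wide wave

# Per-lane view: each "thread" does its own matrix-vector product.
per_lane = np.stack([W @ latents[i] for i in range(32)])

# Cooperative view: the whole wave is packed into one matrix-matrix product,
# which is roughly what tensor-core-backed inference does under the hood.
packed = latents @ W.T

assert np.allclose(per_lane, packed, atol=1e-5)
```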

The github page also mentions:

"On Ada- and Blackwell-class GPUs this provides a 2-4x improvement in inference throughput over competing optimal implementations that do not utilize these new extensions."

I'd guess games won't even use neural texture compression on hardware older than that. So, they'll probably just always use the "cooperative vector" extension in the neural texture compression code path.
 
Hey, I just now had this awesome, great idea that they didn't realize... what about just giving players more friggin VRAM in the GPUs???
Are you willing to pay even more for more high-speed VRAM? It's not reasonable. Instead of increasing the amount of VRAM, it would be much more logical to change the compression algorithms for game textures. That doesn't mean that neural calculations, for example, will require less VRAM, but it will be critical for games.
 
Are you willing to pay even more for more high-speed VRAM? It's not reasonable. Instead of increasing the amount of VRAM, it would be much more logical to change the compression algorithms for game textures. That doesn't mean that neural calculations, for example, will require less VRAM, but it will be critical for games.
Also, if you can change the algorithm in a way that works on older cards, you benefit everyone.

Plus, I’ve never actually seen the “not enough VRAM” argument proven.
 