News: VRAM-friendly neural texture compression inches closer to reality - enthusiast shows massive compression benefits with Nvidia and Intel demos

I hadn't heard of Intel's take on texture compression, but that gives me a lot of hope. Despite owning an Nvidia GPU (probably my last one, tbh, given how the company has acted over the last few years and how AMD is closing the gap), I thought this was a proprietary technique that relied on Nvidia hardware, based on Jensen's comments while speaking about it at GDC 2025. Knowing that it's something potentially every GPU company can do seems like a huge boon for the industry.

I have lamented texture issues since the days of Unreal Engine 3, with some games opting for low-fidelity textures to prevent the pop-in that comes from streaming large textures, and other games barely bothering to compress at all, which leads to ludicrous install sizes. It would be nice if we could get games under 80GB again without sacrificing quality or causing performance issues (frankly I'd like to see them get below 50GB again, but I'm not holding my breath).
 
Both of these demos raise some interesting questions. It's noted in the Intel T-Rex demo that the texture pass time (on an RTX 5090?) increases from 0.045 ms to 0.111 ms, but we don't know how much VRAM was being used. The Nvidia demo, meanwhile, notes a compressed texture size that goes from 272MB down to 98MB with BTC, and drops further to 11.37MB with NTC... but then we don't get a pass time.

So what happens if a game uses even 2GB of NTC-compressed textures? In terms of VRAM, that should run just fine on even 8GB cards like the 5060 Ti 8GB and 5060, and potentially AMD's 9060 XT 8GB as well. But the texture pass takes 0.111 ms for a workload that uses a paltry amount of textures (if T-Rex is anything like the Flight Helmet demo, we could be looking at less than 50MB of textures when compressed), and that's on an RTX 5090. So what happens when we shift to 2GB of textures on an RTX 5060?
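As an aside on the VRAM side of that question, here's a rough Python sketch using only the Nvidia demo figures quoted above; the big assumption (mine, not the demo's) is that those ratios would hold for a whole game's worth of textures, which is far from guaranteed:

```python
# Figures quoted above for the Nvidia demo asset (baseline / BTC / NTC sizes in MB).
raw_mb, btc_mb, ntc_mb = 272.0, 98.0, 11.37

ntc_budget_gb = 2.0  # the hypothetical 2GB of NTC-compressed textures

# If the demo's compression ratios held, 2GB of NTC would stand in for roughly:
equiv_btc_gb = ntc_budget_gb * (btc_mb / ntc_mb)   # ~17 GB of BTC textures
equiv_raw_gb = ntc_budget_gb * (raw_mb / ntc_mb)   # ~48 GB of uncompressed textures

print(f"2GB NTC ~ {equiv_btc_gb:.0f}GB BTC ~ {equiv_raw_gb:.0f}GB uncompressed")
```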

We can guess. The RTX 5090 offers 5.4X more AI compute than the RTX 5060. That means the same T-Rex demo that took 0.111 ms on the 5090 might require around 0.60 ms on the 5060. And if we guesstimate that a full game uses 40 times as much texture data as these simplistic demos, we're now talking about potentially spending 24 ms just on the texturing pass.

If you can pipeline things so that the whole engine doesn't stall while waiting for texture decompression, that would still mean 40-ish FPS at best. Drop the resolution to 1080p or even 1440p and we could potentially double that performance. But again, these are just rough estimates.
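For what it's worth, here's that back-of-the-envelope math written out as a Python sketch; the assumption that decode time scales linearly with AI compute and with the amount of texture data is mine, and it's a crude one:

```python
# Back-of-the-envelope estimate, not a measurement.
pass_time_5090_ms = 0.111  # Intel T-Rex texture pass time quoted above
ai_compute_ratio = 5.4     # RTX 5090 vs RTX 5060 AI compute
texture_scale = 40         # guess: a full game vs these simplistic demos

pass_time_5060_ms = pass_time_5090_ms * ai_compute_ratio   # ~0.60 ms
full_game_pass_ms = pass_time_5060_ms * texture_scale      # ~24 ms

# If that pass alone consumed the whole frame budget:
fps_ceiling = 1000.0 / full_game_pass_ms                   # ~42 FPS
print(f"{pass_time_5060_ms:.2f} ms -> {full_game_pass_ms:.0f} ms -> ~{fps_ceiling:.0f} FPS ceiling")
```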

I suspect there's a good reason we haven't seen any of this tech in a shipping game yet. It will take a lot of work to create the assets, both the uncompressed and NTC variants, and games will still need to work on GPUs without NTC support. In that sense, it's the same story as ray tracing yet again. Game publishers and developers are waiting for the proverbial chicken to arrive before they start building eggs into their games.
 
Pretty sure that's for market segmentation purposes. More VRAM starts to eat into the margin on cards for the AI market.

There's that. Also planned obsolescence and upselling. Thankfully people are catching on. That's probably why the 4070 has been readily available. The 4070 Ti will likely be viable for much longer than the 4070 because it has more VRAM.
 
The article was interesting. This tech does look amazing, but as a reader you're left with a lot of blanks to fill in. Jarred's reply was very thought-provoking on the subject. It really calls into question the viability of the tech in real-world games, particularly on lower-tier cards. It reminds me of the tech demos you got on disc with the OG GeForce 256 through 5-series cards. Those demos looked amazing but were far from a realistic portrayal of what the cards were actually capable of handling in games.

I understand we are hitting a wall in transistor density with the death of Moore's Law, which may be further complicated by parallel compute core counts also hitting a 'ceiling', since they are constrained by the serial parts of code (Amdahl's Law; a ceiling of roughly 20K GPU cores was also predicted by a dev, I believe it was Todd Howard, maybe, around ten years ago, but I couldn't find the article to link, sorry).

My point is, I get why Nvidia has to get creative to continue to increase image quality in games, but does there come a point, regardless of whether Nvidia or its competitors acknowledge it, where we simply cannot increase GPU core counts further? Are we already there, or close to it, at the high end, and is this why we are seeing the massive push to AI? I am very curious to see where things go over the next few generations of GPUs. Are we simply going to be forced to either swallow more latency or reduce image quality/frame rates? Can we squeeze more life out of these 'laws', do we need to rewrite them, or are we required to take entirely new approaches to manufacturing, production, and hardware/software rendering to realize further gains in picture fidelity/frame rates?

I remember thinking as a young man how far off these worries felt, but I knew they would likely come to a head in my lifetime. This was back when I had a PIII single core CPU on a 250nm node and your GPU was called a 3D accelerator; in my case, with 2 pixel pipelines, 2 ROPs, and 2 TMUs on my Nvidia Riva TNT2 card. We knew these laws were coming for us back then, particularly the oft-noted death of Moore's Law, but they still felt unreal at the time. Now that we are basically there, I am always curious how we'll sidestep them, as I always suspected we would. Is Nvidia's AI the answer or will it be something else... I vote (or is it hope) for something else, because more latency seems to be part of AI's answer.
 
This was back when I had a PIII single core CPU on a 130nm node and your GPU was called a 3D accelerator.

So this is overly pedantic, but are you sure you had a 130nm PIII? Those didn't come out until after the first P4s, and IIRC they weren't very common despite being quite good.
 
The question I'd posit as a user is: what would you prefer, more GPU power being used to save storage space and VRAM, or manufacturers just putting more VRAM on the cards? I know what I'd rather have.

Comparing Blackwell to Ada, it's very apparent that the only thing that really improved from one to the other was AI performance. A lot of the software tricks being enabled by the new hardware are good, but I'd rather just get more performance. Aside from upscaling, most of these added features also seem to add latency, which makes the experience worse.

It would never happen, because money, but I'd rather tensor cores were generationally fixed for consumer hardware rather than scaling with core counts. That would allow for a consistent experience across the entire generation, and in theory would allow for more raster/RT performance on higher SKUs.
 
I would caution against extrapolating too much from this video. For perspective, I downloaded and compiled both of these tools/demos out of my own interest, and even on an RTX 5070 (going from 680 Tensor Cores down to 192), the pass time I got in Nvidia's NTC renderer is essentially identical (0.18 ms on the RTX 5070 vs the 0.17 ms that Compusemble saw on the RTX 5090). The pass time for Intel's T-Rex demo is essentially the same as well. There may be a hardware wall that this technique hits at some point, but these tools/demos aren't hitting it even on a 5070.
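For anyone curious, here's a quick sanity check on those numbers as a Python sketch; the "if it scaled linearly with Tensor Core count" figure is purely hypothetical (and ignores clock differences), but it shows how far the measured result is from compute-bound behavior:

```python
# Measured pass times (ms) from the NTC renderer runs above, plus Tensor Core counts.
cores_5090, cores_5070 = 680, 192
measured_5090_ms, measured_5070_ms = 0.17, 0.18

# Hypothetical: if the pass were purely Tensor-Core-bound and scaled linearly.
expected_if_compute_bound_ms = measured_5090_ms * (cores_5090 / cores_5070)  # ~0.60 ms

print(f"expected if compute-bound: {expected_if_compute_bound_ms:.2f} ms, "
      f"actually measured: {measured_5070_ms:.2f} ms")
```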
 
Jarred>In that sense, it's the same story as ray tracing yet again. Game publishers and developers are waiting for the proverbial chicken to arrive before they start building eggs into their games.

I don't see that as an equal comparison. Yes, there's a cost-benefit calculation to be made for any feature/tech to be adopted. For RT's adoption, the cost in compute power was high (and is still high today) relative to a fairly peripheral increase in aesthetics.

For NTC, I don't know the particulars of NTC's requirements, but the cost can't be as high as RT's, and the benefit is clear: lower VRAM usage for any given level of texture quality. That matters much more than the aesthetic gains from RT, given that the bulk of GPUs still have 8GB today, and presumably will for the next gen as well. The 8GB VRAM limit is now arguably more of a bottleneck to RT than compute power is.

So, yes, assuming the tech has progressed beyond the lab demo stage, I see NTC having a faster adoption rate by game vendors than RT.


Jeff>I would caution against extrapolating too much from this video.

From your investigation, do you have any insight into how much progress NTC has made beyond this demo? Are game vendors talking about it in any capacity?


PS: Glad to see Jarred continuing his participation in the forums. Also, a welcome to Jeff for his first post here. Hopefully it will be the first of many.
 
The way I see it? It will be beneficial if the performance gains are worth it at higher resolutions.

Consider my use case: modded Skyrim VR with a PSVR2 and a 4070 Ti Super. If NTC gives better performance compared to running the game with high-resolution textures, that's a win for me. And the 16GB VRAM limit is real here in my use case.