All of this is true...but it's also true that 8GB cards still exist...which was my point.
It seems you're just now discovering the memory wall!
Note the distinct flattening over roughly the past decade. Until then, GB/$ was increasing exponentially. Because the Y-axis is logarithmic, a constant exponential growth rate shows up as a straight diagonal line.
So, one way to read what I said in my above posts is that the 1 TB configuration is achieved by combining 256 DDR5 dies, each holding 4 GiB (32 Gbit). When DDR5 memory first started shipping, all of the dies were 16 Gbit (2 GiB); then came 24 Gbit (3 GiB), and now 32 Gbit (4 GiB) dies. So, there has been some density increase, but not much. The only way to have lots of RAM is to use tons of chips, and that's why it's so expensive.
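To make that arithmetic concrete, here's a quick back-of-the-envelope sketch (the function name and values are mine, purely for illustration):

```python
# Back-of-the-envelope: how many DDR5 dies a given capacity requires.
def dies_needed(total_gib, die_gbit):
    """Number of dies to reach total_gib, using dies of die_gbit each."""
    die_gib = die_gbit / 8  # 8 bits per byte
    return total_gib / die_gib

for die_gbit in (16, 24, 32):  # the DDR5 die densities mentioned above
    print(f"{die_gbit} Gbit dies: {dies_needed(1024, die_gbit):.0f} for 1 TiB")

# 16 Gbit dies: 512 for 1 TiB
# 24 Gbit dies: 341 for 1 TiB  (not a clean count; real modules use tidier die counts)
# 32 Gbit dies: 256 for 1 TiB
```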
GDDR6 memory is more complex, being highly optimized for speed, and therefore inherently more expensive and slightly lower-density. It only comes in a 16 Gbit (2 GiB) capacity, and the highest number of dies you can address on the same 32-bit channel is 2. So, a 128-bit card is going to have either 4 or 8 chips, which gives you 8 or 16 GiB.
GDDR7 launched at 16 Gbit (2 GiB) per die, but now there are 24 Gbit (3 GiB) dies. That's how a 512-bit GPU like the GB202 (as seen in the RTX 5090 and RTX Pro 6000) can be configured for 32 GiB (16x 2 GiB), 48 GiB (16x 3 GiB), 64 GiB (32x 2 GiB), or 96 GiB (32x 3 GiB). It takes a big, fat GPU to host that much RAM, due to the limit on the number of chips per channel. The combination of a big GPU and lots of GDDR memory gets very expensive and power-hungry.
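The same kind of arithmetic, sketched for the GDDR6 and GDDR7 cases above (the function name and parameters are mine, just for illustration):

```python
# GDDR capacity = (bus width / 32-bit channels) x dies per channel x die capacity.
def gddr_capacity_gib(bus_width_bits, die_gbit, clamshell=False):
    """Total VRAM in GiB for a given bus width and die density.

    clamshell=True puts 2 dies on each 32-bit channel (the per-channel maximum).
    """
    channels = bus_width_bits // 32
    dies = channels * (2 if clamshell else 1)
    return dies * die_gbit // 8  # 8 bits per byte

# GDDR6, 128-bit bus, 16 Gbit dies:
print(gddr_capacity_gib(128, 16))         # 8  (4 dies)
print(gddr_capacity_gib(128, 16, True))   # 16 (8 dies)

# GDDR7, 512-bit bus (e.g. GB202), 16 and 24 Gbit dies:
print(gddr_capacity_gib(512, 16))         # 32 (16 dies)
print(gddr_capacity_gib(512, 24))         # 48 (16 dies)
print(gddr_capacity_gib(512, 16, True))   # 64 (32 dies)
print(gddr_capacity_gib(512, 24, True))   # 96 (32 dies)
```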