Nope. In traditional SLI and CrossFireX setups, the GPUs' VRAM doesn't get stacked, which means if you have 2 cards with 4GB of VRAM each, the game won't see them as a single 8GB frame buffer. Each card still has its own buffer as seen by the system.
But to answer your question about RTX 3090 VRAM stacking over NVLink: that kind of VRAM "stacking" is only supported in DX12's explicit multi-adapter (EMA) mode. With EMA, each GPU's memory is addressed individually and holds only what that GPU needs for its specific task. So the VRAM doesn't "stack" per se; rather, each card's pool gets assigned different work, but that's semantics. In an EMA scenario, the VRAM on all utilized cards is available to the application.
VRAM from both of your cards can be utilized independently, but only when using DX12's EMA mode.
There are still many caveats with this EMA mode, though, and I'm not 100% sure about all of them since I haven't been following this closely for a while. It also has to be coded by the developer. Currently, very few games use this feature, "Ashes of the Singularity" (AOTS) being one example, but I haven't been following that game closely either.
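Just to give a rough idea (this is only my own minimal sketch, not code from AOTS or any real engine), this is roughly what an EMA-style D3D12 app sees when it enumerates the adapters: each GPU shows up separately with its own dedicated VRAM, and the engine has to decide which resources live on which card.

```cpp
// Minimal sketch (my assumption of a typical setup, not code from any game):
// an EMA-aware D3D12 app enumerates every hardware adapter separately.
// Each adapter reports its own dedicated VRAM - nothing is pooled
// automatically, so the engine decides which resources live on which GPU.
#include <dxgi.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

#pragma comment(lib, "dxgi.lib")
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main()
{
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return 1;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0;
         factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
            continue; // skip the software (WARP) adapter

        // In EMA, each physical adapter gets its own D3D12 device and keeps
        // its own resources in its own VRAM.
        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device))))
        {
            wprintf(L"Adapter %u: %ls, dedicated VRAM: %llu MB\n", i,
                    desc.Description,
                    static_cast<unsigned long long>(desc.DedicatedVideoMemory) /
                        (1024 * 1024));
        }
    }
    return 0;
}
```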
Okay, let me share this as well. We all know the SLI days are over, but what about NVLink? You must be aware that Nvidia introduced NVLink on the consumer Turing GPUs as a replacement for the old SLI connector. It's still a multi-GPU bridge that can be used for gaming, but the interface has many times the bandwidth of an SLI connection.
That's because NVLink can be used for direct memory access between the cards, rather than going through the PCIe slots, which was a huge bottleneck with SLI. So I think NVLink is the future, if we go by Nvidia's pitch. But I could be wrong, because not many games may be able to reap the full benefits of NVLink, and the same thing happened with SLI.
As a rough estimate, SLI bridges used to have a bandwidth of about 1GB/s (normal bridge) or 2GB/s (HB bridge). NVLink on Turing cards, for example, can do 25GB/s each way, or 50GB/s total, per link; on the higher-end cards with two links, Nvidia quotes 50GB/s each way and 100GB/s total.
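To picture the direct GPU-to-GPU part, here's a rough host-side sketch using the CUDA runtime API (my own illustration, not game code; the device indices and buffer size are just placeholders). When an NVLink bridge is present, a peer copy like this can go over the bridge instead of through PCIe and system memory.

```cpp
// Rough sketch of direct GPU-to-GPU memory access using the CUDA runtime
// host API (device indices 0/1 and the buffer size are just placeholders).
// With an NVLink bridge in place, a peer copy like this can go over the
// bridge rather than through the PCIe bus and system memory.
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1); // can GPU 0 access GPU 1's memory?
    cudaDeviceCanAccessPeer(&can10, 1, 0); // and the other way around?
    if (!can01 || !can10) {
        printf("Peer access is not supported between GPU 0 and GPU 1\n");
        return 1;
    }

    const size_t bytes = 256u * 1024 * 1024; // 256 MB test buffer

    float *buf0 = nullptr, *buf1 = nullptr;
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0); // let GPU 0 map GPU 1's memory
    cudaMalloc(&buf0, bytes);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);
    cudaMalloc(&buf1, bytes);

    // Direct device-to-device copy, GPU 1 -> GPU 0, no staging in host RAM.
    cudaMemcpyPeer(buf0, 0, buf1, 1, bytes);
    cudaDeviceSynchronize();
    printf("Copied %zu MB directly from GPU 1 to GPU 0\n",
           bytes / (1024 * 1024));

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```

If you wrap that cudaMemcpyPeer call in CUDA event timers you can get a rough number for the actual link bandwidth and compare it against the figures above.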
But all of this only helps if GAMES actually take advantage of this new multi-GPU capability, which means game developers have to implement it.
I think the main advantage of NVLink is that it might help with the peer-to-peer interface and VRAM stacking, because the GPUs are essentially much more tightly connected now, which also brings the latency of a GPU-to-GPU transfer way down, IMO.
So unlike SLI, where transfers had to go over PCIe and through system memory, NVLink behaves differently. We can think of it as an app that addresses one GPU for one task and another GPU for something else at the same time. So it seems NVLink will be the future for multi-GPU setups, but sadly ONLY in the high-end market segment, since other cards lack NVLink support. I'm still not fully sure how this DX12 VRAM stacking will work in the future.
But again, like I said before, all of this will depend on how well the game's ENGINE benefits from a multi-GPU setup. Also, assuming NVLink is going to help with VRAM stacking, the two GPUs should use split frame rendering (SFR), unlike the alternate frame rendering (AFR) mode mostly used with SLI, in which each GPU used its own frame buffer/VRAM and the memory never got added/stacked.
In theory:
In AFR, each GPU renders every other frame (alternating odd/even frames).
In SFR, each GPU renders part of every frame (e.g. a top/bottom split of the screen).
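Purely as a toy illustration of that difference (my own sketch, not how any engine actually schedules work), here's how the two modes would split the work between two GPUs:

```cpp
// Toy illustration of how AFR and SFR split the work across two GPUs.
// Purely conceptual: a real engine records actual command lists per GPU,
// this just prints which GPU would handle what.
#include <cstdio>

int main()
{
    const int numGpus = 2;

    printf("AFR: whole frames alternate between the GPUs\n");
    for (int frame = 0; frame < 4; ++frame) {
        int gpu = frame % numGpus; // even frames -> GPU 0, odd frames -> GPU 1
        printf("  frame %d -> GPU %d renders the full frame "
               "(needs its own full copy of the frame data in VRAM)\n",
               frame, gpu);
    }

    printf("SFR: every frame is split between the GPUs\n");
    for (int frame = 0; frame < 4; ++frame) {
        for (int gpu = 0; gpu < numGpus; ++gpu) {
            const char* region = (gpu == 0) ? "top half" : "bottom half";
            printf("  frame %d -> GPU %d renders the %s\n", frame, gpu, region);
        }
    }
    return 0;
}
```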
So I think NVLink should also help with VRAM stacking, though we'll need to see how this actually gets implemented in games, whether through DX12 or the Vulkan API. Apart from that, even the price of an NVLink bridge is kind of high, so this can be a very expensive multi-GPU setup that not many gamers will be able to afford. I STILL prefer having a SINGLE powerful GPU in my rig, because a lot of games don't scale well with SLI/CFX.