I have been running both Llama-2 and Mistral models in every size and variant available from Hugging Face for months now, so I don't expect this "packaged" variant to be all that different.
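For reference, "running them from Hugging Face" in my case means something like the following (a minimal sketch using the transformers and bitsandbytes libraries; the model ID and prompt are just examples, not a recommendation):

    # minimal sketch: load a 4-bit quantized model locally
    # pip install transformers accelerate bitsandbytes
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # example model
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb, device_map="auto"
    )

    prompt = "Explain quantization in one paragraph."
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=120)
    print(tok.decode(out[0], skip_special_tokens=True))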
But it's still very interesting to see Nvidia offering this to the general public, when there was a time they fought non-gaming use of consumer hardware with a vengeance.
It is obviously something people have done privately or not so privately for many months, and in China an entire industry evidently developed around doing it, even with multiple GPUs at once, despite all of these consumer variants originally being made extra wide so nobody could fit more than one into an ordinary desktop.
With 8GB as the minimum spec, I'd expect this to mean 7B models. The old "golden middle" of 33B Llama[-1] models, which used to just fit at 4-bit quantization into the 24GB of a 3090 or 4090, got left out of Llama-2... for some reason, while 70B only fits with something like 2-bit quantization, which hallucinates too much to be of any use.
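The back-of-the-envelope math behind that (a rough sketch; the 1.2x overhead factor for KV cache, activations, and runtime is my own assumption, and real usage varies with context length):

    # rough VRAM estimate: weight storage plus a fudge factor for
    # KV cache / activations / CUDA context (1.2 is my own guess)
    def vram_gb(params_billions, bits_per_weight, overhead=1.2):
        return params_billions * bits_per_weight / 8 * overhead

    for params, bits in [(7, 16), (7, 4), (33, 4), (70, 4), (70, 2)]:
        print(f"{params}B @ {bits}-bit: ~{vram_gb(params, bits):.1f} GB")

    # 7B  @ 16-bit: ~16.8 GB  (too big for an 8GB card)
    # 7B  @ 4-bit:  ~4.2 GB   (fits in 8GB)
    # 33B @ 4-bit:  ~19.8 GB  (just fits in 24GB)
    # 70B @ 4-bit:  ~42.0 GB  (no single consumer card)
    # 70B @ 2-bit:  ~21.0 GB  (fits in 24GB, at heavy quality cost)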
Honestly, it hasn't been much better with less aggressive quantization or even FP16 on 7B models like Mistral, which were hyped quite a bit but failed rather badly when I tried to use them.
For me, only models you can run locally are interesting at all; I refuse to use any cloud model, voluntarily or involuntarily (Copilot).
I can't say I'm extremely interested in models that are way smarter than I am; for me, the level of a reasonably trained domestic servant would be just fine: it's mostly about them doing the chores I find boring, not about them telling me how to fix my life.
But the big issue there is that I found them relatively useless, prone to hallucinate with the very same high degree of confidence they show when they hit the truth. They have no notion of when they are wrong and currently simply won't do the fact-checking we all do when we come up with our "intuitive" answers.
And the ability to train them on the kind of facts I'd really want my domestic help never to get wrong isn't there either, or at least isn't reliable.
I hope y'all go ahead and try your best to get some use out of this; the sooner this bubble of outrageous capability overestimation bursts, the better for the planet and its people.
At least you can still game on your GPU; it's much harder to do anything useful with Bitcoin ASICs...