I have been running both Llama-2 and Mistral models in every size and variant available from Hugging Face for months now, so I don't expect this "packaged" variant to be all that different.
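For reference, "running them from Hugging Face" in my case means something like the following (a minimal sketch using the transformers and bitsandbytes libraries; the model ID and prompt are just examples, not a recommendation):

    # minimal sketch: load a 4-bit quantized model locally
    # pip install transformers accelerate bitsandbytes
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # example model
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb, device_map="auto"
    )

    prompt = "Explain quantization in one paragraph."
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=120)
    print(tok.decode(out[0], skip_special_tokens=True))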
But it's still very interesting to see Nvidia offering this to the general public, when there was a time they fought non-gaming use of consumer hardware with a vengeance.
It is obviously something people have done privately or not so privately for many months, and in China an entire industry evidently developed around doing it, even with multiple GPUs at once, despite all of these consumer variants originally being made extra wide so nobody could fit more than one into an ordinary desktop.
With 8GB as the minimum spec, I'd expect this to mean 7B models. The old "golden middle" of 33B Llama[-1] models, which used to just fit at 4-bit quantization into the 24GB of a 3090 or 4090, got left out of Llama-2... for some reason, while 70B only fits with something like 2-bit quantization, which hallucinates too much to be of any use.
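The back-of-the-envelope math behind that (a rough sketch; the 1.2x overhead factor for KV cache, activations, and runtime is my own assumption, and real usage varies with context length):

    # rough VRAM estimate: weight storage plus a fudge factor for
    # KV cache / activations / CUDA context (1.2 is my own guess)
    def vram_gb(params_billions, bits_per_weight, overhead=1.2):
        return params_billions * bits_per_weight / 8 * overhead

    for params, bits in [(7, 16), (7, 4), (33, 4), (70, 4), (70, 2)]:
        print(f"{params}B @ {bits}-bit: ~{vram_gb(params, bits):.1f} GB")

    # 7B  @ 16-bit: ~16.8 GB  (too big for an 8GB card)
    # 7B  @ 4-bit:  ~4.2 GB   (fits in 8GB)
    # 33B @ 4-bit:  ~19.8 GB  (just fits in 24GB)
    # 70B @ 4-bit:  ~42.0 GB  (no single consumer card)
    # 70B @ 2-bit:  ~21.0 GB  (fits in 24GB, at heavy quality cost)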
Honestly, it hasn't been much better with less aggressive quantization or even FP16 on 7B models like Mistral, which were hyped quite a bit but failed rather badly when I tried to use them.
For me, only models you can run locally are interesting at all; I refuse to use any cloud model, voluntarily or involuntarily (Copilot).
I can't say I'm extremely interested in models that are way smarter than I am; for me, the level of a reasonably trained domestic servant would be just fine: it's mostly about them doing the chores I find boring, not about them telling me how to fix my life.
But the big issue there is that I found them relatively useless, prone to hallucinate with the very same high degree of confidence they show when they hit the truth. They have no notion of when they are wrong and currently simply won't do the fact-checking we all do when we come up with our "intuitive" answers.
And the ability to train them on the kind of facts I'd really want my domestic help never to get wrong isn't there either, or at least isn't reliable.
I hope y'all go ahead and try your best to get some use out of this; the sooner this bubble of outrageous capability overestimation bursts, the better for the planet and its people.
At least you can still game on your GPU; it's much harder to do anything useful with Bitcoin ASICs...