News Nvidia and Mistral AI's super-accurate small language model works on laptops and PCs

Status
Not open for further replies.
I mean, Llama 3, 3.1, and most 8B models will run on a CPU with zero issues. I'm literally running these on an Orange Pi 5 Plus for fun. If they get model switching working, the pipeline could load Whisper, transcribe, then unload it; load an 8B model, process the text, unload it; then load an XTTS model to speak the reply to the user, and repeat. All within 8 GB. My Orange Pi 5 Plus has 16 GB of RAM, so I don't actually need to offload the Whisper or XTTS models, but the CPU bottleneck, even at 6 TOPS, is painfully slow at this time. [Let's also be honest here: anyone can run a 4-bit quantized model, even most toasters.]
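The load/process/unload loop described above can be sketched roughly like this. This is a hypothetical illustration, not real code for any of these projects: the model names, the per-model sizes in GB, and the `loaded` helper are all assumptions made up for the example, and the "models" are just strings. The point is only that with sequential loading, peak RAM equals the largest single model rather than the sum of all three.

```python
# Hypothetical sketch of a sequential load/process/unload pipeline:
# ASR -> LLM -> TTS, with only one model resident at a time so the
# whole voice loop fits in ~8 GB. Sizes are illustrative guesses,
# not measurements.
from contextlib import contextmanager

MEMORY_BUDGET_GB = 8.0
resident_gb = 0.0  # RAM currently used by loaded models
peak_gb = 0.0      # highest RAM usage seen so far

@contextmanager
def loaded(name, size_gb):
    """Pretend to load a model, yield it, then unload it on exit."""
    global resident_gb, peak_gb
    resident_gb += size_gb
    peak_gb = max(peak_gb, resident_gb)
    try:
        yield name
    finally:
        resident_gb -= size_gb  # unload before the next stage loads

def run_turn(audio):
    # Stage 1: speech-to-text (Whisper-class model, ~1.5 GB assumed)
    with loaded("whisper", 1.5) as asr:
        text = f"{asr}:{audio}"
    # Stage 2: 8B LLM, 4-bit quantized (~5 GB assumed)
    with loaded("llm-8b-q4", 5.0) as llm:
        reply = f"{llm}:{text}"
    # Stage 3: text-to-speech (XTTS-class model, ~2 GB assumed)
    with loaded("xtts", 2.0) as tts:
        return f"{tts}:{reply}"

out = run_turn("hello")
# Peak is the largest single model (5 GB), not 1.5 + 5 + 2 = 8.5 GB.
print(out, peak_gb, peak_gb <= MEMORY_BUDGET_GB)
```

With 16 GB of RAM you could skip the unload steps entirely and keep all three resident, which trades memory for the load/unload latency on every turn.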