News Intel demonstrates PyTorch AI optimizations for accelerating large language models on its Arc Alchemist GPUs

So which model did they use there? 13B? Edit - OK, 7B... And what about tokens per second? I was running a quantized 13B model on a Quadro P5000 at ~16 t/s, so it looks like Intel is still far behind CUDA...
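For anyone comparing numbers: tokens per second here just means generated tokens divided by wall-clock generation time. Below is a minimal sketch of that measurement with Hugging Face transformers; the model ID, prompt, and fp16 loading are placeholders for illustration (a quantized setup would load differently):

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID -- swap in whatever 7B/13B checkpoint you are testing.
model_id = "facebook/opt-1.3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; a quantized setup (GPTQ/GGML/etc.) loads differently
    device_map="auto",          # place the model on whatever GPU is available
)

prompt = "Explain how GPUs accelerate matrix multiplication."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Time a single greedy generation and report tokens per second.
start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tok/s")
```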
 

CmdrShepard
I don't think CUDA has anything to do with that; you simply have a faster GPU, even if it's a bit long in the tooth.

Perhaps they can optimize it further, but a third player in the GPU landscape is sorely needed, so I hope they succeed.
 
CmdrShepard said:
I don't think CUDA has anything to do with that; you simply have a faster GPU, even if it's a bit long in the tooth.

Perhaps they can optimize it further, but a third player in the GPU landscape is sorely needed, so I hope they succeed.
Me too. I was considering Arc because of the price, and AMD as well, but neither of them has tools as good and easy to use as Nvidia's, and on top of that the CUDA architecture is roughly 2.5x faster at the same core count than the other two manufacturers' parts. Intel has made some attempts with oneAPI (I've been to a few of their workshops), but it's still far behind Nvidia's tools and developer support.
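For what it's worth, on the PyTorch side the practical difference between the two stacks is mostly which backend gets picked: CUDA ships with stock PyTorch, while Arc needs Intel's intel_extension_for_pytorch to expose an "xpu" device. A rough sketch of that selection, with a toy model made up purely for illustration:

```python
import torch
import torch.nn as nn

# Pick whichever GPU backend is present. CUDA support ships with stock PyTorch;
# Intel Arc ("xpu") requires the intel_extension_for_pytorch package installed
# against a matching PyTorch build.
if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    try:
        import intel_extension_for_pytorch as ipex  # registers the "xpu" device
        device = torch.device("xpu") if torch.xpu.is_available() else torch.device("cpu")
    except ImportError:
        ipex = None
        device = torch.device("cpu")

# Toy network standing in for a real model.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 8)).to(device).eval()

# On the Intel path, ipex.optimize() applies the extension's inference optimizations.
if device.type == "xpu":
    model = ipex.optimize(model)

x = torch.randn(32, 1024, device=device)
with torch.no_grad():
    y = model(x)
print(f"ran on {device}, output shape {tuple(y.shape)}")
```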
 