Me too. I was thinking about Arc because of the price, and about AMD too, but neither has tools as good and easy to use as Nvidia's, plus the CUDA architecture really is approx. 2.5x faster at the same core count than those two manufacturers. Intel has made some attempts with oneAPI, and I've been to a few workshops, but...
So which model did they run there? 13B? Edit: OK, 7B... And tokens per second? I was running a quantized 13B on a Quadro P5000 at ~16 t/s, so it looks like Intel is still far behind CUDA...
Well, you can't. There is some memory on the chip itself, and they said something somewhere about speeds around 900 GB/s, which is comparable to NVLink, but still, it's limited, and for applications like this most people will prefer a GPU they can swap out.
If my Spanish is correct, it can match a 3050 with FSR (sarcasm, I don't understand <Mod Edit> in Spanish).
https://www.youtube.com/live/b3Uel4LbvCs?si=VDxLsWcAhEZNdIr5
Well... now just count the consumption of the banks... and compare it with this one... or BTC ;-) Love these nonsense articles by some Greta lovers (and Greta supports terrorism).
So when I get 26 Gbps PCIe throughput on a 3060 12GB, how much will be missing on a 4060 Ti, which has more CUDA cores etc.? :-D Well, it seems I'll skip 4xxx and we'll see about 5xxx :-/ I hope AMD will have better AI support, because CUDA still rules...
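
For what it's worth, here is a rough back-of-envelope sketch of that PCIe question. It assumes (not stated above) that the 3060 12GB runs on a PCIe 4.0 x16 link and the 4060 Ti on an x8 link, and it uses the spec figures of 16 GT/s per lane with 128b/130b encoding. The function name is made up for illustration, and real measured throughput will come in below these theoretical numbers.

```python
# Back-of-envelope PCIe link bandwidth (theoretical, one direction).
# Assumptions: PCIe 4.0 at 16 GT/s per lane with 128b/130b encoding;
# RTX 3060 12GB on an x16 link, RTX 4060 Ti on an x8 link.

GT_PER_S_GEN4 = 16.0               # giga-transfers per second per lane
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding


def pcie_bandwidth_gb_s(lanes, gt_per_s=GT_PER_S_GEN4):
    """Theoretical one-way bandwidth in GB/s, before protocol overhead."""
    gbit_per_s = gt_per_s * ENCODING_EFFICIENCY * lanes
    return gbit_per_s / 8  # bits -> bytes


x16 = pcie_bandwidth_gb_s(16)
x8 = pcie_bandwidth_gb_s(8)
print(f"x16: {x16:.1f} GB/s, x8: {x8:.1f} GB/s, ratio: {x8 / x16:.2f}")
```

On a PCIe 3.0 board, pass `gt_per_s=8.0` instead. Whether the halved link matters in practice depends on how much data still moves over the bus once the model is loaded into VRAM.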