Positron AI says its Atlas accelerator beats Nvidia H200 on inference in just 33% of the power — delivers 280 tokens per second per user with Llama...

ZzZzZ Give me a PCIe expansion card (not x16, as that slot is already used by the graphics card) that can do 1/10 of that and uses 1/10 of the power (or rather, stays under the power limit of the PCIe slot). And with an affordable price, then I may consider it.
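
To put rough numbers on that ask (my own back-of-envelope, not anything from the article): the PCIe CEM spec allows up to 75 W from an x16 slot and roughly 25 W from smaller slots, and 1/10 of the headline figure is 28 tokens/s.

```python
# Back-of-envelope for a slot-powered card. Assumptions: 75 W for an x16 slot,
# ~25 W for smaller slots (PCIe CEM spec); 280 tokens/s/user from the headline.
atlas_tokens_per_sec = 280
target_tokens_per_sec = atlas_tokens_per_sec / 10  # the "1/10 of that" ask

for slot_watts in (25, 75):
    needed = target_tokens_per_sec / slot_watts
    print(f"{slot_watts} W slot budget -> needs about {needed:.2f} tokens/s per watt")

# ~1.1 tok/s/W at 25 W, ~0.4 tok/s/W at 75 W. Whether that is plausible depends
# on Atlas's absolute power draw, which the headline alone does not give.
```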

No need for more than that for my desktop, as I'm not intending to build a datacentre.

BUT I'm sure the datacentre people will be interested in that!
 
For us consumers, not data centers, being able to fine-tune is where the real gold is. If you can only run a 32B model on your dual 3090s via SLI, it had better be fine-tuned for your task, otherwise Cline or OpenHands isn't going to be worth much to you.

From what I read as a layman, it seems these chips are hyper-optimized for inference only and probably aren't useful for training or fine-tuning.
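
For reference, the consumer fine-tuning the earlier comment describes usually means parameter-efficient tuning rather than full training. Here is a minimal QLoRA-style sketch, assuming a Hugging Face transformers + peft + bitsandbytes stack; the model name and hyperparameters are placeholders, not anything from the article:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2.5-32B-Instruct"  # hypothetical 32B-class base model

# Load the base weights in 4-bit so a ~32B model fits across two 24 GB 3090s.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shards layers across both GPUs
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Train only small low-rank adapters instead of all ~32B parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params
```

The trainable adapters are small enough that the optimizer state fits alongside the quantized weights, which is what makes this feasible on consumer cards at all.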
 
The article makes no note of it, but this first-generation accelerator from Positron is not actually an ASIC - it is an FPGA-based accelerator. I was trying to figure out how a company founded only two years ago with a couple dozen engineers is already sampling cutting-edge ASIC hardware; it makes sense knowing that it is actually an FPGA platform, with the physical hardware presumably produced by AMD/Xilinx or Intel/Altera.

If Positron's metric claims are true, it is interesting to see how much more efficient their logical architecture is for inference than Nvidia's, especially given that FPGAs have a significant physical handicap in density and efficiency versus true ASICs and mainstream GPUs+CPUs.
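
One way to see why that could be plausible: batch-1 LLM decoding is dominated by memory bandwidth rather than logic density, since every generated token has to stream the full weight set. A rough estimate, assuming a Llama-3-8B-class model at 8-bit weights (the headline doesn't say which Llama variant or precision the 280 tokens/s figure refers to):

```python
# Rough bandwidth estimate for batch-1 decoding; model size and precision
# below are assumptions, not figures from the article.
params = 8e9          # assume a Llama-3-8B-class model
bytes_per_param = 1   # assume 8-bit weights
tokens_per_sec = 280  # the per-user figure from the headline

required_bandwidth = params * bytes_per_param * tokens_per_sec
print(f"~{required_bandwidth / 1e12:.1f} TB/s of weight traffic to hit 280 tokens/s for one user")

# ~2.2 TB/s: in the ballpark of an HBM-equipped accelerator, which is why a
# memory-centric design can plausibly compete on inference even with an FPGA's
# logic-density handicap versus an ASIC or GPU. Batching users changes the math.
```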
 
Newer FPGAs have hard-wired arithmetic pipelines, which go some way towards eliminating the efficiency deficit of FPGA vs. ASIC.

Furthermore, the XDNA NPUs in AMD's current laptop chips are close descendants of the Xilinx Versal design lineage. You can see them discussed and analyzed in the middle of this post:
 
...hands Positron a nail and hammer... while handing Nvidia a coffin.

ASICs will win in the end.
Here's hoping that the current brain-dead brute-force approach espoused by Nvidia, Microsoft, Meta, Tesla, etc. is soon outlawed on the basis of electricity consumption alone, allowing a new bunch of smarter players to thrive.
 
This market is just too capital-rich for me to trust any claim of such a huge leap over the market leader. I will wait to see how it works and whether it can scale; I bet that will be the issue. Per system it may be there, but I wonder whether it holds up across the whole infrastructure you need to run it. Also, every time I see one of these claims, it turns out to be a niche case or something.

I hope the efficiency claims are true, but I just don't trust this type of claim in this type of market.