Pliops claims its XDP LightningAI card and FusIOnX software accelerate large language model inference by offloading context data to SSDs, reducing redundant computation, and boosting vLLM throughput by up to eight times while avoiding the need for additional GPUs.
Pliops expands AI's context windows with 3D NAND-based accelerator – can accelerate certain inference workflows by up to eight times: Read more