Question What is the best GPU now for NLP analysis?


Oct 11, 2015
I want to create a desktop environment where I can handle Natural Language Processing with large data.

I have 256GB of RAM and a motherboard that can handle that much RAM.

Motherboard: ROG RAMPAGE VI Extreme Encore

Current GPU: NVIDIA GeForce GTX 1660 Super

ChatGPT said the following for the best GPU for NLP as of Sep 2021. What is the best GPU for NLP as of Sep 2023?

As of my last knowledge update in September 2021, selecting the "best" GPU for NLP (Natural Language Processing) can depend on various factors, including your budget, specific NLP tasks, and whether you also plan to use the GPU for other machine learning or gaming purposes. However, I can provide you with some recommendations based on GPUs available up to that point:

  1. NVIDIA GeForce RTX 3090: At the time, the RTX 3090 was one of the most powerful consumer GPUs available. It offered a substantial amount of VRAM (24GB), which is essential for handling large NLP models and datasets.
  2. NVIDIA GeForce RTX 3080: The RTX 3080 was another high-performance GPU with 10GB of VRAM, making it a more budget-friendly option compared to the RTX 3090 while still providing excellent NLP performance.
  3. NVIDIA A100: While primarily designed for data center and enterprise use, the A100 was a powerhouse GPU for AI workloads, including NLP. It had a massive 40GB of high-bandwidth memory (HBM2) and offered exceptional performance.
  4. NVIDIA Titan RTX: The Titan RTX was known for its combination of gaming and professional-grade capabilities, making it a solid choice for NLP tasks. It had 24GB of VRAM.
  5. AMD Radeon RX 6900 XT: Although NVIDIA GPUs are often favored for NLP, AMD's Radeon RX 6900 XT offered competitive performance and 16GB of GDDR6 memory, making it suitable for many NLP tasks.
Please note that the GPU market can change rapidly due to new releases and availability. It's essential to check for the latest GPU models and reviews to see which one best suits your needs and fits your budget. Additionally, consider factors like driver support, compatibility with deep learning frameworks (e.g., TensorFlow, PyTorch), and any specialized hardware features that might benefit NLP workloads.


It depends on your budget. The best system I know of would be the DGX A100.
Each DGX A100 provides:

Five petaflops of performance
Eight A100 Tensor Core GPUs with 40GB memory
Six NVSwitches for 4.8TB/s bi-directional bandwidth
Nine Mellanox ConnectX-6 network interfaces with 450GB/s bi-directional bandwidth
Two 64-core AMD CPUs for deep learning framework coordination, boot, and storage
1TB system memory
2x 1.92TB M.2 NVMe drives for OS storage, plus 15TB of SSD storage.

That's important in its design, because a high-powered GPU with large GPU memory and bandwidth is useless without supporting equipment. It runs on Ubuntu Linux, so ordinarily it would not need huge amounts of system RAM for the OS, but it ends up needing a lot of system RAM because of the file sizes in use. RAM is on-demand memory: a file gets loaded into RAM, which holds it until the CPU needs it, so the only real lag is the time it takes for the RAM to hand the file to the CPU. If the file is too large to fit, it gets queued, so now you are relying on storage-to-RAM speed, RAM processing, and the RAM-to-CPU transfer just to get the file in. That's why games generally don't require huge amounts of RAM: the individual files are tiny, a few KB to a few MB at most. Legal documents, photos, etc. can easily reach GB sizes per individual file.

So yes, you'll need a bunch of system RAM, or the NLP work will start taking far longer per process to complete.

With GPUs and large files, core speed is not as important as core count. CPU cores generally work in series: a limited number of cores each push a lot of data through. GPU cores work in parallel: lots of cores simultaneously process the data in small chunks. So with a CPU, speed is important because one core is dealing with the entire thread, whereas with a GPU a thousand cores are each dealing with 1/1000th of the work.

Think of it like this: one man has to move 10 rocks 20ft away, and the completion time depends on how fast he can make those 10 trips with the rocks and back. That's a CPU. A GPU is 10 guys each picking up a rock and moving it the 20ft, with the completion time depending on who made the slowest single trip. It's going to be roughly ten times faster with the 10 guys, unless one of them is grandpa on a walker.
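
To put rough numbers on the rock analogy, here is a minimal Python sketch. The trip times are made up for illustration, and the two function names are just labels I picked: the lone worker's time is the sum of all trips, while the team's time is set by its slowest member.

```python
def serial_time(trip_times):
    """One worker makes every trip in turn, so the times add up."""
    return sum(trip_times)


def parallel_time(trip_times):
    """Each worker makes one trip at the same time; the slowest one decides."""
    return max(trip_times)


# Hypothetical numbers: 10 rocks, 2 seconds per trip.
even_trips = [2.0] * 10
# Same job, but one worker ("grandpa on a walker") needs 15 seconds.
with_grandpa = [2.0] * 9 + [15.0]

print(serial_time(even_trips))      # 20.0
print(parallel_time(even_trips))    # 2.0
print(parallel_time(with_grandpa))  # 15.0
```

With even workers the team is 10 times faster, but a single slow worker drags the whole team's time up to his own.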

That is why it's important to have the supporting hardware: no matter how much potential the GPU actually has, if it's bandwidth or transmission limited, those 10 guys just became 10 guys with walkers.
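
A rough way to picture the bottleneck point in Python: treat the data path as a chain of stages and assume the whole chain can only move data as fast as its slowest stage. The stage names and GB/s figures below are hypothetical placeholders, not measurements of any real system.

```python
def find_bottleneck(stage_rates_gb_per_s):
    """Return (stage_name, rate) for the slowest stage in the chain.

    Simplifying assumption: end-to-end throughput equals the rate of the
    slowest stage, ignoring caching and overlap between stages.
    """
    name = min(stage_rates_gb_per_s, key=stage_rates_gb_per_s.get)
    return name, stage_rates_gb_per_s[name]


# Hypothetical sustained rates for each hop on the way to the GPU cores.
stages = {
    "storage_to_ram": 3.0,
    "ram_to_gpu": 25.0,
    "gpu_memory": 900.0,
}

slowest, rate = find_bottleneck(stages)
print(slowest, rate)  # storage_to_ram 3.0
```

In this made-up case the GPU memory could take 900GB/s, but the job only ever sees 3GB/s because of the storage hop.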

For just deep learning, the A100, 4090, or 7900XTX are the best, as they have the most VRAM and the highest core counts, so they process more data simultaneously, which shortens processing time. The 4090/7900XTX have the advantage with smaller files and the A100 has the advantage with larger files, but that only matters at extreme use limits.
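
For a back-of-the-envelope check on whether a model fits in VRAM, something like the Python sketch below can help. It assumes the weights dominate memory use and counts only parameters times bytes per parameter, so real usage (activations, optimizer state, framework overhead) will be higher. The model sizes are hypothetical examples and the helper names are my own.

```python
def weights_gb(num_params, bytes_per_param):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_in_vram(num_params, bytes_per_param, vram_gb):
    """True if the weights alone fit; this ignores all other memory use."""
    return weights_gb(num_params, bytes_per_param) <= vram_gb


# VRAM per card in GB (A100 figure is the 40GB model mentioned above).
cards = {"RTX 4090": 24, "RX 7900 XTX": 24, "A100": 40}

# Hypothetical models stored in 16-bit precision (2 bytes per parameter).
for params in (7_000_000_000, 13_000_000_000):
    size = weights_gb(params, 2)
    for card, vram in cards.items():
        verdict = "fits" if fits_in_vram(params, 2, vram) else "does not fit"
        print(f"{params / 1e9:.0f}B params ({size:.0f}GB) {verdict} on {card}")
```

By this estimate a 7-billion-parameter model in 16-bit needs about 14GB for weights and fits on all three cards, while a 13-billion-parameter one needs about 26GB and only fits on the 40GB A100.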