[SOLVED] Maximum GPUs a CPU can handle concurrently

invulnarable27

Distinguished
Jan 26, 2011
76
0
18,630
Assuming risers can be used if necessary to stack as many GPUs onto the motherboard as the PCIe lanes allow. The design (ML training) is as follows:

CPU: AMD EPYC Rome 7302P
- 128 lanes PCIe 4.0
Mobo: ASRock Rack ROMED8-2T
- Supports 7 x PCIe 4.0 x16
- GPUs can be any model

FIRST: Is it safe to say we can run 7 GPUs (the max PCIe slots on the mobo) and utilize ALL of them at full speed (x16 at PCIe 4.0)? Are there any bottlenecks concerning the hardware?
7 GPUs * 16 lanes each = 112 lanes, which is within the CPU's 128.

Is this equation an accurate estimate of the CPU's capacity?

SECOND: If I run the GPUs at a slower speed, say PCIe x8, can we theoretically double the number of GPUs, provided a motherboard had enough PCIe x8 slots (with x8-to-x16 converters), so the total lane count still equaled 112 as above?
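The lane arithmetic in both questions can be sketched as a quick check (a minimal sketch; the 128-lane figure comes from the 7302P spec above, the function names are hypothetical):

```python
# Quick PCIe lane-budget check for the configurations discussed above.
CPU_LANES = 128  # EPYC 7302P exposes 128 PCIe 4.0 lanes

def lanes_needed(num_gpus: int, lanes_per_gpu: int) -> int:
    """Total lanes consumed if every GPU gets a full-width link."""
    return num_gpus * lanes_per_gpu

def fits(num_gpus: int, lanes_per_gpu: int) -> bool:
    """True if the CPU can feed every GPU at the requested link width."""
    return lanes_needed(num_gpus, lanes_per_gpu) <= CPU_LANES

print(lanes_needed(7, 16), fits(7, 16))   # 112 True  -> 7 GPUs at x16 fit
print(lanes_needed(14, 8), fits(14, 8))   # 112 True  -> 14 GPUs at x8 fit
print(lanes_needed(9, 16), fits(9, 16))   # 144 False -> 9 GPUs at x16 exceed 128
```

In practice the slot count on the board, not the CPU lane budget, is usually the first limit you hit, which is where risers and converters come in.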
 

4745454b

Titan
Moderator
It seems MS says there is no limit. At least for displays.

https://answers.microsoft.com/en-us...-support/d0501ed3-a4bd-44c0-98f9-95f091223748

You can go beyond 7 PCIe cards by using USB ones. You can also use DisplayPort hubs, or splitters if the content can be the same.

What makes you ask? I've been talking about displays while you mentioned cards. If you want them for compute only, then yes, risers are the way to go. It doesn't matter if you go from x16 to x8, or even x4; you are limited by the number of slots. If you are doing compute work there won't be a lot of difference between x16 and x4. But I really feel the answer depends on what you are trying to do.
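The x16-vs-x4 point can be put in rough numbers (an approximation: PCIe 4.0 signals at 16 GT/s per lane with 128b/130b encoding, giving roughly 1.97 GB/s of usable bandwidth per lane per direction; real training throughput depends on how often the workload crosses the bus):

```python
# Rough per-direction PCIe 4.0 bandwidth at different link widths.
# 16 GT/s with 128b/130b encoding ~= 1.97 GB/s usable per lane (approximate).
GBPS_PER_LANE = 16 * 128 / 130 / 8  # ~1.97 GB/s per lane, per direction

for width in (16, 8, 4):
    print(f"x{width}: ~{width * GBPS_PER_LANE:.1f} GB/s per direction")
```

So x16 offers about 4x the bandwidth of x4, but compute-bound training that keeps its data resident in GPU memory rarely saturates even the narrower link, which is why the slot count matters more than the width here.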
 
Solution

invulnarable27

Distinguished
Jan 26, 2011
76
0
18,630
The hardware specs are purely for compute/training machine-learning models. I wanted to avoid expensive 10G+ switches passing data back and forth, and the network overhead that comes with building multiple servers (software handles the coordination).

Just wanted to make sure my math added up with regard to the CPU's PCIe-lane-to-GPU ratio when running at full capacity, and to avoid a situation where the CPU could only address, say, 64 PCIe lanes (4 GPUs max at x16) and any extra GPUs would just sit idle.