I need to build several deep learning servers for a combined CPU + cuDNN workload. Each server needs as much CPU power as possible and at least one Nvidia Turing card, so I am looking at dense Epyc 2U4N servers such as the Gigabyte H261-Z60, the Supermicro 2123BT-HNR, or the Cisco UCS C4200:
https://www.gigabyte.com/Hyper-Converged-System/H261-Z60-rev-100#ov
https://www.cisco.com/c/en/us/produ...s-c4200-series-rack-server-chassis/index.html
Supermicro BigTwin AS-2123BT-HNR (2U, A+ Servers): www.supermicro.com
Each of these servers has 4 nodes, with each node having 2 low-profile half-length PCI-e x16 slots near the rear. For example, this is what each node on the Gigabyte H261-Z60 looks like:
https://static.gigabyte.com/Product/107/6666/2018061913461032_src.png
My question is: can I put one or two Tesla T4 cards in those rear PCI-e x16 slots? These are low-profile, passively cooled GPGPU cards with a TDP of 70W:
https://www.nvidia.com/content/dam/...ter/tesla-t4/t4-tensor-core-product-brief.pdf
In theory these cards should work there, as there is enough space and there are enough PCI-e lanes and power for them. My main concern is cooling: there should be reasonable airflow through each card, but the air will already be warm from having passed over two 170W CPUs.
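To put a rough number on that worry, here is a back-of-the-envelope sketch of how much the CPUs preheat the air before it reaches the rear slots. It assumes all of the CPU heat ends up in the air stream that passes the card, uses approximate constants for air, and the airflow values are my own guesses rather than vendor figures, so treat the output as an order-of-magnitude estimate only.

```python
# Rough estimate of air preheating: delta_T = P / (mass_flow * c_p).
# All airflow numbers below are ASSUMPTIONS, not Gigabyte/Supermicro/Cisco specs.

AIR_DENSITY_KG_M3 = 1.2             # approximate density of air near 20 °C
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0   # approximate specific heat of air
M3_PER_S_PER_CFM = 0.000471947      # 1 cubic foot per minute in m^3/s


def air_temp_rise_c(heat_w: float, airflow_cfm: float) -> float:
    """Temperature rise (°C) of an air stream that absorbs heat_w watts."""
    if airflow_cfm <= 0:
        raise ValueError("airflow_cfm must be positive")
    mass_flow_kg_s = airflow_cfm * M3_PER_S_PER_CFM * AIR_DENSITY_KG_M3
    return heat_w / (mass_flow_kg_s * AIR_SPECIFIC_HEAT_J_KG_K)


if __name__ == "__main__":
    cpu_heat_w = 2 * 170.0  # two 170 W CPUs per node, worst case at full load
    for cfm in (30.0, 50.0, 80.0):  # hypothetical per-node airflow values
        rise = air_temp_rise_c(cpu_heat_w, cfm)
        print(f"{cfm:5.0f} CFM -> air is about {rise:4.1f} °C warmer at the card")
```

With these assumptions, 340 W into 50 CFM comes out to roughly 12 °C of preheating, so the real answer depends heavily on how much air each node actually moves.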
For what it's worth, Gigabyte, Supermicro and Cisco are all careful to avoid any mention of using those slots for GPUs. I emailed Gigabyte and got the following predictable answer:
Dear Customer,
We cannot support using T4 inference card in our Computing designed system.
It is of course applicable but please understand we have not done any thermal testing and cannot provide support if issues arise related with the T4 card inside a H261-Z60.
Test at your own risk.
Best Regards,
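If it does come down to testing at my own risk, this is a minimal sketch of how I would log temperature, power draw and SM clock under load to spot overheating or clock drops. It relies on the `nvidia-smi --query-gpu` CSV output; the 85 °C warning threshold, the 5-second interval and the helper names are placeholders I picked for illustration, not Nvidia limits.

```python
import subprocess
import time

QUERY_FIELDS = "index,temperature.gpu,power.draw,clocks.sm"
WARN_TEMP_C = 85.0  # ASSUMED warning threshold, not an official T4 limit


def parse_smi_line(line: str) -> dict:
    """Parse one row of `nvidia-smi --format=csv,noheader,nounits` output."""
    index, temp_c, power_w, sm_mhz = [field.strip() for field in line.split(",")]

    def to_float(value: str):
        # nvidia-smi prints placeholders such as "[N/A]" for unsupported fields.
        try:
            return float(value)
        except ValueError:
            return None

    return {
        "index": int(index),
        "temp_c": to_float(temp_c),
        "power_w": to_float(power_w),
        "sm_mhz": to_float(sm_mhz),
    }


def read_gpus() -> list:
    """Query every GPU once and return a list of parsed samples."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return [parse_smi_line(row) for row in result.stdout.splitlines() if row.strip()]


def is_too_hot(sample: dict, limit_c: float = WARN_TEMP_C) -> bool:
    """True when a temperature reading exists and is at or above limit_c."""
    return sample["temp_c"] is not None and sample["temp_c"] >= limit_c


if __name__ == "__main__":
    while True:  # run alongside the real workload, stop with Ctrl+C
        for gpu in read_gpus():
            flag = "  <-- HOT" if is_too_hot(gpu) else ""
            print(f"GPU{gpu['index']}: {gpu['temp_c']} C, "
                  f"{gpu['power_w']} W, {gpu['sm_mhz']} MHz{flag}")
        time.sleep(5)
```

A falling SM clock at constant load while the temperature sits near the threshold would suggest the card is throttling in that slot.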
Can someone here tell me if this setup will work, albeit unsupported, or if there are real concerns about cooling or something else?
Also, more broadly, I wonder why the Tesla T4 isn't certified for these types of systems. Isn't that the entire point of it? Every deployment of the Tesla T4 that I've seen was in a chassis where bigger graphics cards would have worked just as well, such as:
https://www.anandtech.com/show/1361...-t4-a-supermicro-solution-with-320-pcie-lanes
What is the purpose of those Tesla T4 cards if not for use in dense servers?