[SOLVED] What Are The Major Differences Between Mellanox VPI And Mellanox Ethernet NICs?

Feb 12, 2022
Hello,

My company wanted to purchase a NIC with dual 100 GbE ports utilizing QSFP28 that must work with Windows 11 Pro.

From what I can see, Nvidia's Mellanox ConnectX-5 cards seem to meet my company's criteria.

However, the card that I was initially planning to purchase, the MCX516A-CDAT, is completely out of stock (and won't be in stock for months).

The following card (which seems similar), the MCX556A-EDAT, is currently in stock.

That said, what are the major differences between the two cards? It looks like the EDAT, which supports VPI, should work with both Ethernet and InfiniBand.

The CDAT, on the other hand, only works with Ethernet (plus it uses PCIe 3.0 x16).

It seems like the EDAT offers more functionality/throughput, for the same price, but I'm worried that I'm missing something (as Nvidia would never offer more functionality for the same price 😉 ).

Thank you,
Nelson
 
The only difference is exactly what you said: you can run Ethernet on the InfiniBand (VPI) card, whereas the Ethernet card can only do Ethernet. For dual-port 100GbE, make sure the card is PCIe 4.0 x16; if it isn't, you cannot get full bandwidth on both ports at once. PCIe 3.0 x16 gives roughly 128 Gb/s per direction, whereas PCIe 4.0 x16 gives roughly 256 Gb/s. Back with the ConnectX-2 and ConnectX-3 you needed to do a little work to change from InfiniBand to Ethernet operation, but it isn't very difficult: https://www.servethehome.com/change-mellanox-connectx3-vpi-cards-infiniband-ethernet-windows/
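
The bandwidth figures above can be checked with a little arithmetic. This is only a rough sketch: it assumes 8 GT/s per lane for PCIe 3.0, 16 GT/s per lane for PCIe 4.0, and 128b/130b line encoding for both, and it ignores packet overhead, so real-world throughput will be somewhat lower. The function names are made up for illustration.

```python
# Approximate one-direction PCIe bandwidth, to see whether a slot can feed
# every NIC port at line rate at the same time.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0}  # giga-transfers per second per lane
ENCODING_EFFICIENCY = 128 / 130       # 128b/130b encoding (PCIe 3.0 and 4.0)


def pcie_gbps(gen: int, lanes: int) -> float:
    """Approximate usable bandwidth in Gb/s, ignoring protocol overhead."""
    return PCIE_GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY


def can_feed(gen: int, lanes: int, ports: int, port_gbps: float) -> bool:
    """True if the slot can carry all ports at line rate simultaneously."""
    return pcie_gbps(gen, lanes) >= ports * port_gbps


if __name__ == "__main__":
    for gen in (3, 4):
        bandwidth = pcie_gbps(gen, 16)
        ok = can_feed(gen, 16, ports=2, port_gbps=100)
        print(f"PCIe {gen}.0 x16: {bandwidth:.1f} Gb/s, dual 100GbE at line rate: {ok}")
```

With these assumptions a PCIe 3.0 x16 slot falls short of the 200 Gb/s that two 100GbE ports would need, while a PCIe 4.0 x16 slot has headroom. A single port at 40Gb fits comfortably in either.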

The better question is why you need dual-port 100GbE on a workstation. This level of bandwidth is for TOR switches, hyperconverged storage, etc., not something you see on a desktop OS.
 
Solution
For background, this user has a 40Gb switch they are trying to connect to -- https://forums.tomshardware.com/thr...d-nic-mellanox-compatibility-question.3749824
I assume a single port at 40Gb will be used.
Ah, good to know, thank you. As someone who manages a data center that uses 100GbE TOR switches, I was confused as to why someone would need that much bandwidth on a desktop OS. I guess if you are using it as a test bed, that can work.

I would like to add some notes on that thread here, since the other thread is marked solved. There was some bad information in that thread that needs to be sorted out so that OP has the correct information.
  1. You can use a 100GbE or 200GbE NIC in a 40GbE environment, as the higher-speed ports are backwards compatible. I am running a couple of 100GbE ports with QSFP+ optics in them without issue, and I am also using a QSFP+ to 4x SFP+ breakout cable. The NIC should auto-negotiate based on the transceiver. If the switch were 100GbE connecting to a 40GbE card, you might have to set the port speed on the switch manually, but that usually takes a couple of seconds.
  2. Transceivers can be very expensive. If this is going to be less than 7 m (23 ft) away from the switch, use Direct Attached Copper (DAC) cables instead. You actually get slightly lower latency than fibre with transceivers, and DACs are much less expensive. The biggest thing to remember is that Arista might have vendor-locked the optics. Mellanox cards are not vendor-locked and can use any branded optics. So go to a vendor like fs.com and get DACs that are coded for Arista and you won't have any problems.
  3. The Intel X710 is not a good NIC in 2022. Even in 2017, when it was released, it wasn't very great. Intel has been behind the 8 ball in networking for almost a decade. First, the card is PCIe 3.0 x8, which will not allow full-bandwidth dual-port 40GbE operation (roughly 64 Gb/s available with 80 Gb/s needed). Second, that card doesn't come with a lot of newer features like RDMA over Converged Ethernet (RoCE). The Mellanox ConnectX-3 or later cards are all vastly superior to the Intel cards. In my DC I mainly have Mellanox ConnectX-4 Lx cards with dual-port 25GbE, connecting via 100GbE to 4x 25GbE DAC breakout cables. I have never had an issue with any of the Mellanox cards, whereas I have had issues with some of the Broadcom 57414 cards.
 
There are many people on this board who would not know that "TOR" stands for Top Of Rack ...
If you want to talk expensive: genuine Cisco 100GE single-mode optics! The 9500 chassis is relatively inexpensive until you put optics in it ....
I also recommended the Intel NIC to this user in another thread.
 
Anything Cisco ends up being a rip-off. I highly recommend the Mellanox (now NVIDIA Networking) switches and cards. You do the initial setup via the CLI, and then any small management tasks can be done with a very easy-to-use GUI (if you use their Onyx OS). You will probably end up spending much less overall on all components and have something of equal performance. One thing people don't realize is that Mellanox was the only three-level manufacturer of networking equipment: they built the systems-on-chip for the logic, sold their own branded products, and white-labeled their products for companies like Dell to resell. When you buy a data center switch, odds are the internals are going to be either Broadcom or Mellanox. There are a few other smaller players, but not many.
 
We have gotten this thread off topic. I will stop here.
 
Hello JeremyJ_83,

You've been a huge help, and have answered all my questions.

Just to confirm:

1. With the VPI card, I get both InfiniBand and Ethernet (which is a bonus). If I need to switch to Ethernet, I simply switch the card to Ethernet in Device Manager (as detailed in the following guide):

https://www.servethehome.com/change-mellanox-connectx3-vpi-cards-infiniband-ethernet-windows/

2. The Mellanox VPI card (once switched to Ethernet) can work with QSFP+ transceivers (at a max of 40 GbE, as my Arista switch has 6x 40 GbE ports).

3. If I want to switch to higher speeds (in the future), all I need to do is put in some new QSFP28 transceivers (and switch the card back to InfiniBand, once my company purchases a new switch).

If my understanding is correct for items 1 - 3, then I'll stick with the Mellanox ConnectX-5 InfiniBand/VPI card.

A huge thanks for all your help,
Nelson
 
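All of those are correct, with one note on #3: you do not need to switch the card back to InfiniBand to get higher speeds, since it runs 100GbE in Ethernet mode. Odds are you won't be getting an InfiniBand switch (InfiniBand is mainly used in HPC and supercomputers), so I would keep it on Ethernet.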
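
The linked guide covers the ConnectX-3. On ConnectX-4 and later cards, the port protocol is, as far as I know, normally changed with the `mlxconfig` tool from the Mellanox Firmware Tools (MFT) package, using the `LINK_TYPE_P1` / `LINK_TYPE_P2` parameters (1 = InfiniBand, 2 = Ethernet). The sketch below only builds the command line and does not run anything. The device name is a hypothetical placeholder (`mst status` lists the real one on your system), and the helper function is made up for illustration.

```python
import shlex

# Values accepted by mlxconfig's LINK_TYPE_P<n> parameters.
LINK_TYPE = {"ib": 1, "eth": 2}


def build_link_type_command(device: str, port_modes: dict) -> list:
    """Return an mlxconfig argument list that sets each port's protocol.

    port_modes maps a port number (1 or 2) to "ib" or "eth".
    """
    args = ["mlxconfig", "-d", device, "set"]
    for port in sorted(port_modes):
        mode = port_modes[port].lower()
        if mode not in LINK_TYPE:
            raise ValueError(f"unknown mode {mode!r}; expected 'ib' or 'eth'")
        args.append(f"LINK_TYPE_P{port}={LINK_TYPE[mode]}")
    return args


if __name__ == "__main__":
    # Hypothetical device name; substitute the one reported by `mst status`.
    command = build_link_type_command("mt4121_pciconf0", {1: "eth", 2: "eth"})
    print(shlex.join(command))
```

The printed command would be run from an elevated prompt, and the new port type typically takes effect only after a reboot or firmware reset. Check the MFT documentation for your card before applying it.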
 
Hello JeremyJ_83,

I read a few articles online that seemed to contradict one of your statements.

I wanted to verify with you if you thought the article below was correct or not.

https://medium.com/@julydd/difference-between-qsfp-qsfp-qsfp28-8df90f3b69a0

You had mentioned that QSFP+ transceivers can be plugged into QSFP28 ports (e.g., plugging a QSFP+ transceiver into the port of the Mellanox NIC so that the NIC can connect to my Arista switch, which at most supports QSFP+), yet the article above states the following:

"Usually QSFP28 modules can’t break out into 10G links. But it’s another case to insert a QSFP28 module into a QSFP+ port if switches support. At this situation, a QSFP28 can break out into 4x10G like a QSFP+ transceiver module. One thing to note is that you can’t put a QSFP+ transceiver into a QSFP28 port to avoid destroying your optics."

I could've sworn you mentioned that QSFP+ transceivers can be plugged into a QSFP28 port (on the Mellanox NIC), but the article above seems to contradict this.

Is this correct (as this would be a big deal for my company - as we don't yet have a switch capable of handling QSFP28 optics - only QSFP+)?

Thank you,
Nelson
 
That Medium article is incorrect or poorly worded. You can plug QSFP+ optics into a QSFP28 port and it will run at 40Gb, though the speed might need to be set on the switch. However, you cannot plug a QSFP28 module into a QSFP+ port. This is exactly the same as an SFP+ port taking an SFP module, while an SFP port will not take an SFP+ module.
https://www.qsfptek.com/article/sfp-sfp-plus-sfp28-qsfp-qsfp28-compatibility

I personally have some QSFP+ DAC breakout cables plugged into my Mellanox SN2100 switch. The 100Gb port was set to run in 4x 25GbE mode, and each lane is set to run at 10Gb. Our servers all have 25Gb NICs, but the router and NAS only have 10Gb NICs. The 10Gb links work without issue.
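
The rule of thumb above (a slower module works in a faster port of the same family, but not the other way around) can be written down as a small lookup. This is only a sketch of that rule, not a guarantee for any particular NIC or switch: actual support depends on firmware and vendor coding, and the table and function names are made up for illustration.

```python
from typing import Optional

# Nominal data rate of each pluggable form factor, in Gb/s.
FORM_FACTOR_GBPS = {"SFP": 1, "SFP+": 10, "SFP28": 25, "QSFP+": 40, "QSFP28": 100}

# SFP and QSFP cages are physically different, so modules never cross families.
FAMILY = {"SFP": "SFP", "SFP+": "SFP", "SFP28": "SFP", "QSFP+": "QSFP", "QSFP28": "QSFP"}


def link_speed(port: str, module: str) -> Optional[int]:
    """Expected link speed in Gb/s, or None if the module does not suit the port."""
    if FAMILY[port] != FAMILY[module]:
        return None  # wrong cage size
    if FORM_FACTOR_GBPS[module] > FORM_FACTOR_GBPS[port]:
        return None  # a faster module in a slower port is not supported
    return FORM_FACTOR_GBPS[module]  # the link runs at the module's rate


if __name__ == "__main__":
    print(link_speed("QSFP28", "QSFP+"))  # QSFP+ optic in the NIC's QSFP28 port
    print(link_speed("QSFP+", "QSFP28"))  # QSFP28 optic in a QSFP+ switch port
```

For the setup in this thread, the first call is the relevant one: a QSFP+ module in the NIC's QSFP28 port, linking at 40Gb to the QSFP+ port on the switch.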
 
Hello JeremyJ_83,

So just to recap; my setup is the following.

1. I have an Arista switch with QSFP+ ports.

2. I am building a new system which will have the Mellanox VPI card (with QSFP28 ports).

3. From what you've said, can I assume that in order to connect the Mellanox to my Arista switch, I can simply plug a QSFP+ transceiver into the Mellanox VPI card (making sure to first switch the Mellanox to Ethernet)?

4. And then connect the Mellanox to the Arista switch (as both will now be using QSFP+ transceivers), and everything should work?

If so, I just wanted to be absolutely certain (as I'm not a network engineer, really just a QA engineer) and no one in my company knows for certain whether this will work.

In fact, you're pretty much one of two people I could find online (after posting on multiple forums) who has provided a definitive answer.

If this works and my understanding is correct, I can't thank you enough.

If this doesn't work, I still thank you for all your help, but will likely need to search for an alternative solution.

Regards,
Nelson
 
Your steps are correct. Be sure that the transceivers you get are coded for Arista, though, as Arista might vendor-lock. Mellanox did not vendor-lock before NVIDIA bought them, and I believe they still do not, since they are all about Open Ethernet. Do remember that if the distance between switch and NIC is going to be less than 7 m (23 ft), it will be cheaper and easier to get Direct Attached Copper (DAC) cables. I have had very good luck with fs.com (https://www.fs.com/c/40g-qsfp-dac-1117) DACs in my data center, and they do allow for vendor encoding.

When your company eventually goes to either 100GbE or 200GbE, I highly recommend the Mellanox switches running their Onyx OS. Onyx is the best switch GUI I have used so far. The only things I had to do on the CLI (which uses Cisco-style commands) were the initial setup, running the command to split our 100GbE ports into 4x 25GbE, and setting up RoCE/lossless Ethernet (used for 25GbE iSCSI hyper-converged storage). Everything else is fast and easy from the GUI.
 
Hello JeremyJ_83,

That's great news to hear that the article was incorrect!

Also, will do on the switch. If I can push my company to a Mellanox switch, I will (though that upgrade is likely a few years away).

Thank you for all your help,
Nelson