
[SOLVED] Understanding PCIe Lanes and Performance

Hello, I've tried googling around for this answer and specifically checking these forums for information on my situation. The results have been at best conflicting and at worst unhelpful. Specs: i9-9940X, Gigabyte X299 Designare EX, Gigabyte RTX2080 Ti, Blackmagic Design Decklink Quad 2, Sonnet 10G network card.

These machines (we have 4 of them) are used as local video servers in our video production studio. We use software that handles video transport to and from different cast members, clients, etc. I'm always looking for ways to maximize their performance, though I don't venture into overclocking: it's unfamiliar territory for me and I can't risk my career on learning it. Stability is more important than raw power.

In the past we have had situations where our production software bogs down and starts to behave badly. While researching something unrelated, I learned that our GPUs appear to be running at x8 instead of x16. I don't have a thorough understanding of how PCIe lanes work, but I do know that cards like the DeckLink and the 10G network card will demand some of those lanes. However, I was also under the impression that I could divide those cards between the appropriate PCIe slots on the motherboard to get maximum performance from each. I had also read that HWiNFO may only display the current link state and that it might increase under load. I tried to simulate that and saw no change. I know this CPU supports 44 lanes and that some of those are reserved for SATA and the like... but if that's the case, what is the point of a motherboard that purportedly supports two PCIe 3.0 x16 slots? Shouldn't my GPU in one slot be running at x16 and the DeckLink Quad (which is x8) be running at x8 in the other 3.0 x16 slot? Do I have a fundamental misunderstanding of how all this works?

I have also read that there would be no noticeable difference between a GPU running at x8 vs. x16, which may well be true for gaming, but would that also be true in my use case of encoding/decoding multiple video streams simultaneously?

I know there is a lot here, and I kinda word-vomited my confusion at you all... but I would really like to understand how all this works and what is going on here. Thanks!
 
https://en.wikipedia.org/wiki/PCI_Express#PCI_Express_3.0
PCI Express 3.0's 8 GT/s bit rate effectively delivers 985 MB/s per lane.
At x8 that is 8 lanes, for a total of about 7.9 GB per second. If you have a drive array capable of those speeds and your video files are always defragmented on your storage (the only way a drive can deliver its maximum speed), then maybe your performance would take a hit.
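The per-lane figure follows from the 8 GT/s signalling rate combined with PCIe 3.0's 128b/130b encoding. A quick sketch of that arithmetic (the function name is only for illustration):

```python
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding
# (128 payload bits for every 130 bits on the wire).
GT_PER_S = 8.0
ENCODING = 128 / 130

def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    bits_per_s = GT_PER_S * 1e9 * ENCODING * lanes
    return bits_per_s / 8 / 1e9  # bits -> bytes, then to GB

for lanes in (1, 4, 8, 16):
    print(f"x{lanes}: {pcie3_bandwidth_gb_s(lanes):.2f} GB/s")
```

This gives roughly 0.98 GB/s for x1, 7.88 GB/s for x8, and 15.75 GB/s for x16, before any protocol overhead.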
 
It looks like your board uses the CPU lanes in PCIe slots 1, 3, and 5. If all three are populated, 1 and 5 become x8 slots.

As for whether PCIe bandwidth matters, it depends on how much data is being moved in and out of the card and how fast the storage drives can deliver that data.
 
Thanks for the info. We're not at that level of transfer. In our application, we decode and re-encode for broadcast, so a hard drive array wouldn't even be a bottleneck for us; the internet itself would be. But even with our two gigabit pipes at the office, we're many multiples short of reaching 7.8 GB/sec.
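To put "multiples shy" into numbers, here is a rough comparison. It assumes the two internet links are 1 Gb/s each and compares them only against the theoretical PCIe 3.0 x8 figure; the variable names are made up for this sketch:

```python
# Assumed figures: two 1 Gb/s internet links versus a PCIe 3.0 x8 link.
INTERNET_GBIT_S = 2 * 1.0
PCIE3_X8_GBYTE_S = 8 * 8.0 * (128 / 130) / 8  # about 7.88 GB/s

internet_gbyte_s = INTERNET_GBIT_S / 8  # gigabits -> gigabytes
headroom = PCIE3_X8_GBYTE_S / internet_gbyte_s

print(f"internet: {internet_gbyte_s:.2f} GB/s")
print(f"PCIe 3.0 x8: {PCIE3_X8_GBYTE_S:.2f} GB/s")
print(f"x8 link is about {headroom:.0f}x the combined internet rate")
```

Under those assumptions the x8 link has a bit over 30 times the bandwidth of both internet connections combined.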
 
Ok, so this is along the lines of what I thought. Where do the lanes for slots 2 and 4 come from then? Shouldn't having only a GPU in slot 1 and the other cards in slots 2 and 4 allow slot 1 to run at x16? Thanks for responding.

EDIT: I didn't completely understand the chart in the manual. If three cards are detected in slots 1, 3, and 5, they will operate at 8/16/8 no matter which card is where. My GPU is in slot 1, which is why it was running at x8. I moved the network card from slot 5 up to slot 4 (which, based on my understanding, doesn't take lanes from the CPU), and now the GPU in slot 1 and the DeckLink card in slot 3 are running at 16/16. I know I'm not really explaining anything new to the people who helped me with info, but I figured it might help somebody down the line.
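The slot behaviour described in this thread can be written down as a small lookup. This is only a model of what was reported here, not something taken from the manual, and the function name is hypothetical:

```python
# Hypothetical model of the lane split described in this thread for the
# X299 Designare EX. Check the manual's table before relying on it.
CPU_SLOTS = (1, 3, 5)  # slots fed by CPU lanes, per the thread

def cpu_slot_widths(populated: set[int]) -> dict[int, int]:
    """Return the link width of each populated CPU-fed slot."""
    used = [s for s in CPU_SLOTS if s in populated]
    if len(used) == 3:
        # All three CPU slots in use: slots 1 and 5 drop to x8.
        return {1: 8, 3: 16, 5: 8}
    # Assumption: with fewer than three CPU slots in use, each runs at x16.
    # The thread only confirms this for slots 1 and 3 together.
    return {s: 16 for s in used}

print(cpu_slot_widths({1, 3, 5}))  # original layout: GPU in slot 1 at x8
print(cpu_slot_widths({1, 3, 4}))  # NIC moved to chipset-fed slot 4
```

Slots 2 and 4 never appear in the result because, per the thread, they are fed by the chipset rather than by CPU lanes.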
 
If it doesn't come from the CPU, it comes from the chipset, which ultimately talks to the CPU at x4 speeds. This may not be a problem unless the peripheral hammers the CPU or RAM.

But it looks like the PCIe slot the video card is in is now running at its full width. Alternatively, you could put the video card in slot 3 and the other cards in slots 1 and 5.
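As a rough sanity check on whether a 10G network card fits behind the chipset, the sketch below assumes the chipset uplink behaves like a PCIe 3.0 x4 link and that the card runs at its full 10 Gb/s line rate in one direction:

```python
# Assumptions: the chipset uplink is equivalent to PCIe 3.0 x4, and the
# NIC is driven at its full 10 Gb/s line rate in one direction.
UPLINK_LANES = 4
uplink_gb_s = UPLINK_LANES * 8.0 * (128 / 130) / 8  # about 3.94 GB/s
nic_gb_s = 10 / 8                                   # 10 Gb/s -> 1.25 GB/s

share = nic_gb_s / uplink_gb_s
print(f"uplink {uplink_gb_s:.2f} GB/s, NIC {nic_gb_s:.2f} GB/s, share {share:.0%}")
```

On those numbers the card would use about a third of the uplink at most. Keep in mind that the uplink is shared with everything else hanging off the chipset, such as SATA and USB devices.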
 
Yes, I understand now. Thanks! Ultimately, it's easier to just move the 10G network card to the slot that uses the chipset. My next question, if you don't mind: I now have a situation where one card (2080 Ti Turbo OC) is running at x16 (@ 2.5 GT/s) and another (2080 Ti Gaming OC) is running at x16 (@ 8 GT/s). What accounts for that difference? I'm using HWiNFO to get this information... does the card just need to be under load in order to run at 8 GT/s?
 
Yes. To save power, video cards drop to a slower PCIe link speed when they're idle, and go back up to full speed under load.
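If you'd rather check this from a script than from HWiNFO, something like the following should work on NVIDIA cards. It is a sketch that assumes nvidia-smi is on the PATH and prints the PCIe generation as a plain integer; the helper names are made up for the example:

```python
import subprocess

# PCIe generation -> per-lane signalling rate in GT/s.
GEN_TO_GT_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0}

QUERY = "name,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current"

def parse_links(csv_text: str) -> list[dict]:
    """Parse 'name, gen_now, gen_max, width' lines into dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        # rsplit keeps the name intact even if it contains a comma.
        name, gen_now, gen_max, width = [f.strip() for f in line.rsplit(",", 3)]
        rows.append({
            "name": name,
            "width": int(width),
            "gt_s_now": GEN_TO_GT_S[int(gen_now)],
            "gt_s_max": GEN_TO_GT_S[int(gen_max)],
            # A link below its maximum generation is usually just idling.
            "idle_downclock": int(gen_now) < int(gen_max),
        })
    return rows

def query_gpu_links() -> list[dict]:
    """Ask nvidia-smi for the current link state of every GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_links(out)
```

Call `query_gpu_links()` once while the machine is idle and again while it is encoding. A card that shows 2.5 GT/s at idle and 8 GT/s under load is simply saving power, not misconfigured.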