Hot Chips 2017: We'll See PCIe 4.0 This Year, PCIe 5.0 In 2019

Looking at what AMD has been doing, I wouldn't be surprised if AMD skipped 4.0 and jumped right to 5.0. Similar to how they are jumping from 14nm to 7nm and ignoring 10nm altogether. But we will know for sure when we see what Navi and Ryzen 2 (possibly Ryzen 3) come with.

On a side note, I'm hoping AMD updates their roadmap; I haven't seen anything about what comes after Navi.
 

Rob1C

Distinguished
Jun 2, 2016
I don't know, and search engines provide no source saying that AMD (or Intel) is "waiting"; they simply don't have PCIe 4.0 now. A reasonable speculation is that they will wait for the spec to be ratified at revision 1.0, and decide after that. So: no announced decision yet.

IBM POWER9 has it already, along with NVLink 2.0 (for Volta); there's no way Intel wants to be caught with its pants down again.

With AMD's huge number of PCIe lanes and a limited number of slots on most motherboards, those lanes have to go somewhere.

The IP is available: https://www.synopsys.com/designware-ip/interface-ip/pci-express.html .

The Hot Chips conference even had a demo of 5.0: [video="https://www.youtube.com/watch?v=J65LypX2cQM"][/video]

It's possible for 5.0 to be available by year's end: just buy the IP and send it to your fab. It's not like 5.0 is a big leap from 4.0; it's designed not to be a leap.

The SoC or bridge has to provide it first, so that others will make 4.0 cards to plug into it; manufacturers aren't going to start making 5.0 SSDs and graphics cards with nothing to plug them into.

AMD and Intel will likely be on board by early next year. With AMD's 7nm Epyc scheduled for early next year, does it make sense to have lanes and Fabric but nothing to blow their nose on?

It's coming, naysayers!
 
Guest
It took 7 years to go from 32GB/s to 64GB/s, and two years later we are going to have 128GB/s? I'm skeptical.
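For context, here's where those numbers come from (a quick sketch; per-spec transfer rates and line encodings, x16 links, counting both directions):

[code]
# Where the 32/64/128 GB/s figures come from: published per-lane
# transfer rates and line encodings, x16 link, both directions.
GENS = {
    # generation: (GT/s per lane, encoding efficiency)
    "1.0": (2.5, 8 / 10),     # 8b/10b
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),  # 128b/130b
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
}

def link_bandwidth_gbs(gen, lanes=16, bidirectional=True):
    """Usable bandwidth in GB/s for a PCIe link."""
    rate, efficiency = GENS[gen]
    per_lane = rate * efficiency / 8  # GT/s -> GB/s, one direction
    total = per_lane * lanes
    return total * 2 if bidirectional else total

for gen in GENS:
    print("PCIe %s x16: %.0f GB/s bidirectional" % (gen, link_bandwidth_gbs(gen)))
# 3.0 -> ~32 GB/s, 4.0 -> ~63 GB/s, 5.0 -> ~126 GB/s
[/code]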
 

bit_user

Titan
Ambassador

NVMe only happened in the last year or so. Plus, the cloud wants multi-GPU deep learning and > 100 Gbit networking.

If this is what it takes to head off more proprietary alternatives like NVLink, it definitely has my support.
 

Rob1C

Distinguished
Jun 2, 2016
Andy Chow said:
It took 7 years to go from 32GB/s to 64GB/s, and two years later we are going to have 128GB/s? I'm skeptical

Here is a Motherboard with one PCIe x32 Slot: http://www.supermicro.com/products/motherboard/Xeon/C600/X9DRW-CTF31.cfm .

If we had PCIe 4.0, that x32 Slot could be x16 instead and have the same bandwidth.

The Samsung 960 PRO uses x4, and it would be easy to stick 8 of those on a PCIe Card. You'd need that x32 Slot for a Card like that.

A dual Volta would benefit from the bandwidth.

Latency is going to drop on a PCIe 5.0 Slot even if the data were only coming out at 3.0 rates; seek times could be more than 3x faster.

Five PCIe 5.0 Slots at x8 would only be 40 Lanes, leaving some spares on some CPUs.
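Putting the lane math in one place (a rough sketch; from Gen3 onward the encoding stays 128b/130b and the transfer rate doubles each generation, so per-lane bandwidth doubles exactly):

[code]
# How many lanes of a newer generation match a Gen3 link's bandwidth.
import math

SPEEDUP_VS_GEN3 = {"3.0": 1, "4.0": 2, "5.0": 4}

def equivalent_lanes(gen3_lanes, target_gen):
    """Lanes of target_gen needed to match a Gen3 link of gen3_lanes."""
    return max(1, math.ceil(gen3_lanes / SPEEDUP_VS_GEN3[target_gen]))

print(equivalent_lanes(32, "4.0"))     # that x32 Gen3 slot -> x16 at Gen4
print(equivalent_lanes(4, "5.0"))      # one Gen3 x4 NVMe SSD -> x1 at Gen5
print(equivalent_lanes(8 * 4, "3.0"))  # eight x4 960 PROs -> an x32 Gen3 slot
[/code]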

There aren't a lot of use cases to offer you yet, this being a "chicken and the egg" problem (for which we know the answer).

 

bit_user

Titan
Ambassador

Keep in mind there might be some additional lag, before it shows up in desktops & consumer-oriented GPUs.
 
While it's nice to dream about this massive new increase in bandwidth/operations on an 'individual' basis, the most practical applications are in the AMD 'UMI' and Intel 'DMI' (or whatever they call the platform 'interlinks' these days ...) on the desktop.

It seems to me (and I can always swing and miss) that using PCI Express 3.0 instead of PCI Express 2.0 "lanes" has only yielded a significant performance gain in limited areas over the last 12 months or so (as B-U pointed out with NVMe and NVLink), and I would tend to be skeptical overall; any system performance hurdles or bottlenecks would likely just move somewhere else in the 'big picture' ... especially on the consumer side of things.

BUT, as we all know, Gen 5.0 is always GREATER than Gen 4.0, or 3.0 or 2.0, or 1.0 ... They're obsolete!

:lol:

 

msroadkill612

Distinguished
Jan 31, 2009
"So you wanna double bandwidth huh?

No problem, we simply reduce trace lengths & ...

No, no, we can't be bothered discussing that ...

Ah, well that's different."

I dunno at what point, but eventually the laws of physics must come into play; hence the delay, I suspect.

In AMD's Fabric, what we have seen to date is a miniaturised sub-bus linking key high-bandwidth (HB) resources independently of PCIe. The traces we see are a fraction of the length the PCIe bus must work with on a mobo. The MCM/interposer has the footprint of a deck of cards, yet could stretch to HBM2 RAM, NVMe, multiple GPUs and multiple CPUs on one Epyc-size MCM. System memory is the only important HB resource missing.

Fabric's most advanced manifestation is on the Radeon Vega Pro SSG GPU card, where Fabric hosts the GPU, the HBM2 GPU cache/RAM and 2 TB of RAID NVMe SSDs, all interacting independently of PCIe, and much faster. The imminent Zen/Vega APUs will take the further fundamental step of combining a CPU & GPU on a Fabric bus.
I think a mobo with PCIe 4.0 will similarly have to adapt to much shorter traces for the faster links.
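A back-of-the-envelope sketch of why (assuming FR-4 board material with relative permittivity around 4, and glossing over the full loss budget):

[code]
# Why faster links want shorter traces: the Nyquist frequency is half
# the transfer rate, and the signal wavelength in FR-4 shrinks with it,
# tightening loss and skew budgets each generation.
C = 3.0e8      # speed of light, m/s
ER_FR4 = 4.0   # rough relative permittivity of FR-4

for gen, gt in [("3.0", 8), ("4.0", 16), ("5.0", 32)]:
    nyquist_hz = gt / 2 * 1e9
    wavelength_mm = C / (ER_FR4 ** 0.5) / nyquist_hz * 1000
    print("PCIe %s: Nyquist %2.0f GHz, wavelength in FR-4 ~%2.0f mm"
          % (gen, nyquist_hz / 1e9, wavelength_mm))
# 3.0: 4 GHz / ~38 mm;  4.0: 8 GHz / ~19 mm;  5.0: 16 GHz / ~9 mm
[/code]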

If the system bus does become balkanised, as some here suggest (NVLink/Fabric etc.), then Intel/Nvidia would be playing a dangerous game with closed-off environments, as only AMD can bring both of the vital CPU & GPU processors to its Fabric.

Arguably, AMD is offering the benefits of PCIe 4.0 now: at many price points, AMD offers 2x+ the PCIe 3.0 lanes, i.e. 2x+ the link bandwidth.

NB, btw: Fabric is PCIe 3.0 compatible. PCIe 3.0 devices can connect (e.g. the SSDs on the above Vega SSG card are stock PCIe 3.0 Panasonic SSDs), so AMD is fine with it, just not beholden to it.

As others say, Intel just wants to give you the same bandwidth using fewer of its meagre lane allocations. NVMe SSDs would still be ~4 GB/s max, but would use two lanes, not four.
 

modeonoff

Distinguished
Jul 16, 2017
I will need to build a powerful multi-GPU machine for AI research within a few months. Probably can't wait for a year. Given that PCIe 4.0 products could arrive this Fall and 5.0 products a year later, when would be the best time to buy the components to build the machine?
 

bit_user

Titan
Ambassador

The only thing that could possibly change by then is the introduction of Volta consumer cards. I haven't checked the rumor mill on the latest, but it shouldn't be hard to get a couple months' notice before they launch, given all the moving parts involved in getting such products to market. Tom's doesn't trade (much) in rumors, so you'll have to look to other sites.


No, I don't expect the next generation of Volta cards to support it. AMD and Intel both launched new server and workstation platforms this year, neither of which supports PCIe 4.0. You can be reasonably certain none of the components you need will have PCIe 4.0 alternatives on the market within the next year or so.

Given that AMD relies on PCIe for inter-GPU connectivity, I think they might be the first to market with PCIe 4 support. But, if you're doing GPU-based AI research, it's probably not with AMD hardware. And we're still talking about sometime next year - at the earliest - when Navi hits the market.

In short: wait until you actually need it, then buy the fastest HW that will fit your budget. If you were building it today, I'd probably go for the GTX 1080 Ti, unless you really need more GPU memory, in which case you'll probably have to go all the way up to the Quadro P6000. See:

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units
 

modeonoff

Distinguished
Jul 16, 2017
Thanks. I will need hardware that supports CUDA, which means Nvidia GPUs. Yes, the best option now is the GTX 1080 Ti. But if I buy 1-4 of these now and PCIe 4.0 motherboards and GPUs come out by Jan or Feb, I'd waste my investment, since by waiting a few months the interface bandwidth would be doubled. The problem is that it's hard to predict whether such PCIe 4.0 components will be out within the next five months.
 

bit_user

Titan
Ambassador

Like I said, the motherboard picture is the easy part. Intel just launched LGA 2066 and (officially) LGA 3647. AMD just launched Socket SP3 and Socket SP3r2. It would be highly unprecedented for either of them to revise these within a year (or even two). No one can promise it won't happen, and you can plumb the rumor mill to see if there's any sign, but you can be reasonably assured it won't.

As for GPUs, it sounds like you want to try for Volta. Just wait until the Volta cards launch, if you can. I think it's highly unlikely they'll be PCIe 4.0, but you'll just have to take what's on offer.
 

bit_user

Titan
Ambassador
BTW, if you want to scale up to 4 GPUs, then you'll probably want to look at either a dual-CPU LGA 3647 (Intel) setup or an SP3 (AMD) setup. These are the only options that can deliver full x16 throughput to 4 GPUs. On a budget, I'd go with Epyc (AMD).
 

modeonoff

Distinguished
Jul 16, 2017
The Xeon Phi CPUs supported by LGA 3647 alone cost thousands of dollars; that's out of my budget. The Epyc CPUs supported by SP3 could also be that expensive. Besides, the clocks are 2-3 GHz, which seems a bit slow. I guess a CPU with about 10 cores would be sufficient; over 20 may be overkill, at least for present applications.

When PCIe 4.0 motherboards are out, running GPUs at x16/x16/x16/x16 will be easier, I guess.
 

bit_user

Titan
Ambassador

Xeon Phi doesn't have the PCIe lanes you need. You would need 2x or more of the Scalable Processor series Xeons, as they each provide only 48 lanes (and some of those will be needed for storage, etc.).

https://ark.intel.com/products/series/125191/Intel-Xeon-Scalable-Processors


Not necessarily. You can get 8 cores & 128 PCIe 3.0 lanes for as little as $475 (list).

https://www.anandtech.com/show/11544/intel-skylake-ep-vs-amd-epyc-7000-cpu-battle-of-the-decade/3

That's probably even cheaper than the motherboard.

https://www.newegg.com/Product/Product.aspx?Item=N82E16813145034
...except don't use this board, because it will only fit 3 x16 dual-width cards. If you could settle for only 3 GPUs, then Threadripper would be a much more sensible choice.
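A quick lane-budget sketch of why those are the options (per-platform CPU lane totals from the launch specs; the overhead reservation for storage/networking is just an illustrative guess):

[code]
# Rough lane budget for a 4-GPU deep-learning box. Lane counts are
# per-platform CPU totals; OVERHEAD is a hypothetical reservation
# for storage, networking, etc.
PLATFORMS = {
    "Xeon Scalable, 1 socket": 48,
    "Xeon Scalable, 2 sockets": 96,
    "Threadripper": 60,
    "Epyc, 1 socket": 128,
}

GPUS, LANES_PER_GPU, OVERHEAD = 4, 16, 8

for name, lanes in PLATFORMS.items():
    fits = lanes - OVERHEAD >= GPUS * LANES_PER_GPU
    print("%-26s %3d lanes -> 4 x16 GPUs: %s"
          % (name, lanes, "yes" if fits else "no"))
# Only dual Xeon SP or Epyc clears 4 x16; Threadripper tops out at 3.
[/code]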

Don't forget to budget for 8x DIMMs. A dual-Xeon SP system could use up to 12x (but can run with less).


For deep learning research, the GPUs should be doing most of the work. All the CPU cores really have to do is feed the beasts.


No, probably not. When PCIe 3.0 hit, I don't recall seeing motherboards that would mux 2x the PCIe 2.0 lanes into each PCIe 3.0 lane. Besides, wouldn't you want to pair PCIe 4.0 GPUs with a PCIe 4.0 CPU?
 