AMD CPUs, SoC Rumors and Speculations Temp. thread 2



Heh, not any time soon (especially when you consider one of his later points was all about VR being a big focus). That said, developing their graphics products basically applies to almost all of AMD's future plans, given that the game consoles (likely including the rumored new Nintendo product) are all APUs, as are the majority of their desktop offerings, *and* they have servers coming up. So even if you take dGPU out of the equation, you still need good (and critically, as efficient as possible) graphics IP for all the integrated products.
 


Here you have an FPGA with HMC

[Image: EX-800_Board_Image.jpg]


Here you have a supercomputer board with HMC

[Image: post-fx10-cpu-memory-board_tcm100-958139.jpg]


The black chips are the HMC modules. There are eight modules per CPU. Note there are no DIMMs for DRAM: HMC has been designed for main memory, unlike HBM.

HBM is the JEDEC standard that replaces GDDR5. HMC is not a JEDEC standard, but that is rather irrelevant to me, because it is backed by the HMC Consortium, which includes the biggest companies in the industry: ARM, Intel, IBM, Google, Samsung... There are about 100x more companies adopting HMC than companies adopting HBM.

HBM is poorer than HMC on every technical metric: HMC provides more bandwidth, better latency, more capacity, better expansion, RAS features... The only point where HBM wins is cost, because the spec is simpler.

HMC is aimed at both CPUs and GPUs. In fact, Intel is using it on the KNL Phi, which will be used to run GPU-like workloads.
 


That is the reason Nolan/Amur were canceled, Skybridge was canceled, and K12 was delayed by one year...



They will not go away tomorrow!
 


Cost is king. HPC can afford the luxury of HMC. It's called picking the right tool for the job, which AMD has done, and it's why AMD has consumer products on the shelves today. HMC is available in sample quantities on development kits and prototypes in the $20,000+ range. Volume products are not expected until the 2017/2018 time frame.

You can go to Micron today and see it is just sampling. You need to sign an NDA even to get the datasheet, and that is for HMC 1.0. http://www.micron.com/products/hybrid-memory-cube/short-reach-hmc/4GB#/
 


I was about to point out the "prototype" marker in the picture as well. Beat me to it, Cazalan.

But let's not get any deeper into HMC vs HBM. They're intended for different markets and I don't see them competing 1:1 anytime soon. I will stand by my words: being technically superior is not indicative of success.

Cheers! 😛
 


I'd also add that if you look at the comparison I posted a little earlier, HBM provides *more bandwidth* (albeit slightly) than HMC (that is a comparison of HMC v1 vs HBM v1). From what I've read, HMC 2 doesn't surpass HBM 2, and as I've said before, since no *actually shipping products* even feature HMC, you can't really use it to discredit HBM v1 or v2. It's silly. Yes, there are many new things in development that are faster / better / *insert metric of choice here* than things available right now; that's how it should be.
 


Cost depends on volume. You cannot compare current HMC pricing for HPC products or FPGAs with the pricing Micron would give you if you ordered 10x more modules for a higher-volume product.

I agree that AMD has selected an acceptable tool for 'today'. The problem I mentioned is that AMD will be at a clear disadvantage 'tomorrow', when the hundred companies that have adopted HMC start using it for CPUs (some already have) and AMD has nothing competitive.



No. The marker says "Post FX-10 prototype". The term prototype refers to the supercomputer that Fujitsu is developing to replace the FX-10. The CPUs and HMC modules on the board are not prototypes. Moreover, that picture was taken two years ago at a stand at the SC13 conference.

To illustrate the confusion around the term "prototype", consider the ARM prototype that a famous supercomputer center here in Spain is building. You can see they use the term prototype to refer to the architecture:

[Image: Mont-Blanc-system.jpg]


But the CPUs, GPUs, and memory they are using are not prototypes but products you can purchase. Get it?
 


Since when is 128 bigger than 160?

[Image: dram_table_med.png]


http://www.techdesignforums.com/practice/technique/mobile-networking-dram-interfaces/
 


I have info from a different source that shows otherwise:
http://www.extremetech.com/computing/197720-beyond-ddr4-understand-the-differences-between-wide-io-hbm-and-hybrid-memory-cube/2

[Image: DRAMs-640x801.jpg]


HMC max bandwidth is listed as 240GB/s, while HBM is listed as up to 256GB/s...?
 


That's another misnomer, because HMC controller IP can be bought off the shelf, if it isn't already covered under AMD's relationship with Synopsys. It's a non-issue if AMD wants to deploy a product that supports HMC. When 100+ companies are adopting HMC, that means the IP is readily available, inexpensive, and can be plugged in at any time.

As you already pointed out elsewhere, the HPC-geared slides from AMD say 3D memory, which is non-specific. They can use HBM, HMC, or a hybrid if they choose.
 


Extremetech got it wrong. Check my table, which gives the correct one-directional bandwidths. The Extremetech guys took the 128GB/s for HBM and multiplied by 2x to get the IO throughput of 256GB/s. That part is right. The part about HMC is wrong: they took 120GB/s instead of 160GB/s (check my table). Multiplying the correct value of 160GB/s by 2x, one obtains that HMC maxes out at 320GB/s. In fact, the guys at Extremetech themselves wrote a piece a while back about HMC achieving 320GB/s of total bandwidth:

http://www.extremetech.com/computing/152465-microns-320gbsec-hybrid-memory-cube-comes-to-market-in-2013-threatens-to-finally-kill-ddr-sdram

So they don't even read their own news!

The fact is that HMC provides 25% higher bandwidth than High Bandwidth Memory, which is ironic. I will repeat it once again: HMC is superior to HBM in every technical aspect. HBM is only cheaper, and not by much.
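
To make the arithmetic explicit, here is a quick sanity check (the per-direction figures are the ones from my table above):

```python
# One-directional peak bandwidths (GB/s), per the table above.
hbm1_per_direction = 128  # shared bidirectional bus
hmc1_per_direction = 160  # dedicated lanes per direction

# Total IO throughput is 2x the one-directional figure.
hbm_total = 2 * hbm1_per_direction  # 256 GB/s -- the part Extremetech got right
hmc_total = 2 * hmc1_per_direction  # 320 GB/s -- not 240 GB/s

print(f"HMC: {hmc_total} GB/s, HBM: {hbm_total} GB/s")
print(f"HMC advantage: {hmc_total / hbm_total - 1:.0%}")  # 25%
```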
 
This HMC vs. HBM argument is starting to get pretty pointless without any real comparisons (benchmarks) between the two, especially considering that HBM is going to be more prevalent* with Nvidia jumping on it with Pascal. So can we drop it until then, and stop before people get too upset and get sent on vacation / the thread gets nuked? :b

*(In the consumer market, which, I assume, is more important/interesting to us)
 


There are no slides, just a research paper that AMD engineers/scientists published in an IEEE journal. The APU they describe in the paper targets a 2020-2023 timeframe. Everything we know about earlier timeframes, including the "server APU" for 2017, uses HBM (v2). Therefore, my complaint about the technological disadvantage remains.
 


FTFY!
 


Interesting how the doom and gloom rolls in amid news of a 13% hit... which came shortly after a 25% uptick days earlier that no one posted any enthusiasm for.

So... I guess it's a bad thing they're still net/net ahead of where they were a week ago?
 


I guess investors are a bit worried because some time ago, when AMD presented bad numbers, Lisa Su promised investors that R&D wouldn't be affected and that they would cut money from other areas. That was before AMD published last quarter's results, which didn't meet the goals. Now Papermaster is telling investors that R&D will be reduced further. That is why investors are so worried now.
 


It's interesting to watch AMD stock over a month or two timespan; it goes through periods where it loses a good 20% of its value, then gains it all back within a day or two.

But yeah, "Most Shorted Stock on the Nasdaq" is not a title to be proud of. Investors are starting to bank on losses, which is not good for AMDs long term plans. Which is why I can't help but chuckle when I read about post 2020 product designs.

AMD is totally dependent on Zen selling. It's really that simple at this point. Right now, I'd put the odds as high as 75-25 that AMD doesn't exist in 5 years, at least as currently constructed.
 


That is explained by the alternating sequence of "confrontation with reality" followed by some amount of hype and leaks.



Their EHP project is a concept and doesn't appear on any roadmap. AMD's official roadmaps end at 2017. Their long-term strategy is a castle in the air.



I would like to be as optimistic as you. Unfortunately, I think Zen will not save AMD. The lack of any public design win for Zen is a symptom: top customers are choosing Intel and IBM.
 
HBM by its nature has better latency than HMC. The underlying memory is quite similar, but the HMC interface requires substantially more work.

Memory is inherently parallel. With HMC, the CPU has to take parallel data, convert it to serial packet data, and transfer it across the bus; the HMC chip processes the packet request, pulls parallel data out of the memory cells, and turns it back into serial packet data; the CPU then receives the data from the bus, converts the serial packet data back to parallel data, and stuffs it into the caches. Any request goes through multiple PISO/SIPO (Parallel In Serial Out, Serial In Parallel Out) stages (at least 4, round trip), which add latency. This is just one reason why HMC memory controllers require more logic. It's like having memory sitting on a very fast PCIe bus (per-lane bandwidth comparable to: HMC1 ~ PCIe 4.0, HMC2 ~ PCIe 5.0).
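
To make the stage count concrete, here is a minimal sketch of that round trip. The per-stage latencies are invented placeholder numbers purely for illustration, not measured figures; the point is how many conversion stages HMC adds over a plain parallel bus.

```python
# Illustrative model of the HMC request round trip described above.
# Stage latencies are placeholder values in ns, NOT measured figures.
HMC_ROUND_TRIP = [
    ("CPU PISO: parallel data -> serial packet", 4.0),
    ("link transfer: CPU -> HMC", 2.0),
    ("HMC SIPO: deserialize and decode packet", 4.0),
    ("DRAM array access inside the cube", 30.0),
    ("HMC PISO: response -> serial packet", 4.0),
    ("link transfer: HMC -> CPU", 2.0),
    ("CPU SIPO: deserialize, fill cache line", 4.0),
]

# HBM skips the packet (de)serialization: it is a wide parallel bus,
# so only the bus transfers and the array access remain.
HBM_ROUND_TRIP = [
    ("bus transfer: CPU -> HBM", 2.0),
    ("DRAM array access", 30.0),
    ("bus transfer: HBM -> CPU", 2.0),
]

def total_latency(stages):
    return sum(ns for _, ns in stages)

print(f"HMC round trip: {total_latency(HMC_ROUND_TRIP):.0f} ns (4 PISO/SIPO stages)")
print(f"HBM round trip: {total_latency(HBM_ROUND_TRIP):.0f} ns (no packet stages)")
```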


The bandwidth listed for HMC is also the peak (naturally). If you use the HMC feature that increases capacity by chaining several cubes together, each hop reduces your bandwidth and incurs additional latency. In the worst-case scenario of chaining a CPU to 4 HMC modules (star topology), the bandwidth is reduced to 25% of peak, because 3 of the HMC links have to be re-purposed as pass-thru links.
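
A sketch of that trade-off, assuming my simplified model of the topology: a 4-link cube where each chained cube costs one pass-thru link.

```python
# Simplified model (my assumption) of the star-topology chaining penalty:
# a first-generation 4-link cube, where every chained cube re-purposes
# one link as a pass-thru, leaving fewer links facing the CPU.
PEAK_LINKS = 4

def effective_bandwidth_fraction(chained_cubes):
    cpu_facing_links = PEAK_LINKS - chained_cubes
    return cpu_facing_links / PEAK_LINKS

for n in range(4):
    print(f"{n} chained cubes -> {effective_bandwidth_fraction(n):.0%} of peak")
# 3 chained cubes -> 25% of peak, the worst case described above
```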


The reason HMC has more capacity and potentially higher bandwidth is that each HMC die is like having 2 HBM stacks sitting side by side in the same package (possibly 4). This also makes it a much larger die. The initial samples are 31mm x 31mm, which is HUGE. That made sense for HMC because it is intended to be near a processor, but not as near as HBM.
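
Some rough footprint arithmetic with those numbers (package area, not die area; the 8-modules-per-CPU count is taken from the Fujitsu board earlier in the thread):

```python
# Rough footprint arithmetic for the package sizes mentioned above.
HMC_PACKAGE_MM = 31.0
hmc_area = HMC_PACKAGE_MM ** 2  # ~961 mm^2 per module

print(f"one HMC package: {hmc_area:.0f} mm^2")
print(f"8 modules per CPU: {8 * hmc_area:.0f} mm^2 of board area")  # ~7700 mm^2
print("KNL die for comparison: ~600 mm^2")
```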

As seen in this picture, the 16GB of HMC for Knights Landing is absolutely MASSIVE compared to the die, which is itself already around 600mm^2.
[Image: Knights_Landing_front-617x465.jpg]

And as shown in this picture, the CPU + HMC take up considerable board real estate. The footprint there dwarfs the sockets for 6 full DIMMs.
[Image: Knights-Landing.jpg]

Now compare that to the HBM sitting on the Fiji package. The memory is just a small portion of the package; each stack is about the size of an aspirin. Which is going to work better in a mobile APU or a future gaming console?
[Image: amdfijidie.jpg]

Which brings us to the next limitation of HMC: far reach operation. Physical board layout limits mean that not all HMC modules can sit right next to the processor (which is called short reach mode). In far reach mode the bandwidth is immediately reduced by 33%: the 15G lanes can only operate at 10G. Combining far mode with chaining (the worst case) leaves bandwidth below 20% of peak. This ultimately limits how much full-rate HMC can be attached to a CPU.
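
Treating the two penalties as multiplicative (an assumption on my part), the combined worst case works out like this:

```python
# Combining the far-reach and chaining penalties, assuming they multiply.
SHORT_REACH_GBPS = 15.0  # per-lane rate, short reach mode
FAR_REACH_GBPS = 10.0    # per-lane rate, far reach mode

far_mode_fraction = FAR_REACH_GBPS / SHORT_REACH_GBPS  # ~0.67, i.e. a 33% loss
worst_chaining_fraction = 0.25                         # star topology, from above

combined = far_mode_fraction * worst_chaining_fraction
print(f"far mode alone: {far_mode_fraction:.0%} of peak")
print(f"far mode + worst-case chaining: {combined:.1%} of peak")  # ~16.7%, below 20%
```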

In summation, HMC has great potential, but it is not without its own limitations and trade-offs, and it's certainly not as black and white as some here are suggesting. We've had this duopoly of parallel and serial memory for 50+ years. It's not ending anytime soon.
 




The disparity is likely due to the difference between what is in the spec and what is actually being produced/sampled. The max for HMC 1 is indeed 320GB/s, but no one is making that configuration because it requires 8 links and a lot more pins. Only 4-link HMC is being made, which maxes out at 240GB/s, and what Micron is actually sampling only supports 160GB/s. Note that is the raw data rate, which includes the packet overhead, so the true data throughput is less.
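
As a rough illustration of raw vs. usable throughput, here is a sketch with an assumed 16 bytes of packet header/tail per request. The real overhead depends on the HMC packet format, so treat these numbers as ballpark only.

```python
# Raw link rate vs. usable data throughput. The 16-byte per-request
# overhead is an assumption for illustration, not a spec figure.
RAW_LINK_GBS = 160.0  # sampled Micron part, total across both directions
OVERHEAD_BYTES = 16   # assumed packet header + tail

def efficiency(payload_bytes):
    return payload_bytes / (payload_bytes + OVERHEAD_BYTES)

for payload in (32, 64, 128):
    eff = efficiency(payload)
    print(f"{payload}B requests: {eff:.0%} efficient "
          f"-> ~{RAW_LINK_GBS * eff:.0f} GB/s usable of {RAW_LINK_GBS:.0f} raw")
```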

Likewise with HBM 1: the spec allows 4GB, 8-layer devices, but no one is making them. The HBM in Fiji is 4-layer and 1GB, which is only 1/4 the max for the spec. They are, however, using the full bandwidth of 128GB/s.

Then there is the disparity between HBM allowing max bandwidth in both directions and HMC having fixed input and output lanes. This means the peak read bandwidth for HMC is only half its total peak bandwidth, and likewise for the write bandwidth.

To compare the sampling Micron HMC with the shipping Fiji HBM:

Read bandwidth: HMC 80GB/s, HBM 128GB/s
Write bandwidth: HMC 80GB/s, HBM 128GB/s
Read/write bandwidth: HMC 160GB/s, HBM 128GB/s

In the case of doing primarily reads or primarily writes, HBM's bandwidth is an advantage.
When doing a more balanced mix of reads and writes, the fixed lanes of HMC have the advantage.
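
To see where the crossover sits, a quick sketch using those same figures (ignoring HMC's packet overhead for the moment):

```python
# Effective total throughput (GB/s) vs. read/write mix, using the
# sampling Micron HMC (80 read + 80 write, fixed lanes) and the
# shipping Fiji HBM (128 shared, either direction).
HMC_READ, HMC_WRITE = 80.0, 80.0
HBM_SHARED = 128.0

def hmc_throughput(read_fraction):
    # Total traffic T is capped by whichever fixed lane set saturates
    # first: T * r <= 80 (reads) and T * (1 - r) <= 80 (writes).
    r = read_fraction
    read_cap = HMC_READ / r if r > 0 else float("inf")
    write_cap = HMC_WRITE / (1 - r) if r < 1 else float("inf")
    return min(read_cap, write_cap)

def hbm_throughput(read_fraction):
    return HBM_SHARED  # shared bus: the mix doesn't matter

for r in (1.0, 0.75, 0.5):
    print(f"read fraction {r:.0%}: HMC {hmc_throughput(r):.0f} GB/s, "
          f"HBM {hbm_throughput(r):.0f} GB/s")
# 100% reads: HMC 80, HBM 128 -> HBM wins
# 50/50 mix:  HMC 160, HBM 128 -> HMC wins
```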

Now add the packet overhead, the extra latency, and the cost of HMC's logic controller, and it's fairly clear to me why AMD chose HBM for Fiji.
 


Thank you for the detailed information. As usual, there are advantages and disadvantages to both. I guess time will tell if there is a decided 'winner' or if things will evolve more like the current memory standards, with HMC geared towards CPU memory and HBM focused on the GPU side.

The other thing that occurs to me: HBM is currently running at pretty low data rates compared to more standard memory types, which suggests the data rate could increase quite a lot over time (e.g. going from 1Gbps today to 2Gbps and beyond once it matures, much like how GDDR5 started at around 4Gbps and finished at roughly 7). That isn't to say that HMC won't also develop in similar fashion, of course.
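
For a feel of how the per-pin rate scales a stack's bandwidth, here's a quick calculation assuming HBM1's 1024-bit per-stack interface (bandwidth = width x rate / 8):

```python
# How the per-pin data rate scales one HBM stack's bandwidth,
# assuming HBM1's 1024-bit interface per stack.
BUS_WIDTH_BITS = 1024

def stack_bandwidth_gbs(gbps_per_pin):
    return BUS_WIDTH_BITS * gbps_per_pin / 8  # GB/s

for rate in (1.0, 2.0, 4.0):
    print(f"{rate} Gbps/pin -> {stack_bandwidth_gbs(rate):.0f} GB/s per stack")
# 1 Gbps/pin -> 128 GB/s (Fiji today); 2 Gbps/pin -> 256 GB/s
```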

As for the other comments: yes, I agree, AMD is riding everything on Zen. I just totally disagree that it's a foregone conclusion that it's going to fail. We already have rumors of Facebook looking to AMD for an integrated, Zen-based APU for their servers, which I'd say constitutes a pretty significant design win. Just because the design isn't ideal for HPC doesn't mean it can't work in other scenarios.
 
I think we need to end the HMC vs HBM argument. I will only say that I see HMC as the winner for replacing DDR on desktops/servers (more support from multiple companies, including major players in the memory market such as Samsung and Micron), and I see HBM as a viable replacement for GDDR5 on GPUs. It is much like the current situation: GDDR5 is great for GPUs because it has super high bandwidth, and GPUs do not care about its high latency, whereas CPUs require low latency to be efficient. HMC seems geared towards the CPU side, while HBM seems geared towards GPUs and possibly a cache for APUs (I do not see HBM becoming an on-package main memory for both CPU and GPU).

The future will tell.

For now, let us turn this ship back around to the main topic of AMD's CPUs, SoCs and other rumors before it gets way too far off course and gets squashed.
 


Do you mean this rumor?

According to a source with knowledge of the project, AMD is developing a custom Opteron server processor for Facebook. Few details are known about the chip today. What is known is that it is based on 64-bit ARMv8-A architecture and is designed by AMD’s Enterprise Embedded and Semi-Custom (EESC) business group.

I don't see Zen mentioned. In fact, I can see "ARMv8-A architecture". And I am rather sure it is a CPU, not an APU. Then there's this interesting job posting at Facebook:

https://www.facebook.com/careers/department?req=a0IA0000006cPTMMA2
 