-Fran- :
I think Intel is using the CISC side of x86 to their advantage with Knights Landing. I don't think they need to call it an APU at all, since it doesn't have a "GPU" as a co-processor the way AMD does (i.e., one you could slap onto a standalone card and have it work with minor tweaks).
As mentioned before, the concept of an APU is not restricted to having a GPU inside:
http://en.wikipedia.org/wiki/Accelerated_processing_unit
The HSA specification explicitly considers the general class of TCUs (throughput compute units), which implement the HSAIL ISA.
8350rocks :
@juanrga:
AMD said that 3 years ago...
Think about what you are trying to tell me...that an APU will have 4x the GPU compute capability of a standalone R9-290X in uber mode.
What process node do you expect them to use? 300 picometer? We are talking about something that would have to house ~8,000 SPUs + CPU cores...
That is little more than a pipe dream...
APUs will not be advantageous for HPCs in my lifetime.
Drop the charade and look at AMD's current plans for HPC...
http://www.extremetech.com/computing/155941-supercomputing-director-bets-2000-that-we-wont-have-exascale-computing-by-2020
No one is looking at single-die solutions anymore...they are looking at memory systems and buses to better connect specialized discrete components.
They realized the APU for exascale is actually a poor solution, as the interconnects between APUs would still be the weak link. You are thinking in terms of HPC designs from 3-4 years ago...get into 2014.
AMD said that 3 years ago and has not changed its plans for exascale supercomputers, because the laws of physics are the same now as they were 3 years ago...
I don't know how you got the 4x ratio, but it is wrong. The AMD chief engineer is following the common HPC practice of quoting DP (double-precision) performance. The APU he mentioned gives 14x the theoretical performance of the R9-290X (which offers only 790 GFLOPS DP). Moreover, as I already said to you before, you cannot just compare raw FLOPS: in practice the single APU will be much faster than 14 discrete "R9-290X in uber mode" cards working together.
Let me also mention that the APU designed by Nvidia engineers is a 20 TFLOPS beast. I will leave it as homework for you to work out how many discrete "R9-290X in uber mode" cards working together you would need to match the performance of the Nvidia supercomputer APU.
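For anyone who wants to check the arithmetic: the APU's DP throughput is not quoted directly in the thread, so the ~11 TFLOPS figure below is back-calculated from the stated 14x ratio and the 790 GFLOPS R9-290X number; treat it as an implication, not a published spec. A quick sketch:

```python
# Back-of-envelope check of the DP FLOPS ratios quoted in the thread.
# The R9-290X DP figure (790 GFLOPS) and the 14x / 20 TFLOPS numbers
# come from the posts above; the implied APU throughput is derived.
r9_290x_dp_tflops = 0.790      # R9-290X double-precision throughput
apu_ratio = 14                 # ratio claimed for the AMD exascale APU

implied_apu_dp_tflops = r9_290x_dp_tflops * apu_ratio
print(f"Implied AMD exascale APU: ~{implied_apu_dp_tflops:.1f} TFLOPS DP")

# The "homework": Nvidia's 20 TFLOPS exascale APU vs. R9-290X cards
nvidia_apu_dp_tflops = 20.0
cards_needed = nvidia_apu_dp_tflops / r9_290x_dp_tflops
print(f"R9-290X cards to match the Nvidia APU: ~{cards_needed:.0f}")
```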
I already said before that the process node is 10nm.
The link you provide doesn't say what you claim. On the contrary, he confirms things I have said and you and others denied. He clearly says that the current supercomputer architecture doesn't scale up, and he mentions some of the challenges in designing exascale supercomputers.
What he says is all well-known. The exascale designs from AMD, Nvidia, and Intel provide explicit solutions to those problems. Regarding interconnects, the AMD design includes a NIC of 40--100 GB/s, and the Nvidia design includes a 150 GB/s interconnect. This is beyond the buses used in current supercomputers based on the outdated CPU+dGPU architecture...
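To put those interconnect figures in perspective, here is a quick comparison against PCIe 3.0 x16 (~15.75 GB/s per direction), the bus a discrete GPU hangs off in the CPU+dGPU design; the PCIe number is a known spec value, not something from this thread:

```python
# Compare the exascale interconnect bandwidths quoted above with
# PCIe 3.0 x16 (~15.75 GB/s per direction), today's dGPU link.
pcie3_x16_gbs = 15.75
links = {
    "AMD exascale NIC (low end)": 40.0,
    "AMD exascale NIC (high end)": 100.0,
    "Nvidia exascale interconnect": 150.0,
}
for name, bw in links.items():
    print(f"{name}: {bw} GB/s (~{bw / pcie3_x16_gbs:.1f}x PCIe 3.0 x16)")
```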
I have already given you a slide from SC13 (aka Supercomputing 2013) explaining how the supercomputers of tomorrow will use APUs instead of the discrete cards used today:
Can you see the words "Today" and "Tomorrow"? Intel expects the supercomputers of 2015 to use the APU on the right of the slide. What you propose (CPU+dGPU) is outdated and terribly slow.