AMD CPU speculation... and expert conjecture



It's supply and demand. If people are willing to pay that much extra for those features, then under our economic system, it's economically justified to sell them at those prices.
 

8350rocks

Distinguished


Actually, what they told you was that SOI did not do this or that...

What they admitted to me was that if they could have remained profitable by assisting with the development of FD-SOI at 28nm, they would have.

Additionally, AMD engineers concede without contest that the ENTIRE REASON Kaveri is lower clocked is that bulk did not pan out as they had hoped. The early ES tape-outs were showing promise, but when they got to full ramp to see what a production sample would look like, they could not even reach Richland clocks without skewing the TDP to 125W, and the part had to be a 95W part. The substrate is ENTIRELY the culprit.

Also, AMD have conceded that SOI may come back into play around 10-14nm, because FinFET on FD-SOI has better thermal properties and would likely offer better performance with lower power consumption at higher clocks... (as illustrated by Richland being 100W parts with 10-12% higher clock rates than the 95W Kaveri).

Therefore your FUD about the substrate is completely off.

Additionally, the FF interconnects are capable of FAR more than 10 Gb/s per socket. The technology is proprietary, and they do not advertise beyond what is necessary, because it is vastly superior to anything available in the server space and is also one of the most energy-efficient interconnects available, period.

As far as their plans for HPC go, I asked about APUs, and the answer I got was..."we have not fully decided that will be the best route yet, we are still experimenting..."

So, unless you know something that people at AMD do not, I cannot see how you can sit there and say with certainty what it will be.

Things are just too undefined at the moment to make assumptions, and you are making very strong assumptions without critical information. You cannot possibly have the information to make that decision, because the companies trying to execute it themselves do not yet have that information, and you are not an insider.

Just stop while you are this far behind. Please. You keep digging deeper holes, and you are just shoveling dirt onto your own back at this point.
 

jdwii

Splendid


Lol, do you work at AMD in marketing? Just wondering, ha ha.
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780
You guys thinking the APU GPU will keep up with dGPUs are forgetting one big technology that game developers have yearned for for a very long time: ray tracing.

Here's a good example of the needs of ray tracing:

https://www.youtube.com/watch?v=m5EDorhuFuo

If you don't want to watch: 4 balls with one light on a GTX 660 Ti with CUDA gives 280 fps. 6 balls with 10 lights gives 70 fps. If you scale that to an actual game, it'd cripple four 290Xs.

To give you an idea of ray tracing, this is the best we can do right now: https://www.youtube.com/watch?v=3yH8OQTkpy8 And that's a bunch of low-poly demos (or a high-poly object surrounded by nothing) with a few light sources on an FPGA designed specifically for ray tracing. And most of these demos are running at awful frame rates.

GPUs and graphics have an extremely long way to go. We are at an awkward phase, though, where there are diminishing returns with what we have (adding polygons and upping texture quality), and we don't have the power to make ray tracing happen on a scale large enough for a game. Right now we basically cheat how light would work in real life, and that causes us to miss a lot of things in games that we would have in real life. Again, you can cheat with shaders and whatnot, but it's just faking how light would actually work.
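To put rough numbers on that, here's a back-of-the-envelope sketch (my own illustration, not from the videos) assuming a simple Whitted-style tracer where every primary-ray hit spawns one shadow ray per light; the resolution and light counts are arbitrary:

```python
# Back-of-the-envelope ray count for a Whitted-style ray tracer.
# Assumption (illustrative, not from the demos above): each primary-ray
# hit spawns one shadow ray per light, ignoring reflection/refraction.

def rays_per_frame(width, height, num_lights, samples_per_pixel=1):
    primary = width * height * samples_per_pixel
    shadow = primary * num_lights   # one shadow ray per light per hit
    return primary + shadow

# 1080p: cost grows roughly linearly with the number of lights.
for lights in (1, 10):
    total = rays_per_frame(1920, 1080, lights)
    print(f"{lights:2d} light(s): ~{total / 1e6:.0f}M rays per frame")
```

And that's before any bounces; real scenes add reflection and refraction rays on top, which is why the frame rates in those demos collapse so quickly as lights are added.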

The way Mantle and HSA allow work to be split up between components while still sharing data is perfect for these kinds of things.

So no, dGPUs are not going anywhere. Unless you think ray traced video games are never going to happen.

There's not a lot of info on Freedom Fabric, but 10 gigaBITs per second is about 1.25 gigaBYTEs per second of bandwidth, which is in the same ballpark as (a little below) a single PCIe 4.0 lane. But you can use multiple controllers for multiple lanes, so it can scale. Right now FF is really only used for hard disk storage, meaning the demand is not as high as for something like CPU-to-GPU communication. But as 8350 said, there is no reason for AMD to make claims about FF.
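To make the unit conversion concrete, here's a quick sketch; the lane counts are illustrative, not published Freedom Fabric specs:

```python
# Unit check for the interconnect numbers in this thread.
# 10 Gb/s (gigaBITs) -> GB/s (gigaBYTEs): divide by 8.
# The lane counts below are illustrative, not published FF specs.

def gbps_to_gb_per_s(gbps):
    return gbps / 8.0

link = 10  # Gb/s per link/socket
print(f"{link} Gb/s = {gbps_to_gb_per_s(link):.2f} GB/s")  # 1.25 GB/s

# If multiple controllers/lanes can be ganged, throughput scales linearly:
for lanes in (1, 4, 16):
    print(f"{lanes:2d} lane(s): {gbps_to_gb_per_s(link * lanes):.1f} GB/s")
```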

Freedom Fabric will play a huge role in what AMD is doing with their next platform. You are going to see something like HT over FF with a lot of links.

Some of you are forgetting that Keller basically designed HT, and now he has Freedom Fabric to play with. If you think AMD's best interconnect is going to cap out at 10 Gb/s, I just have to shake my head and laugh.

With Mantle, a future where the APU GPU is strong enough to handle geometry and you have a big dGPU to do ray tracing is a very real possibility. And that's the kind of situation I think we will see things evolve into if AMD can leverage enough software wins. But it will probably be a slow climb, with things like additional GPUs calculating global illumination first.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810

Wafer-on-wafer is about as close as you can get to being on the same die. But yes, they will be multiple dies, as the DRAM process is optimized differently from CPU processes. The main point is that they don't need 512 pins with high voltage and high drive strength to drive fat traces going several inches over a PCB.

I don't think anyone is saying 16GB is enough. That's just the limit of what they can do for 2016. This CPU will also have quad-channel DDR4 if they need more RAM, and an improved QPI interface to grab memory attached to another CPU. I'm not sure they even know the optimum configuration yet; so far only Cray has announced a design win, for delivery in 2016. It's the first fully funded project I'm aware of.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The only "narcissism" is from those whose predictions were proven plain wrong and who are trying hard to deny that benchmarks verified my predictions about the A10-7850K to within single-digit percentages.

My predictions about future APUs will be verified as well.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


You are the one who started again, by mentioning me in a reply to another poster. Your accusations were plain wrong, as are the new ones.

You always accuse me of lying when you try to hide your mistakes. I make detailed and sound posts and occasionally bring slides that confirm my point. The last slides that I added to the discussion were not promo slides. I explained that they are research slides from a professional forum, related to a research initiative with government funds.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


FD-SOI cannot provide the needed density for Kaveri and lacks a migration path to FinFETs.

The reduction from 100W to 95W almost entirely explains the reduction in clocks. The other part of the reduction comes from the use of HDL.

No major foundry is planning high-performance SOI for 10 or 14nm. SOI will only account for about 5% of total foundry volume, because it is massively rejected by engineers. Moreover, a friend of mine says that AMD has selected TSMC 16nm FinFET for K12.

The current FF interconnect gives 10 Gb/s per socket. I mentioned that future versions can give 10--15x more throughput, which is good as an APU--APU interconnect but slow and inefficient for CPU--GPU.

Sure, AMD is experimenting, but their plans for HPC are based on APUs, as AMD has explained in public.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


LOL. No, I do not work in that. Moreover, I am discussing technological possibilities and how the industry will evolve, not some company's marketing.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Ray tracing was mentioned before. Several novel algorithms are being developed by the same game developers who claim that GPUs will be killed by 2020.

Several current ray tracing algorithms are inefficient. By rewriting them and avoiding GPU limitations, they can run much faster. I will go through my notes to see if I can find something that I can share here.

FF's 10 Gb/s (= 1.25 GB/s) is terribly slow for what is being discussed here. The interconnect to stacked RAM will be much faster, at ~1 TB/s. You cannot build an exascale system using CPUs and GPUs connected by such a slow interconnect. A scaled-up version of FF with 100--150 Gb/s will be used for APU--APU communication.
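A rough sketch of why the link speed matters; the accelerator throughput figure is my own illustrative assumption, not a number from this thread:

```python
# Rough bytes-per-FLOP check for the interconnect argument above.
# Assumptions (illustrative): a discrete accelerator around 5 TFLOPS,
# a 10 Gb/s link (= 1.25 GB/s), and stacked RAM at ~1 TB/s.

accel_flops = 5e12    # assumed accelerator throughput, FLOP/s
link_bw     = 1.25e9  # 10 Gb/s link, bytes/s
stacked_bw  = 1e12    # ~1 TB/s stacked RAM, bytes/s

print(f"link:    {link_bw / accel_flops:.6f} bytes/FLOP")    # 0.000250
print(f"stacked: {stacked_bw / accel_flops:.3f} bytes/FLOP") # 0.200

# HPC codes typically want a sizable fraction of a byte per FLOP, so a
# 1.25 GB/s CPU--GPU link starves the accelerator, while an on-package
# stacked-RAM interface is orders of magnitude closer to that target.
```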
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


First part: no. Second part: yes, the data is here for everyone to see.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Your argument continues to be silly, because you are mixing different time spans and ignoring the history of computing. Back then, the FPU was as large as the CPU:

[Image: an Intel 80386 CPU alongside its 80387 FPU coprocessor]


However, by virtue of integration, FPUs are now inside the CPU. A 290X is too big at the current 28nm node, but at 14nm a 290X could be integrated on the same die as a CPU, and at 10nm you could integrate a much larger GPU.
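For illustration, an idealized area-scaling estimate; it assumes Hawaii's ~438 mm2 die at 28nm and an ideal 2x density gain per node step, which real nodes rarely deliver:

```python
# Idealized die-area scaling for the integration argument above.
# Assumptions: Hawaii (R9 290X) is ~438 mm^2 at 28nm, and each full
# node step gives ~2x density (an ideal that real nodes rarely hit).

hawaii_28nm = 438.0  # mm^2

def scaled_area(area_mm2, node_steps, density_gain_per_step=2.0):
    return area_mm2 / (density_gain_per_step ** node_steps)

# 28nm -> 14nm is nominally two steps (28 -> 20 -> 14):
print(f"at 14nm: ~{scaled_area(hawaii_28nm, 2):.0f} mm^2")  # ~110 mm^2
# One more step to 10nm:
print(f"at 10nm: ~{scaled_area(hawaii_28nm, 3):.0f} mm^2")  # ~55 mm^2
```

Under those idealized assumptions, the GPU shrinks to a size that plausibly fits next to CPU cores on a single die, which is the shape of the claim, though marketing node names overstate real density gains.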
 

i mentioned your actions after obtaining proof, not before. looks like an example is due, since you claim i was wrong about your blanket statements devoid of any specifics. to prove that you're lying and are wrong, here's a rather... "bold" one (of numerous), if you will:
which apus? die size? yield rate? process node, tech, substrate, clockrate? which gpus? what ISA? what kind of uarch? any specifics on anything (not just the factors i mentioned)? the only thing barely specific is the year 2020 - and even that's not really specific, because we're supposed to see such apus by 2020. "about 10x"? what's the specific multiplier then?

None.


lol, i am not immune to making mistakes, but in the instances i've pointed out so far, you indeed turned out to be lying and/or have consistently failed to refute such accusations. i don't really care whether you trust me or not, but trust me, you have made long-winded and detailed(!) posts, but they're almost never sound. that's why they get debunked near instantly. however, if i am indeed mistaken about your lying, my sincere apology is ready the moment my accusations get refuted.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


All the data was mentioned before. You are asking me for info that I gave before:

The APUs currently designed by AMD, Nvidia, and others... The Nvidia HCN die size is estimated to be 290mm2; Intel has a bigger die of 400mm2... Yield rate is standard for a mature process... The process node is 10nm FF for Nvidia (the standard node for laptops, the high-performance node for desktop/server); Intel uses 7nm FF... Nvidia mentions 2GHz on the basic node and 2.5GHz on the high-performance node (those are the GPU core clocks); Intel mentions 4.61 GHz... GCN for AMD, CUDA for Nvidia... The substrate is bulk with FinFETs; nobody uses SOI in their designs... AMD will use both ARM and x86 ISAs, Intel uses x86 (plus AVX-like extensions), Nvidia uses ARM, and the ARM ISA includes NEON extensions... The uarchs have no names, but we know some details about them: lanes, topologies, working voltages, FLOPS/core, cache sizes...

The current multiplier depends on lots of factors, such as exact clocks, final process node tech, and so on... It can be 8.9x or it can be 10.7x. This is why I write "about 10x".

The exact year is also unimportant. It is based on standard silicon projections and the evolution of current research projects. More optimistic people say the technological target could be achieved earlier, by 2018. More pessimistic people say the technology will not be ready by 2020. Again, this is unimportant. It does not matter if it happens by February 2020 or by March 2022 due to delays. It will happen.



It doesn't work the way you believe. It is not you making unfair accusations, then me having to refute them, and then you apologizing. It is the person who accuses who has to prove his accusation. You are not proving anything, and I am pointing out just that.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


AMD will not. AMD will use 28nm planar this year, then 20nm planar next year (except the Carrizo APU, which will still use 28nm), then 14/16nm FinFET in 2016. A friend of mine told me that AMD has selected TSMC 16nm for the K12 arch in 2016.

Things can change if foundry roadmaps change, if there are unexpected delays, and so on...
 

colinp

Honorable
Jun 27, 2012
217
0
10,680


Here was your prediction from that BSN article that you were relentlessly spamming at the time:

This would put the multi-threaded performance of the CPU of the new quad-core Kaveri APU at the same level as an Intel quad-core i5 or a six-core AMD FX with traditional software.

Since we know a Bulldozer-type module loses some performance when both cores in the module are in use (not something that afflicts an i5), that must mean single-threaded performance must be at least as good as an i5's, right?
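To make that concrete, here's a quick sketch; the ~80% CMT scaling figure is an assumption for illustration, not a measured number:

```python
# Sketch of the module-scaling argument. Assumption (illustrative): a
# Bulldozer-style module delivers ~1.8x one core's throughput when both
# of its cores are loaded (~80% scaling for the second core).

cmt_module_scaling = 1.8  # assumed throughput of 2 loaded cores vs. 1
st = 1.0                  # normalized single-thread performance

kaveri_mt = 2 * cmt_module_scaling * st  # 2 modules / 4 cores -> 3.6
i5_mt = 4 * st                           # 4 independent cores  -> 4.0

print(f"Kaveri MT: {kaveri_mt:.1f}, i5 MT: {i5_mt:.1f} (same per-core ST)")

# For the MT totals to match, Kaveri's single-thread performance must
# exceed the i5's by the inverse of the module penalty:
print(f"Kaveri ST must be ~{i5_mt / kaveri_mt:.2f}x the i5's")  # ~1.11x
```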

So, did the reviews back up those two points: that both single and multi threaded performance is at the level of an i5?

Tomshardware: No
Guru3d: No
Anandtech: No
xbitlabs: No
bit-tech: No
Hexus: No
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Did you see the price? $2999 for a 4-core dev kit. Yikes!
 

actually, you posted only a few of your specifications, haphazardly, before. they were never compiled into anything close to the following:

ooh, memory subsystem. i totally missed asking about memory: L1, L2, L3 (inclusive/exclusive, shared, etc.)... stacked (capacity, type, die size and power use, interface, bandwidth, clockrate et al.)/embedded (same things)/none, system memory and its specs, processor-processor and processor-i/o interconnects and their specs, and so on. inside the cores: alu, agu, pipeline stages (e.g. a stock 4.61GHz might require rather long ones), load/store capability, how wide, etc. the more dimensions you add, the clearer it will become. what are the lanes, topologies, working voltages, FLOPS/core, and cache sizes for those chips?
i didn't quite understand some of the bits (hopefully others will). like, are amd and nvidia using the same process node and same substrate? s.o.i. seems out of the speculation. are the gpu core clocks base clockrates or boost clockrates?

....and these apus (where are the specs for the cpu part, or is it even there?) are supposed to be about 10x faster than which gpus? can i take the exact range to be 8.9x to 10.7x? because 7-8x isn't "about 10x". i was thinking of baselines along the lines of the hawaii-based r9 290x's 20nm successor or nvidia's big gm210-something maxwell gpu. maybe your baseline is different? you never mentioned any of that along with the 2020 apu specs. for example, if the baseline gpu is brazos's igpu... :) i forgot to ask about transistor count and lithography, and likely some other important bits like prices. it seems that you're keen on replying only to the ones i typed, instead of giving an overall complete picture. the comparison stays vague without sufficient info.

why is the year unimportant if you keep spamming 2020? 2 years is a long time in the tech world; you're using a very big buffer. i thought you were confident. i don't doubt that future processing devices will inevitably be faster and more powerful than current ones. but it looks like you're undermining your own claim here.


in my example, i showed why i said your statements were blanket statements lacking specifics, and subsequently got you to finally compile at least some of the specifics. compare your previous post to the one i posted as an example, and compare the content. that pretty much proves my side. :) that was just one of numerous examples though.

i should point out that others have been too eager to argue the possible existence of the device within the claimed time limit, instead of simply asking for specifics like i did. without specifications you leave room for huge inaccuracies.
 

price for early admission, i guess. i also assume that the system memory in that kit is rather costly. still... 4 cores for 3 grand, certainly worth poking fun at, imo. ;)
for example: why is the cpu just 4 cores? wasn't arm bragging about high yield rates for their socs?
the cache specs are missing. from the sdk specs, it seems like the cpu/soc is split into 2 clusters of 4 cores with shared L2 cache. i thought it'd be split into 4 clusters of 2 cores sharing L2 cache, while an 8MB L3 cache is available to all cores. i think the reason amd is going custom so quickly may be hidden in the a57 soc designs.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


As mentioned before, things can change a bit if there are unexpected delays, if the final nodes underdeliver, etc. For instance, if TSMC 16FF+ is delayed three quarters, then AMD could delay the release of K12. If 16FF+ is on track but poorer than expected, then AMD could switch to GloFo/Samsung 14FF.

So far as I know, TSMC is shipping 20nm processors to Apple:

http://www.extremetech.com/computing/186080-tsmc-is-finally-making-20nm-parts-for-apples-next-gen-iphone-ipad

http://appleinsider.com/articles/14/07/10/apple-begins-receiving-shipments-of-a-series-processors-from-tsmc---report
 