AMD CPU speculation... and expert conjecture

Page 296
Status
Not open for further replies.

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Yes, we are talking about heterogeneous supercomputers: the Mont Blanc project uses ARM+CUDA.

The CPU in current x86 + (GPU/accelerator) supercomputers accounts for up to 40% of total power consumption. Substituting the x86 CPUs with more efficient ARM CPUs is a necessary step to scale up.
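As a back-of-the-envelope illustration of why that 40% share matters (every figure below except the 40% is hypothetical, including the machine size and the ARM/x86 power ratio):

```python
# Back-of-the-envelope sketch (all figures hypothetical except the 40%
# CPU share quoted above): how much total system power a more efficient
# CPU could save in an x86 + accelerator supercomputer.
def system_power_after_swap(total_w, cpu_share, new_cpu_relative):
    """Total power after replacing the CPU portion of the budget.

    total_w          -- original total system power (watts)
    cpu_share        -- fraction of total power drawn by the CPUs
    new_cpu_relative -- new CPU power as a fraction of the old CPU power
    """
    cpu_w = total_w * cpu_share
    rest_w = total_w - cpu_w
    return rest_w + cpu_w * new_cpu_relative

# A 10 MW machine, CPUs at 40% of the budget, hypothetical ARM parts at
# half the x86 power: the machine drops to 8 MW, a 20% total saving.
print(system_power_after_swap(10e6, 0.40, 0.5))
```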

A note: the Nvidia Parker APU will use custom cores more advanced than the A57, and it will be made on a FinFET process. The node size is not known; probably 14nm, but some rumours say 16nm whereas others say 10nm.



It says: "ARM multicores as efficient as Intel at the same frequency"... using the ancient Tegra 3 (the supercomputer will use Tegra 6).

The Tegra 3 phone chip offers the same efficiency, or better, up to its maximum frequency of 1.3GHz, whereas the Intel chip (which is not a phone chip) can continue up to 2.4GHz, among other reasons because its cores are fed by about 4x more memory bandwidth and 7x more cache. There is no reason why you cannot clock an ARM core up to 3.5GHz; this is unrelated to x86 vs ARM, it simply means the rest of the chip has to be designed accordingly.

Look at the single-core efficiency: "ARM platforms more energy-efficient than Intel platform". A single core is being fed accordingly.

If the tests were repeated, but disabling the extra 6MB of cache on the i7 and limiting it to a single memory channel, the i7 would perform much worse while consuming about the same energy. I think this would offer a better perspective on the efficiency of ARM vs x86.
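A toy perf-per-watt calculation (every number below is invented for illustration, not measured) shows the shape of the argument: if removing cache and a memory channel cuts throughput far more than it cuts power draw, the measured efficiency falls even though the core itself is unchanged.

```python
# Toy model, all numbers invented: perf/watt of a chip before and after
# "crippling" it (less cache, one memory channel). Throughput drops a
# lot, power only a little, so efficiency falls.
def perf_per_watt(ops_per_sec, watts):
    return ops_per_sec / watts

full     = perf_per_watt(100e9, 45.0)  # full cache, dual channel
crippled = perf_per_watt(55e9, 42.0)   # hypothetical crippled config
print(full / crippled)  # > 1: the uncrippled config looks more efficient
```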

About overclocking: again, this is unrelated to ARM vs x86. Pay attention also to the "Thermal package not designed for sustained full-power operation".

You are right, I was comparing CPU improvements with CPU+GPU improvements, my mistake. Still my point holds; the CPU improvement for Intel was about 10-15% between Sandy and Ivy and then dropped to about 5% for Haswell.

The Tegra 3 CPU is about 2x faster than the Tegra 2 CPU. Now look how the Exynos (dual A15) offers the same performance as Tegra 3 (quad A9) clock for clock. This implies the A15 is about 2x faster than the A9. Look at the new Apple A7:

Apple doesn't quite hit the 2x increase in CPU performance here, but it's very close at a 75% perf increase compared to the iPhone 5.

About MIPS, I think what you say is not correct, but the question here is not one RISC vs another RISC, but RISC vs x86 (CISC).
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Then the CEO saying that they will focus on mobile doesn't mean anything?



«Even though it uses "New Hardware," It's all custom, so it wont be true hardware.» True vs. untrue hardware LOL
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


It only proves that anyone can write a 'review' on Amazon and then someone can quote it for further nonsense in a forum.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


«That's 2.2x the relative performance of the FX-8350, up at the lower end of the K processors, using about 13 watts. Amazing.» What is amazing is that someone can write that.
 


No.
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860


Wow .. more utter nonsense

[Image: 3770k-vs-2700k.png (i7-3770K vs. i7-2700K benchmark chart)]


http://www.tomshardware.com/reviews/ivy-bridge-benchmark-core-i7-3770k,3181-24.html

IT NEVER EVEN HIT 10% ...
 

it was more successful than amd's own zambezi (what isn't? lol) and is more successful than all 6-8 core cpus. but it's nowhere near sb/ivb core i3 in oems. amd's apu success is majorly due to zacate and bobcat, and later llano. amd failed to make trinity capitalize on that (edit1: i don't mean like 'oh noes! apus going down!'. trinity's success is similar to llano's level, but at a later time when intel had also improved slightly). its weaker capability in lower power envelopes is one of the reasons. the other, less discussed reason is that intel might have done a little something to put pressure on trinity adoption (i didn't quite understand how they did it, just that oems preferred intel. iirc it had something to do with a price change). i know up to the level where i observed people buying core i3 and i5 machines (laptops, mostly) over the apus, despite those having underperforming hardware (dual core, weak@$$ igpus etc). worse, i noticed that trinity laptops were usually bundled with a single dimm of ddr3 1333 ram.
as for marketshare, kabini is their best bet. it's better than even zacate, i'll say. it's up to amd to get it done and done right, unlike in the past.


ahh. i get what you mean. in that case, intel's designs look pretty tightly integrated to me. i think silvermont may be different.

their philosophy is what drew me towards zambezi (and i got burned :)). i could see the potential for high performance, but the reality was different. besides, zambezi and vishera looked transistor-hungry to me. i mean, there's just the imc and the cores inside, so what are the rest of the transistors doing?

their mobile uarches seem good.. for business, not performance or technology. i could wish for a lot of things the technological advancements would bring, but these guys seem more keen on making money (management-driven).
yeah, somewhat like that. what intel's taking is from their own closed market, and they flat out refuse to go anywhere. even their so-called foundry business is solely to benefit their own ecosystem. since said ecosystem is shrinking fast, i think of intel as a big fish in a pond (formerly a lake) that's being slowly filled up, while its second big fish is growing wings and getting ready to move on/fly away. please don't be creeped out by the weird biological analogy. i apologise in advance. :ange:

what a load of loaded pile of bull (steaming gets added next year, geddit? :p). the phones, tablets, servers, hpc all use an os (i mean a workable environment) based on the kernel. how many users can directly use the kernel without a user-friendly gui? i am a newbie with the command line, and i had a hard time navigating the ubuntu and fedora command line interfaces. the linux forums didn't help much. from what i understand, the kernel is like middleware and it's not high level. it's closer to the machine than it is to the user. ps4 os, ios, osx: all of them thrive because of the user interface and the support. otherwise everyone woulda used the middleware. edit 2: the kernel is free to use, which lessens development cost significantly, enabling software vendors to focus on support.
the vendors that use the linux kernel pay coders to build user-friendly guis and other support so that the general populace can use them without frustration. <- that's what companies pay coders for, the support service.


 

hcl123

Honorable
Mar 18, 2013
425
0
10,780
They talk about crossfire as if it's going to stay centered on AFR (alternate frame rendering)... If hUMA/HSA is any good as a hint, tiling/supertiling across different rendering-processor boundaries becomes quite easy. hUMA/HSA is already crossfire.

That is why i don't see an APU replacing all of the high end (FX) yet; 512 sp, or even double that, is not good enough for all the future needs, unless like intel they want to kill the DT, for a market where x86 is doomed in the long run... but if they don't want to kill it, then a GPU at 20nm will easily have >4000 sp...

(Radeon: 4 raster engines; as with Tahiti, 4 blocks of 4 CU for each raster makes 16 CU x 4 rasters = 64 CU = 4096 sp at <400mm², easier to get an X2 and >8000 sp. GeForce: 6 rasters (from 5 in today's Titan), 4 SMX (192 sp each, up from today's 3) for each raster, makes 6x4x192 = 4608 sp at <500mm².)
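The shader-count arithmetic in that parenthetical can be sanity-checked quickly (64 sp per GCN CU is the standard figure; the raster/block layout is the post's speculation):

```python
# Checking the arithmetic in the parenthetical above (sp = stream processors).
cu_per_raster = 4 * 4              # 4 blocks of 4 CU per raster engine
radeon_cu = 4 * cu_per_raster      # 4 raster engines -> 64 CU
radeon_sp = radeon_cu * 64         # 64 sp per GCN CU -> 4096 sp
geforce_sp = 6 * 4 * 192           # 6 rasters x 4 SMX x 192 sp -> 4608 sp
print(radeon_cu, radeon_sp, geforce_sp)  # 64 4096 4608
```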

umm... all those issues will be addressed in next year's GPUs... more than addressed... and then how ridiculous 512 sp will seem, to the point that, if you put a top 2014 Radeon with an APU, you'd better turn the APU's GPU off.

No, AMD needs FX... and if smart HTX+PCIe combo slots (it's in patents) arrive, then hUMA (which needs cache coherency, and right now only HTT tech can provide it)/HSA and natural crossfire with discrete parts will be accomplished.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


LOL and more FUN.

First, that is only for "clock for clock".

Second, that they didn't get 8% in those tests doesn't imply other tests don't exist.



LOL "how many users can directly use the kernel without a user-friendly gui?" You are kidding, right? Because the other option would be that you don't understand the difference between a kernel, a CUI (or CLI), and a GUI.

This reinforces my belief that the average joe is completely ignorant about linux.

Pay attention to what I wrote, because I didn't mention "kernel"; I was talking about linux _distros_ and _OSs_. E.g. the SteamBox will use a linux _distro_. FreeBSD is an _OS_, not a kernel, and so on.

And user-friendly GUIs were not invented by Microsoft, despite what many people think.
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


Wow, you must need some help. True hardware meaning: they are not going to throw a real (true) AMD card in the console. The card in the console is an insanely modded one that has nowhere near similar performance to the real card.
 

Tell me, which is the better GPU? Speaking of which, about your GTX 680M analogy, a 680M is literally a 670 mashed into an MXM package. AMD can pull off putting 7860-ish GPU levels in the PS4's custom solution.
[Image: x360Gpu-Xenos.jpg (Xbox 360 Xenos GPU)]

or

[Image: sapphire-x1900xt.jpg (Sapphire Radeon X1900 XT)]

[Image: ps3rsx.jpg (PS3 RSX)]

or

[Image: 7900GTX-front-big.jpg (GeForce 7900 GTX)]

 
one more speculation.
looking back, when intel got outdone by amd in the past and faced problems, they started evolving their mobile-focused (laptop) uarch, right? sandy bridge, ivy bridge, haswell are all results of that.
meanwhile, amd was developing a performance, desktop/server oriented uarch with bd. i think that may have been the mistake. whoever gave amd engineers the goal was wrong. edit: a top-down approach like that, aiming for the high revenue market from the start, was way too risky. zambezi's pricing reflected that. if bd had been developed as a mobile (laptop) uarch along with the modular design, i think amd woulda been in a much, much better place money-wise. this is why i think jaguar is so good and why amd should evolve jaguar like intel did with, uh.. conroe was it? amd half-assed trinity by slapping together a clocked down dt core with a vliw4 igpu... not cool. the result wasn't bad, but it coulda been so, so much better if you try to imagine the potential. i want excavator to be like that, or whatever ends up getting the excavator title.
intel seems to be already grooming silvermont and its successors to be their next major uarch after haswell tops out. <- this is not a m.i.l.f. bait. i repeat, this is not a m.i.l.f. bait. it's a speculation about future amd cpus.
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


From one angle it looks great, but from all angles (reality), it's not going to be as amazing as everyone thinks it will be. Again, just because it says so doesn't mean it is. There's no point in setting your expectations so high.
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780
AMD posted the following on Facebook:


*AMD Gaming
Have you noticed the @AMDFX tweets about September 23? We’re two days away from celebrating one of the most successful…*


It may be related to the Athlon 64, which was released on September 23, 2003... could it be...

 

Perhaps we will have a Steamroller FX announcement on Sept 23 :D
AMD Athlon 64 X8 FX-85+ :p *Thinks of 3800+ 7900GT SLI AN8-32X memories*
 

montosaurous

Honorable
Aug 21, 2012
1,055
0
11,360
Steamroller should be interesting, but personally I'm far more concerned with Volcanic Islands. Getting kind of tired of my Radeon 7770, and I just got a job. Of course there are many other things I'd like to invest in, but a new GPU is first on my list. Hopefully they won't be too overpriced...
 

griptwister

Distinguished
Oct 7, 2012
1,437
0
19,460
SWEET Gibus! (TF2 Reference.) If this is true... the Titan has met its maker.

http://www.overclockarena.com/amd-radeon-r9-290x-with-hawaii-gpu-pictured-has-512-bit-4gb-memory/

On another note: Sept 23rd huh? I can only hope...

**edit** Well played Q6660 Inside! Well played indeed. Haha!
 

Radeon is back to over 9000, you know what this means. This may be the only truly great graphics card since the 5870 and GTX 480.
[Image: money.jpg]

[Image: photo_r98p128d_big.jpg]

http://www.youtube.com/watch?feature=player_detailpage&v=WOVjZqC1AE4
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Actually, there is a reason you can't clock them that fast: the core wasn't architected to run that fast. They will need to add more pipeline stages to do 3.5GHz, and the IPC will take a hit.
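That frequency/IPC trade can be sketched with a toy pipeline model (all constants below are invented for illustration): splitting a fixed logic delay into more stages raises the attainable clock, but each mispredicted branch now flushes more stages, so effective IPC drops.

```python
# Toy pipeline model, all constants invented for illustration.
def pipeline_tradeoff(stages, logic_delay_ns=10.0, latch_overhead_ns=0.1,
                      base_ipc=2.0, mispredict_per_insn=0.02):
    # Shorter stages -> higher clock, but each stage adds latch overhead.
    stage_delay_ns = logic_delay_ns / stages + latch_overhead_ns
    freq_ghz = 1.0 / stage_delay_ns
    # A mispredict flushes roughly the whole pipeline (~stages cycles).
    cycles_per_insn = 1.0 / base_ipc + mispredict_per_insn * stages
    return freq_ghz, 1.0 / cycles_per_insn

f8,  ipc8  = pipeline_tradeoff(8)
f20, ipc20 = pipeline_tradeoff(20)
print(f"8 stages:  {f8:.2f} GHz, IPC {ipc8:.2f}")
print(f"20 stages: {f20:.2f} GHz, IPC {ipc20:.2f}")
```

With these made-up constants the deeper pipeline clocks over twice as high but loses more than a quarter of its IPC, which is the trade being described.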



Again with the crippling? Yes, if you cripple the architecture by cutting off caches and RAM it will perform worse. Is that not obvious to everyone? What is the point of even making statements like that?



Understood, but that's a challenge for Mont-Blanc, not Intel or Nvidia. They're the ones who wanted to take a "cheap" cell phone processor and turn it into a supercomputer. They will have to do the extra work to make sure the package can handle the thermals, or contract with Nvidia for a custom and much more expensive version of the chip, which Nvidia is unlikely to do for under 1 million parts. Just not worth their time.



You're implying that ARM is just going to keep scaling forever, which is just not possible. Go back a generation (ARMv6) and ARM was an in-order processor. ARMv7 went to out-of-order execution, which got them into a more modern era of computing, but still a pre-2000s one. ARMv8 adds 64-bit processing, which finally gets them into server capability.

But where do they go from there? What is the next big leap for ARM now that they have played all the BIG tricks that AMD/Intel used a decade ago?

We already know mostly what they'll have to do: more transistors, larger caches, higher clock speeds, longer pipelines, more ALUs, wider memory interfaces. All these things ADD to power consumption. There is no free lunch here.
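The "no free lunch" point falls out of the usual dynamic-power relation, P ≈ α·C·V²·f: more transistors mean more switched capacitance, and higher clocks typically demand higher voltage as well. A sketch with purely invented values:

```python
# Dynamic power sketch: P = activity * switched capacitance * V^2 * f.
# All values are illustrative, not measurements of any real chip.
def dynamic_power(alpha, cap_farads, volts, freq_hz):
    return alpha * cap_farads * volts ** 2 * freq_hz

base   = dynamic_power(0.1, 1e-9, 0.9, 1.5e9)  # small mobile-class core
scaled = dynamic_power(0.1, 2e-9, 1.1, 2.5e9)  # 2x transistors, higher V/f
print(scaled / base)  # ~5x the power, well short of 5x the performance
```

Because voltage enters squared, pushing clocks (which usually requires more voltage) compounds with the extra capacitance of a bigger core.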

A lot of ARM's success is that they've had a 10-year technology road-map laid out in front of them. They're leveraging the pain, sweat and tears of Intel/AMD/IBM/Oracle, which have actually pioneered the technology.

Hindsight is 20/20, right? What will happen when ARM has to do some of their own innovation? They're actually a pretty small company of 2,500 employees, of which probably half are lawyers/sales.



Yes, great for Apple. Going 64-bit has its payoffs, which we've all enjoyed for the last decade in x86-64 land. I was impressed that Apple actually got the whole of iOS 7 ported to 64-bit so quickly, when Microsoft took years to do that.

You're also looking at a 1 billion transistor SoC. That's a big departure for ARM cores, which are usually touted as being smaller than x86.

http://en.wikipedia.org/wiki/Transistor_count

That part is in the transistor count realm of a 6 core Opteron 2400 or a 4 core Sandy Bridge i7-2700K.

Now if they wanted to double the speed and go to a 4-core version, they would have to go to 2 billion transistors. So much for small ARM chips. More transistors hurt yields and bring prices up. Just like Intel/AMD, they're going to be constrained by process technology and thermals.
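The yield cost of bigger dies follows from the classic Poisson die-yield model, Y = exp(-D·A). The defect density below is an invented value, but the exponential shape is the point:

```python
import math

# Poisson die-yield model: yield falls exponentially with die area.
# D (defects per cm^2) is an invented value for illustration.
def die_yield(defects_per_cm2, area_cm2):
    return math.exp(-defects_per_cm2 * area_cm2)

small = die_yield(0.5, 1.0)  # ~1 cm^2 die
big   = die_yield(0.5, 2.0)  # double the area (e.g. 2x the transistors)
print(small, big)  # doubling the area squares the fractional yield
```

On top of the lower yield fraction, a doubled die also means half as many candidate dice per wafer, so cost per good chip rises on both counts.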



RISC or CISC doesn't really matter. It's all about performance/watt. Intel has had RISC-like cores since the Pentium Pro and the introduction of micro-ops.
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


A 512-bit memory interface is only part of the issue... it remains to be seen if it has quad raster engines (Titan has 5).

http://www.3dcenter.org/news/erste-spezifikationen-zu-amds-hawaii-grafikchip

Then it will make something like a dual PS4:

http://translate.google.com/translate?depth=1&hl=pt-BR&rurl=translate.google.com&sl=ja&tl=en&u=http://pc.watch.impress.co.jp/img/pcw/docs/604/107/html/19.jpg.html&sandbox=0&usg=ALkJrhhf8kyaaZM0-ZgBs0k0dOHTH9UvEA

Between 2304 sp and 3072 sp (3 CU groups like the PS4 per raster, with at least 3 CU in each group (2304) up to 4 CU in each group (3072)).

Hawaii vs Tahiti: 50% more CU groups matches the layout of the PS4; if GDDR5 6500, then 50% more bandwidth; if 48 ROPs, then 50% more ROPs; if 3072 sp, then 50% more sp and TMUs...
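Under the quad-raster assumption above, those counts can be sanity-checked (the layout is the post's speculation; 64 sp per GCN CU is the standard figure):

```python
# Sanity-checking the quoted counts, assuming the quad-raster, PS4-style
# layout speculated above; 64 sp per GCN CU is the standard figure.
def shader_count(rasters, groups_per_raster, cus_per_group, sp_per_cu=64):
    return rasters * groups_per_raster * cus_per_group * sp_per_cu

low  = shader_count(4, 3, 3)  # 36 CU -> 2304 sp
high = shader_count(4, 3, 4)  # 48 CU -> 3072 sp
tahiti_sp = 32 * 64           # Tahiti: 32 CU -> 2048 sp
print(low, high, high / tahiti_sp)  # 2304 3072 1.5 (the "50% more")
```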

If they could fit all that in ~430mm², then it's a really amazing job of optimization, though i believe something more like 440mm² (it's not square)... performance should be >50% over Tahiti. It's GCN2, and the growth in structures points to that (a full-blown Hawaii should be at least 20 to 30% above Titan).
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


I got confirmation recently from an AMD fellow that Kaveri is on bulk, not SOI. It was already publicly stated anyway, so it's not a breach of confidence.

That doesn't preclude other SR parts from being on SOI, as the contact is in the APU division, not the server division. Same cores but different design teams.
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780
The Steamroller i was referencing was not Kaveri... but probably a 5M/10C part on the same 28nm half-node (relative to bulk) that GloFo is using for FD-SOI litho. It could be FD-SOI, but it could also be PD-SOI and use the same lithography (~35 to 40% smaller than 32nm)... and given the recent utter surprise of 22nm PD-SOI for a 650mm² monster at 4GHz (IBM Power 8) (edit)... i would prefer PD-SOI lol (a 5M/10C SR FX/server part would be less than 300mm², even with 3 channels of DRAM -> obviously it could never fit AM3(+), not even server C32, so most probably only 2H14... if not '15).
 

jdwii

Splendid
Looks like a decent difference if the numbers are true; at least we will be able to see the performance of steamroller this year.
http://wccftech.com/amd-desktop-kaveri-apu-13-cus-enabled-radeon-r5-m200-832-stream-processors-gpu-spotted/
 