AMD CPU speculation... and expert conjecture


noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860


depends on who you want to believe.
Toms: [chart: average-power.png]
Anandtech: [chart: 55330.png]
Xbit: [chart: power-3.png]
pcp: [chart: power-load.png]


the list goes on and on. very few even attempted to claim haswell draws less load power than ivy. the only thing hasbeen has going for it is a slight drop in idle power, and nowhere near the 50% less intel was claiming.

http://www.extremetech.com/computing/156739-intel-haswell-will-draw-50-less-power-than-ivy-bridge
 
What would be more interesting to me is a comparison of a 4.5GHz Ivy and its power consumption against HSW; that would tell us much more than these numbers do.
Are any savings still there at higher clocks? Or is HSW maxed out for DT, with its features designed more for mobile and no OCing in mind?
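For a rough sense of why savings measured at stock clocks might not survive 4.5GHz: dynamic power scales roughly as P = C·V²·f, and higher clocks usually need higher voltage too. A toy sketch with made-up capacitance and voltage numbers (not measured Haswell or Ivy figures):

[code]
#include <stdio.h>

/* Very rough dynamic-power model: P = C * V^2 * f.
 * c_eff is an effective switched capacitance; the values below are
 * illustrative placeholders, not measured silicon data. */
static double dyn_power(double c_eff, double volts, double freq_ghz)
{
    return c_eff * volts * volts * freq_ghz;
}

int main(void)
{
    double c_eff   = 20.0;                             /* arbitrary */
    double p_stock = dyn_power(c_eff, 1.05, 3.5);      /* 3.5GHz @ 1.05V */
    double p_oc    = dyn_power(c_eff, 1.25, 4.5);      /* 4.5GHz @ 1.25V */

    /* Power grows much faster than clock because V must rise with f. */
    printf("%.0f%% more power for %.0f%% more clock\n",
           100.0 * (p_oc / p_stock - 1.0),
           100.0 * (4.5 / 3.5 - 1.0));
    return 0;
}
[/code]

With these invented numbers the OC point draws ~80% more power for ~29% more clock, which is why stock-clock efficiency comparisons say little about behavior at 4.5GHz.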
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


It's not just hardware that's driving the market. Most of it is the software side, which always lags far behind the hardware. Intel could give us an 8C/16T CPU for the price of an i7-4770K, but they added more graphics instead. The issue is that current software just won't take advantage of the extra cores, so they really had no choice.

Even in the ARM world there is a disparity between what the hardware can theoretically do and what the software allows it to do. Take the big.LITTLE architecture that Samsung was first to use with its 8-core CPUs: it turns out it's not that easy to implement, and it doesn't cut power like it should.

OSes are going to have to get power aware, and I'm guessing that's years away from happening. OSes are all about virtualization and hardware abstraction, but in the end it comes down to real physics. If you have 4 cores and 2 of them are physically closer to the GPU block, it's not rocket science to guess which cores should be overclocked first.
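To make that concrete, here's a minimal sketch of the kind of topology-aware pick an OS could make. The core layout and distances are invented for illustration; a real kernel would read them from firmware tables rather than hardcoding them:

[code]
#include <stdio.h>

/* Hypothetical per-core floorplan info; a real OS would get this
 * from firmware/ACPI tables, not a hardcoded array. */
struct core_info {
    int id;
    double dist_from_gpu_mm;  /* physical distance to the GPU block */
};

/* Pick the core to boost first: the one farthest from the GPU,
 * so its heat overlaps least with the GPU hotspot. */
static int pick_boost_core(const struct core_info *cores, int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (cores[i].dist_from_gpu_mm > cores[best].dist_from_gpu_mm)
            best = i;
    return cores[best].id;
}

int main(void)
{
    struct core_info cores[] = {
        {0, 2.0}, {1, 3.5}, {2, 8.0}, {3, 9.5}  /* made-up layout */
    };
    printf("boost core %d first\n", pick_boost_core(cores, 4));
    return 0;
}
[/code]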

At least the hardware is adding to the tool set with more P-states, power gating, and per-core clocking. Enthusiasts will just have to keep getting better at wringing out performance. So far I haven't seen a single review of what happens with a Haswell i5/i7 with the GPU turned off or severely underclocked. With the GPU taking up to 65% of the die, that must have an effect on cooling.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Desktop is just one segment of Haswell, and it doesn't have all of the power saving modes enabled. There will probably be over 100 SKUs before the year is over.

"The new active idle (S0ix) states are not supported by any of the desktop SKUs. It’s only the forthcoming Y and U series parts that support S0ix."

 
But here again, going to more cores is stifled because it would mean throwing away the APU design, which is what average Joe wants and OEMs want, provided the igpu can function "good enough".
Going to a greater core count while dropping the igpu wouldn't help them in the larger picture, because once started, I doubt either company will go back, and volumes need to sell for margins, so the APU is here to stay.
Could they come out with an exclusive multi-core line?
We see them in the server chips, again because the pricing allows for margins.

Intel is stuck with their igpus, and needs to sacrifice die space for them as they mature and get better.
AMD's complete package is designed from the top down, achieving its needs save for power and BW.

SW won't solve density/heat/thermal/physics problems, and Intel won't make a separate line strictly for DT with DT-type enhancements when they can give good enough to all and focus on the changing market, which is mobile.
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


ummm.. not the voltage regulator, but non-deterministic approaches to TDP. That is, for very small time intervals you can go quite a bit above the targeted TDP (now even AMD uses this for the temperature turbo). The idea behind these techs is that if you execute faster, you have more time to throttle down or sleep, so your "average" power consumed over a longer time interval tends to be lower than if you execute slower with less power but spend less time idling (I think that's a good approximation of the idea).
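To make the averaging argument concrete, a toy race-to-idle calculation with invented wattages (none of these are measured Intel or AMD figures):

[code]
#include <stdio.h>

/* Race-to-idle arithmetic with illustrative numbers.
 * A fixed task inside a 10 s window:
 *  - fast chip: 40 W for 4 s, then 2 W idle for 6 s
 *  - slow chip: 20 W for the full 10 s, no idle time */
int main(void)
{
    double window = 10.0;

    double fast_avg = (40.0 * 4.0 + 2.0 * 6.0) / window;  /* 17.2 W */
    double slow_avg = (20.0 * 10.0) / window;             /* 20.0 W */

    printf("fast + sleep average:  %.1f W\n", fast_avg);
    printf("slow + steady average: %.1f W\n", slow_avg);
    /* The burst briefly exceeds any "TDP" near 20 W, yet the
     * average over the window ends up lower. */
    return 0;
}
[/code]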

The main problem with this is that the heat generated to push higher performance doesn't vanish as soon as the current applied to the circuits does. That is, clock gating or power gating is instant in cutting power, but heat dissipates much more slowly.
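A minimal sketch of that lag, assuming simple Newton's-law cooling and an invented thermal time constant:

[code]
#include <stdio.h>
#include <math.h>

/* After power is cut at t=0, die temperature above ambient decays
 * roughly as T(t) = T0 * exp(-t / tau). tau below is an invented
 * illustrative value, not a measured figure for any real chip. */
int main(void)
{
    double t0_c  = 40.0;  /* deg C above ambient when the burst ends */
    double tau_s = 3.0;   /* assumed thermal time constant, seconds  */

    for (double t = 0.0; t <= 9.0; t += 3.0)
        printf("t=%.0fs: %.1f C above ambient\n",
               t, t0_c * exp(-t / tau_s));
    /* Power gating took effect in nanoseconds; the heat clearly
     * takes seconds to drain away. */
    return 0;
}
[/code]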

Also, it seems to me the HSW fab process is pushed even further toward "low power" than IB's was, which might mean the general impedance of the circuits got higher. That's good for low clock/low power, but it tends to generate more heat past a certain supplied "current" level (not voltage), and the bigger a chip is, the more circuits the current passes through and the more heat it generates. With performance being almost identical, the chances of throttling down or gating are identical. You may even get lower average TDP marks, but heat gets worse.

At the end of the day the old rule applies: you can't get high clocks, high performance and low power at the same time... only rough approximations.



Now, that it will be massively rejected I doubt very much. This isn't new: even back in the P4 vs K8 days, Intel managed to sell far more chips in total than AMD. The P4 went into all sectors including servers, despite being clearly inferior to the K8 in every aspect... meaning not only was Intel never in a position as dire as AMD's today, it actually grew past the competition on clearly inferior products.

"Propaganda" always counted a lot in this affairs, and becnhmark business is his main tool. It will be worst now, or even more favorable to intel, because intel has the "money" and influence to subsidize large part of their sales, meaning some deals with those bigger OEMs can get so sweat that they are almost irresistible (meaning they can get the parts almost for peanuts, and so potentially make huge profits at the expense of non the wiser users that will pay premium for products that are not premium at all).

In the end it's really up to the users; no law or force can stop this once it's in runaway mode. And it's a trend not only in IT but in every commercial sector: advertising and propaganda are major factors, and other issues, including technical ones, tend to take the back stage (when not omitted or spun otherwise by the marketing).




 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


Yes, Nvidia is gonna feel the pressure more than AMD... and the principal "actor" from a technical perspective is gonna be AMD, not Intel. The discrete GPU market might shrink abruptly in the next few years: it's quite possible to put into an APU (or CPU+GPU) graphics performance on par with today's middle-high range (1024 SPs performing well above the same 1024 SPs of today is possible with that embedded/stacked DRAM), and that segment is by far the best selling.

To battle this would take *software* dedicated to specific hardware, like AMD's game bundles but extended to all multimedia apps, or going a competing route with chances of success... namely the ARM route.

Nvidia doesn't have the compute clout outside of HPC for the software part, no matter how much they delude themselves; AMD has it with HSA. CUDA simply isn't up to the task for client-side or heterogeneous multimedia compute apps.

OK, ARM is in HSA in force, but Nvidia is not... and in HSA the hardware implementations are almost free for every IDM. I doubt Nvidia will follow, or be authorized to follow, those standards without being a member; worse, they have a $1.5B deal with Intel. It starts to look like a rock and a hard place, except outside of x86. So a bet in force on something Tegra-like might be the only salvation long term. I'm quite convinced ARM will spill in force into the "mobile" and "small form factor PC" sectors with the 64-bit versions (servers, OTOH, are an almost done deal for ARM). Let's see what Nvidia invents, and whether they are any good at CPUs, but at the end of the day I'm afraid they may end up with no choice but to follow HSA.

So medium-to-long term, I think the battle will be more Intel vs HSA than Intel vs ARM, or Intel vs AMD for that matter. Nvidia will manage to be a considerable competitor if they manage to be a top HSA ARM supplier, which is the route AMD might end up following (transition to ARM and be a top HSA ARM supplier; at least ARM server is already settled).
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


Yes, it might be true. I don't know what deals AMD had back then with M$, but perhaps not as good as now. The BD OS "scheduling" issues were fixed, but by AMD, not MSFT. A NUMA-aware scheduler could have been done back then too. OK, suppositions; back then it was pretty much Wintel out in the open...

EDIT: OTOH I don't know if that was a result of the intense "probe" traffic between sockets clogging memory access. A HyperTransport ccNUMA system is a seamless SMP (symmetric multi-processing) system. Server loads might not feel this much because they have much larger, better-aligned data sets; client software might not be nearly as well behaved. Probe filters, or "HT assist" (a sparse directory-based coherency mechanism), only came with Istanbul... QuadFX could have been launched then...



1) Most of it is the SOFTWARE side, period... even the much-debated benchmarks are *software*, "arranged" to help sell hardware. Hardware without software = paperweight (as many have said), no matter how good the hardware is.

About the 8C/16T, it just doesn't make any sense now for the desktop. What people are parroting from propaganda, without giving it a second thought, is that AMD is a failure because it has low single-thread performance; that already counts against half of what you propose. And now you want double the thread count without any client-side software able to take real advantage of it!? First, a very strong push to make general apps much more thread-aware is undoubtedly necessary (multithreading hardware without software = paperweight in full force).
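Amdahl's law makes that point concrete: if only a fraction of an app runs in parallel, doubling the core count barely moves the needle. The 60% parallel fraction below is an invented example figure, not a measurement of any real client app:

[code]
#include <stdio.h>

/* Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n),
 * where p is the parallel fraction of the program. */
static double speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

int main(void)
{
    double p = 0.60;  /* illustrative: 60% of the app parallelizes */

    /* Going from 8 to 16 threads gains almost nothing at p = 0.6. */
    printf("8 threads:  %.2fx\n", speedup(p, 8));   /* ~2.11x */
    printf("16 threads: %.2fx\n", speedup(p, 16));  /* ~2.29x */
    return 0;
}
[/code]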

But it happens that BD, even the first one, was better at multithreading for the same number of threads... Don't expect such a push for multithreaded client software anytime soon from the leader; it just doesn't suit them, and they have a very great influence (in any direction). Also, GPGPU compute power will improve by at least an order of magnitude more than CPU cores alone (it can't show in the much more CPU-centric Lucmark, for example). It seems a lost battle, because Intel hasn't been able to stop it (it only remains to adopt it, which they are doing), but again this needs software, and there I think HSA has the upper hand.

2) ARM software is much cleaner and less bloated (the ARM ISA helps here; it's RISC with a relaxed memory model). So the software side can very well compensate for any lack of theoretical max GIPS or FLOPS, meaning you don't feel any sluggishness; as a matter of fact it's as snappy as any other top system. (Performance is dictated by the software... as clear as water... and yes, that includes benchmark software.)

3) I think the power frenzy, even in mobile/portable, will lose a good deal of its appeal in the next few years: batteries with 2 to 3 times higher energy density are sprouting onto the market right now, and it will not take long before general adoption... then performance, and what new or better tricks these devices can do, will draw new interest, and here I see "heterogeneous" dominating (from face recognition to gestures to speech commands, etc.)

 
i've been thinking about good enough...
the first thing that popped into my mind was an hsa-enabled arm apu with gcn 2.0/3.0 shaders made by amd on a problem-free process using the usual techs like rcm, hdl, fd-soi (as far as my imagination allows :)) etc. with proper power monitoring and controlling logic and better power gating.
a little more thinking started to show the gaps in that fantasy.
amd doesn't have a working mainstream arm core yet. jaguar, otoh, seems perfectly suited despite being x86 arch. so if i imagine the same thing with arm replaced by a jaguar or jaguar-successor core, it looks more realistic and less gap-y.
amd is process-bound i.e. they may have a good arch but be hindered by process. i assume something like that happened in 2009 and again with llano (which cost them the apple contract).
apple (or any other high-volume arm customer) can monopolize a process node and lock amd out. i suspect apple might be doing that to nvidia @28-20/14nm but i cannot confirm.
amd's power monitoring and power gating needs work.
[strike]current bd arch seems transistor hungry.[/strike] needs to be tuned to use fewer transistors, keeping the modularity and semi-customizability intact.
edit: whoops. i am not so sure about this one. i was blaming the higher power use on a linear increase with transistor count.

but when i consider overall good enough - in terms of power efficiency, software ecosystem, device availability, choice of hardware - arm has been dominating that for a few years now, and it has resulted in x86 being increasingly cornered.

the rather unethical business practices seem a bit different from intel vs the rest of x86. there's incessant patent-trolling, followed by process monopolizing, passive slandering and massive p.r., and locking out popular apps. the recent new trend seems to be restricting supply sources, ranging from raw materials to socs and other components like display panels.
....
okay it's pretty much the same as intel vs others. it's corporate business... sigh...
what i'm trying to say is that amd might be getting out of the x86 pond, but when they swim in the arm sea, they'll be part of a much larger food chain, and they'll need to stay at or near the top as much as possible or risk being eaten. figuratively typing.



nothing good.
...
but great news: kaveri exists. in silicon form. and it runs, games and stuff. haven't seen anything official/solid about specs yet. huma, hsa, gcn igpu are pretty much confirmed.
why won't amd just come out and say kaveri has steamroller cores? :S they didn't hide anything about kabini. kaveri's on track for a 2013 launch; not sure if it's a paper launch or a real launch.

@gt3e: apple wanted gt3e with ivy bridge iirc. intel declined, citing scaling reasons (a poor excuse imo). i suspect intel coulda launched hd3000 with clarkdale/arrandale cpus but they were being lazy. getting as much shader scaling and perf as gt3e roughly within a year of hd4k's launch is suspicious. intel's igpu roadmap also looks suspicious. y'know, the one with 40x and stuff. anywho, now intel sees enough reason, and possibly a few other oems might have asked for it too, giving intel enough incentive to extort money er... finally implement gt3e and put it in mass production. any gt3 that fails notebook validation might wind up in nuc/brix or a-i-os as hd5000/5100 (likely @$500-700+).
seriously, no one sees the innuendos in those names? :pt1cable:

edit2: really want to see a duel between 4-core silvermont + gt3 (40 eus) vs 4c jaguar + 384 gcn shaders!!
 
As some have discussed, it's about using less power for mobile, since that is the fastest growing market, and where there's money to be had...
As hcl brought up, it isn't just our chips controlling this, as there are potentially huge gains to be had in batteries:
http://www.extremetech.com/extreme/157525-new-sulfur-based-battery-is-safer-cheaper-more-powerful-than-lithium-ion
Now, this isn't the only one on the potential block either, but it's a good example.

So, again, this brings us around to the good enough scenario, where power usage may no longer be such a driving issue in the mobile market, as cheap energy, once found, brings a boom in performance.
If you have a weakness, a good-enough fail, then you could be left behind.
When it comes to average Joe, and the decent, sustainable markets he participates in, where progress and monies abound, it's no longer about whose cpu is microseconds faster when they're all good enough for average Joe, but about the entire experience of Joe's device, which includes gfx capabilities as well.
Throw anything at the chip that Joe may like to use; it should work, good enough.
 

8350rocks

Distinguished


But system-wide numbers are up... at idle it's slightly lower, but as soon as you open a web browser... you're drawing more juice than IB was. More battery life on standby, sure... more battery life at load? LOL... yeah... they'll keep saying that, but it isn't true.
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


Yes, what AMD has stated about ARM is SeaMicro cloud servers... and there are rumors about "hybrid" systems too, that is, x86+ARM. I don't know at what level that will happen (IF it does): at the platform level, mixing those SeaMicro micro-blades, some with x86 and others with ARM, or developing an entire die where ARM is mixed with x86 (perfectly doable; as a matter of fact the Chinese Godson is MIPS with x86 at the die level).

On the client side, I doubt AMD will have ARM anytime soon. But they could arrange a deal a la the game consoles (PS4, Xbox) and put a full HSA GCN-based GPGPU on some ARM 64-bit offering (which they are also developing for servers) from someone in the ARM armada. AMD is ahead in the HSA game; this deal would mean that that ARM dealer would be ahead on HSA too, and most probably at games too, compared with PowerVR or Mali or Nvidia.

A good candidate I see would be STMicro [the A57 64-bit can be up to 30% above the A15 in performance and so ~2.3x above the A9, yet consume less power than that last one; and STMicro at 2.5 to 3GHz (which they showed for their eQuad), with a piece like that able to go down to less than 2W (a dual-core A57 version), would be truly superior in every aspect (missing only Android 64-bit, which is on the ramp too, I think)]. They could make a deal trading licensing for the superior STMicro FD-SOI process that is to be fabbed at Glofo. It could be beneficial to all. Temash could suffer from additional competition, but in exchange AMD could have the best process bar none for low-power chips across all its APU offerings.
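That ~2.3x figure is just ratio chaining; a one-liner to show the arithmetic (the A15-vs-A9 factor below is an assumed marketing-style figure, not something stated in this thread):

[code]
#include <stdio.h>

/* Chaining the cited ratios. The A15-vs-A9 factor is an assumption
 * used only to show where a ~2.3x A57-vs-A9 number can come from. */
int main(void)
{
    double a15_over_a9  = 1.75;  /* assumption: A15 ~1.75x an A9  */
    double a57_over_a15 = 1.30;  /* "up to 30% above A15" (cited) */

    printf("A57 vs A9: ~%.2fx\n", a15_over_a9 * a57_over_a15);  /* ~2.28x */
    return 0;
}
[/code]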

Speculation alright, but a gran-entrance in the smarth/super phone business for AMD (and what STMicro will miss to catch a much better portion of market)...

but great news: kaveri exists. in silicon form. and it runs, games and stuff. haven't seen anything official/solid about specs yet. huma, hsa, gcn igpu are pretty much confirmed.
why won't amd just come out and say kaveri has steamroller cores? :S they didn't hide anything about kabini. kaveri's on track for 2013 launch, not sure if it's paper launch or real launch.

They already did that "officially". And the first version will be 28nm (officially stated too), and, as stated in an interview, likely a bulk Glofo process. If the first versions of Kaveri are mobile, especially centered on ultra-thins, low-power bulk can be a good bet, and most probably these pieces won't have GDDR5 and will be at most 384 SPs.

It's quite possible that higher versions, top laptop and desktop parts, will have GDDR5 (but probably only beginning in 2014), be up to 512 SPs, and be made on Glofo 28 SHP, which is 28nm PD-SOI. At least that's what it leads one to suspect: Glofo talks plenty about 28nm, 20nm and even the 14nm XM finfet, all bulk processes... but Glofo has never said what the 28 SHP that appeared in some of their official slides actually is.

http://translate.googleusercontent.com/translate_c?depth=1&hl=pt-BR&rurl=translate.google.com&sl=ja&tl=en&u=http://pc.watch.impress.co.jp/docs/column/kaigai/20130425_597459.html&usg=ALkJrhj5iCPDW56y7_m02XjJ0ay7X9SLuw

http://translate.googleusercontent.com/translate_c?depth=1&hl=pt-BR&rurl=translate.google.com&sl=ja&tl=en&u=http://pc.watch.impress.co.jp/docs/column/kaigai/20130502_598132.html&usg=ALkJrhiw5Sent091tNFVo1nU1Bw3hWz11A

If it's 28nm PD-SOI (?)... wise move, AMD, but it must have been a "greased pigs" contest with Glofo to get it lol



For electronics, Li-Ion batteries are usually still a little below 200 Wh/kg (though there are some -Panasonic- around 240). Envia now has a "commercial" tech already at 400 Wh/kg (it's on the market) http://enviasystems.com/ (they focus on automobiles, but it's perfectly adaptable I think).

Power densities can also grow (at 1100 W/kg you could run a full desktop on this), not only energy densities, meaning portable chips with higher clocks, but also ones that last longer in operation (more energy density) and withstand more deep cycles (up to 10,000).

http://www.greencarcongress.com/2013/06/zsw-20130604.html
(pdf in article)

The 4 to 5x higher energy densities, like those sulfur batteries, are still a few years away, I think.
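For scale, a quick back-of-envelope with those cited densities (the 150 W desktop draw is an assumption for the arithmetic, not from the articles):

[code]
#include <stdio.h>

/* Back-of-envelope check on the cited densities (illustrative only):
 * ~400 Wh/kg energy density (Envia-class cell) and ~1100 W/kg power
 * density (ZSW-class cell). Desktop load is an assumed round number. */
int main(void)
{
    double desktop_w = 150.0;   /* assumed average desktop draw */
    double wh_per_kg = 400.0;   /* energy density */
    double w_per_kg  = 1100.0;  /* power density */

    double mass_for_power = desktop_w / w_per_kg;   /* ~0.14 kg */
    double hours_per_kg   = wh_per_kg / desktop_w;  /* ~2.7 h   */

    printf("cell mass just to deliver %g W: %.2f kg\n",
           desktop_w, mass_for_power);
    printf("runtime per kg of cells: %.1f h\n", hours_per_kg);
    return 0;
}
[/code]

So power delivery stops being the constraint long before energy capacity does, which is the point about runtime and deep cycles above.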

 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


ummm... yes (EDIT lol). Where failwell tech will shine the most is exactly at standby and idle power: the whole chipset is managed from the chip die, and the voltage regulators are on "socket", with a large part of all the voltage management on the chip die.

Yet this doesn't mean much for load power; it's a tech more suitable for phones, suited to spending a lot of time sleeping. The problem with failwell is "heat" (the more heat tends to accumulate, the higher the TDP will be); those small form factors can't have good coolers, and Intel with Haswell is standing in the middle ground: real low-power techs (even a process tweaked more for this) on a general-purpose chip that goes all the way up to mainstream desktop offerings.

Running way up into the 3GHz range is quite different from running up to 2GHz, where Haswell feels comfortable; push clocks a little further and the heat starts to show. [This process is clearly not for high clocks; 84W compared with the 220W for PD-SOI could be as much a factory OC as the Centurion. Different processes, different possibilities, which doesn't invalidate that you can push any process as high or as low (Trinity PD-SOI at 17W, for example) as possible independent of power; it's just not the best option.]

 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860
Here is part of the problem with hasbeen.

[chart: figure4.gif - VRM efficiency vs. load]


it's one chip, one design, for low power. look at the vrm efficiency. if you design it for low power, it sucks at high power. if it's designed for high power, it's not quite as good at low power. MB designers could hand-design their boards for the end user's purpose.

This chart, I think, was from ~2006. Today efficiency peaks at ~95%. If you hand-design your entire system for the load that will be drawn (easier and commonly done on servers), you can hand-pick your vrm module such that it's always, or almost always, at 95%. The curve difference is still the same: if designed for low power, it peaks really fast and drops off fairly quickly. If it's designed for high power, it peaks slower but doesn't drop off nearly as fast (which is why 6 or more VRM phases are essential for overclocking).
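To illustrate the hand-picking idea, a toy model in C; the curve shapes and peak-load numbers are invented for the example, not vendor data:

[code]
#include <stdio.h>

/* Toy VRM efficiency curve (invented shape, not vendor data):
 * efficiency peaks near a design load and falls off away from it.
 * Normalizing by peak_load_w makes the low-power design fall off
 * much faster in absolute watts, as the chart shows. */
static double efficiency(double load_w, double peak_load_w, double rolloff)
{
    double x = (load_w - peak_load_w) / peak_load_w;
    double eff = 0.95 - rolloff * x * x;  /* ~95% at the sweet spot */
    return eff > 0.0 ? eff : 0.0;
}

int main(void)
{
    double expected_load = 120.0;  /* watts the build will really draw */

    /* low-power design peaks at 30 W, high-power design at 130 W */
    double lp = efficiency(expected_load, 30.0, 0.04);
    double hp = efficiency(expected_load, 130.0, 0.04);

    printf("low-power VRM at %3.0f W:  %.1f%%\n", expected_load, lp * 100.0);
    printf("high-power VRM at %3.0f W: %.1f%%\n", expected_load, hp * 100.0);
    printf("pick: %s\n", hp > lp ? "high-power design" : "low-power design");
    return 0;
}
[/code]

With these made-up curves, the low-power module sits around 59% at a 120 W load while the high-power one stays near 95%, which is the whole argument for matching the VRM to the expected draw.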

With haswell... it's designed for low power apparently (see idle power), so at high power you get an inefficient design. This is intel's way of giving the finger to desktop computers, and even more so to overclockers.



 
I agree; for somebody like you or me who mainly runs "real" programs on Linux, a super G34 (or 2 or 4 of them) would be much better than the Centurion, because it would boost multithreaded performance a lot, and that is what matters. Tech sites are paid for by traffic. The big popular sites are all predominantly gaming/overclocking/benchmarking oriented. There is only one I can think of that even touches on Linux, and that's a partially subscription-funded site (Phoronix). Unfortunately many people seem to care a lot about benchmark bragging rights, so the fake/canned benchmarks really DO mean something to them.

This is why industry uses SPEC as their benchmarking service. You can contact them and they will analyze whatever software you're using to create trace sets (profiles and sets of code that you're actually running) and from those create a user-specific benchmark for you. Either they or you can then run that benchmark on different platforms to determine actual real-world performance. It's not particularly cheap, but if you're going to be spending a few mil USD on a new engineered system, it pays dividends to know what HW runs your code best.

The best thing about SPEC is that they are incredibly transparent: they will tell you everything about the conditions of a particular benchmark (compiler settings/flags). No hidden data or biased info, just plain performance data.
 


Do not make me start analyzing compiler signatures again; didn't I debunk the "everything uses ICC" myth already?

I know of exactly ONE benchmark that uses ICC; everything else uses MSVC, because it is the best development studio on God's green earth, and nothing else comes close.
 

that's your beef with haswell? :lol:
wait till you see the haswell u-series specs. no pcie 3.0 support. no discrete gfx support. <- this may trickle down to desktop broadwell or skylake.
the nuc/brix pcs use u-series cpus iirc.
bleak times ahead. :D
edit: at this point, i think sandy bridge was the last desktop-oriented(!) cpu lineup from intel.
 

abitoms

Distinguished
Apr 15, 2010
81
0
18,630
(the following - you might already know it)

i am a faculty member and called up an AMD server representative in my country (not JF-AMD !!!)

we were discussing the possibility of adding some AMD hardware at our institute, and he said HSA hardware is coming in the OND 2013 period. OND = Oct/Nov/Dec

I think he was referring to client hardware. He didn't mention whether it was going to be a paper launch or a real one. And he also said AMD+ARM servers in 2014. I don't recall if he said Q1 or Q2. He said the delay to 2014 is because AMD is waiting for 64-bit on ARM.
 