AMD CPU speculation... and expert conjecture

Page 218
Status
Not open for further replies.


Refer to: http://www.tomshardware.com/forum/352312-28-steamroller-speculation-expert-conjecture/page-111#11150830 (Link fixed)

http://www.anandtech.com/show/1610/20 (insult2injury)
 


My guess is that the 2M/4C Steamroller die is the "most commonly used" die and is inexpensive to make, since it is being used in laptop, mainstream desktop, and 1P server parts. AMD seems to be taking a page from Intel's book, introducing the "small die" parts first and following later with the 2P/4P server-derived parts on a larger die (e.g. 2P is still on Sandy Bridge despite Haswell being released on desktop/mobile).
 

8350rocks

Distinguished
Thought I would share this info:

http://semiaccurate.com/2013/07/12/apple-has-their-own-fab/

Apple will fab their own chips...the ramifications of this could be widespread. For example...TSMC 20nm process could be freed up if Apple can ramp up to meet their own demands.
 

+1 MU_Engineer

12+ core Steamroller is likely going to be released for server products later in 2014, around the same time as the FX release. FX will likely come back on the same server-derived 4/6/8-core platform we have seen for an eternity. All we have really seen for now is the Kaveri\Berlin APU(s).
 


Unlikely, but it would be beneficial for everybody else.
 

It will be years until Apple can leave TSMC. I'm guessing at least 3. They might leave around the time they can get 14nm FinFETs up themselves.
 

cowboy44mag

Guest
Jan 24, 2013
315
0
10,810
Honestly, I think we have beaten the performance/watt horse to death, haven't we? Almost everyone on this thread is looking for desktop Steamroller processors, and since those don't run off a battery, power consumption is not an issue. As long as Steamroller can deliver the performance increase we are all hoping for, it will be a big success. Price/Performance is much more important to desktop users than Performance/Watt, period, end of discussion.
 

rmpumper

Distinguished
Apr 17, 2009
459
0
18,810


This one is better - a lot of games and apps tested with most of the CPUs available (both stock and OC'ed):
160 CPUs tested
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


I think people shouldn't expect miracles from SR. Take the latest A10-6800K benchmarks and add 30% for best case scenario. That's the best the 4C SR APU will do.

It's a good jump, but don't expect it to beat the higher-clocked i5s. It should put a hurt on the i3s though, which are its actual competition price-wise.
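
To put a number on the "add 30%" rule of thumb above, here is a minimal sketch. The function name and the baseline score of 100 are made up for illustration; they are not measured A10-6800K results, and the 30% figure is only the best-case guess from this post.

```python
def best_case_estimate(baseline_score, uplift=0.30):
    """Scale a baseline benchmark score by a fractional uplift (0.30 = +30%)."""
    if baseline_score < 0:
        raise ValueError("baseline_score must be non-negative")
    return baseline_score * (1.0 + uplift)


# Hypothetical baseline score, NOT a real benchmark number.
hypothetical_a10_6800k = 100.0
estimate = best_case_estimate(hypothetical_a10_6800k)
print(f"best-case 4C SR estimate: {estimate:.1f}")
```

Plug in any real A10-6800K result as the baseline to get the ceiling this guess implies for that test.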
 

rmpumper

Distinguished
Apr 17, 2009
459
0
18,810


Nope. The i3 has a crap GPU in it, so price-wise AMD's APUs are competing with Pentium CPUs + low-end discrete GPUs.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810



No one sells Pentiums with discrete graphics. People may build them custom, but you won't see them in stores.
 

rmpumper

Distinguished
Apr 17, 2009
459
0
18,810


No one cares about pre-built Walmart junk PCs.
 

howee

Honorable
Apr 10, 2013
21
0
10,510


Is that so? You would be surprised :na:
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


This one

http://www.phoronix.com/scan.php?page=article&item=llvm_clang33_3way&num=4

But don't get an aneurysm lol... it's not the "Opteron", it's the SOFTWARE (more tuned for it)... the same goes for all those PassMark results etc.

At least it's not "uber ridiculous"... yes, like 40%... this is >600% for some tests

If that doesn't "sink in", then perhaps it's better to cut off your head and replace it with a new one lol



 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860


another useless post.
Processor Intel Core i7-3940XM
Processor clock : 4,290 MHz

I bet you were hoping no one noticed that one
 

cowboy44mag

Guest
Jan 24, 2013
315
0
10,810


Don't get me wrong, I totally agree with your comment, however I am much more interested in Steamroller FX. I do believe AMD can get 30% improvement over A10-6800K, and that would be an impressive APU, however the FX is still the "performance" end of AMD. If they can get the same 30% improvement from FX-8350 to Steamroller FX, then the gap in performance becomes even more dependent on the software and fine tuning as to which brand is better. At any rate 30% improvement over FX-8350 would be a total victory for AMD and would be an awesome processor.
 

hcl123

Honorable
Mar 18, 2013
425
0
10,780


Wow! An actual post about the AMD uarch, probably even Steamroller-related, in an AMD uarch "expert" Steamroller thread? Naa! I think "juanrga" is derailing the thread lol

Alas, NO, decode is not the main problem. It can be, depending on the software (compiler). The problem with decode would be the same with a single core per "decode engine", that is, if there were only 1 integer core the problem would still be there: in x86 the "complex" decode pipe blocks the others when it gets a more complex instruction to decode, that is, one that takes more than 1 MacroOp (2 microOps, execute + memory), and then the 4 decode pipes act as if they were only 1. The same thing happens with Intel uarchs. (edit)

*IF* it were possible for the "complex decode pipe" not to block the others while decoding "complex" instructions (not easy, due to the "strong dependency model" of x86, which "a priori" sees every instruction as dependent on another), the same 4 decode pipes would act more like 2 decode engines (or more) than anything else...

I think that is what happens with Steamroller: it has the same 4 decode pipes as BD/PD, only perhaps instructions from one thread don't block instructions from the other thread. That is, Steamroller must have 2 "complex" decode pipes among those 4, which assume "non-dependent" instructions by checking the "contexts" and so relax the dependency checking.

Yes... Steamroller will most probably have the same 4 decode pipes as BD/PD, only arranged in a different fashion.

"Vertical Multi-Threading" is not the culprit either; actually it is what makes it clearly superior to any Intel uarch. VMT works on 2 "open contexts", that is, it is incredibly fast at "internally changing the thread contexts" (otherwise, if it were OS dependent, it would be slower than a Pentium I lol), and the operation is inherently "asynchronous" (quite difficult).

VMT works like this: contexts can change from one cycle to the next.

AA or BB... of course A_ or B_ or _ _ can happen (but there is NEVER AB or BA; that would be SMT, with no context changing needed) (edit)... but those gaps can also happen in SMT (simultaneous multithreading = Hyper-Threading), or in single cores... cache misses always lead to "bubbles & stalls".

So VMT is not the problem, since it's not only decode that is VMT: it's fetch, branch, and dispatch, including the "FlexFPU frontend". The decode problem can be fixed by relaxing the "dependency" checking and constraints when decoding complex operations (2 "independent" complex operations, necessarily from 2 thread contexts). MOAR = brute force, which always leads to disappointing results.

VMT is so nice... and since a "module" is an optimized pair of cores sharing resources, that is, one CPU core "jumped inside" another core to make a module, in the future I can see a "module" jumping inside another module (sharing) to make a 4 core/thread module lol

I also see VMT being used in a "horizontal" way: Horizontal Multi-Threading, using the same "asynchronous open contexts" as VMT (in a horizontal way), with each cluster (integer core) having more than 1 thread context, yet not being SMT (simultaneous multi-threading) but an evolution of the CMT (cluster multi-threading) concept.

[ UPDATE: actually, the scheme of BD/PD is A -> B -> A -> B -> ... It's "interleaving": one thread on one cycle, the other thread on the next. The beauty of VMT is that you have finer-grained control over execution, i.e., it can be A -> A -> A -> A -> B -> B -> B -> B ... or any granularity between 1 and 4 instructions...

Steamroller changes this to A-A -> B-B -> A-A -> ... that is, the "minimal granularity" becomes "2 instructions", in any combination up to 4 (it can be AAAA or BBBB). That simplifies things, probably wastes less power on context switches, and has no adverse effect on performance...

So the advantage of VMT over SMT is exactly the possibility of sharing resources with fine-grained control: on cache misses or other events, one thread never clogs the other. Contrast that with SMT/Hyper-Threading, which has had to have its resources augmented since Nehalem, because performance was often greater with SMT/HT turned off than with it on!

With VMT this is much better (OK, it doesn't approach the performance of 2 separate cores, but NOTHING will). If a thread just grabs all the resources (and even a thread waiting on instructions and/or data can grab resources, that is, doing nothing can grab resources lol), then the VMT control can simply switch to the other thread, which leads to much more efficient utilization of resources ]
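
The interleaving patterns described in this post can be written out as a toy model. This is only a sketch of the poster's description, not AMD's actual front-end logic: the function name `interleave`, the idea of a flat list of "decode slots", and the strict round-robin order are all assumptions made for illustration.

```python
from itertools import cycle


def interleave(threads, granularity, total_slots):
    """Return one thread label per decode slot, switching to the next
    thread after every `granularity` consecutive slots (round-robin)."""
    if granularity < 1:
        raise ValueError("granularity must be at least 1")
    schedule = []
    order = cycle(threads)
    while len(schedule) < total_slots:
        # Give the current thread a run of `granularity` slots.
        schedule.extend([next(order)] * granularity)
    return schedule[:total_slots]


# Granularity 1: the A -> B -> A -> B pattern described for BD/PD.
print("".join(interleave("AB", 1, 8)))  # ABABABAB
# Granularity 2: the A-A -> B-B pattern described for Steamroller.
print("".join(interleave("AB", 2, 8)))  # AABBAABB
# Granularity 4: the coarsest case mentioned (AAAA or BBBB).
print("".join(interleave("AB", 4, 8)))  # AAAABBBB
```

The model ignores stalls and bubbles (the A_ / B_ / _ _ cases); it only shows how the minimum run length changes the pattern.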
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


That's a nice article! Really goes into depth. That's one reason why I'm getting the Intel CPU over an AMD. ArmA 2. The 3770k and 3930k gained about 5 frames over the 8350. :3

EDIT: And also BF3. ~ 15 frames increase with the 3930k. :D
 




In value for money terms (170+)
3930K (If the user can take advantage of it)>8350>3570K\4670K>8320>3770K\4770K>3820>3960X>3970X:lol: IMO

There is no denying the 3930K is good, even 8350rocks said he would get one if he needed a step up from the 8350 :3
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


Yeah. I saw someone on youtube a couple days ago with one and they re-encoded a 30 minute, 14 GB 1080p video in ~ minutes... And it was at stock clock. I just can't wait to see "All CPU meter" with the 12 cores all there (6 physical 6 HT, obviously).



:D
 

jdwii

Splendid


Plus the most important thing you're forgetting is that in 50 or so years the price difference will make you the real winner
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


Totally!
 

GOM3RPLY3R

Honorable
Mar 16, 2013
658
0
11,010


Oh my god, I just died from laughing. That's definitely an upperclass education there. :D
 