AMD CPU speculation... and expert conjecture


griptwister

Distinguished
Oct 7, 2012
@cowboy44mag: For sure man! Haha! Though I will say, in certain games that 10 FPS will make a difference in smoothness. But a GPU is a far more important upgrade than a CPU. Also, I wish I had an HD 7970! Lol, enjoy it man! I guess I'll wait for the 8XXX series for my next serious upgrade.

@de5_Roy: June? Man... I was hoping earlier, that way they can rush Kaveri out :D Lol. Whether that's how it works or not, I was hoping for it. I also forgot it's already May!!! :O Time flies.
 


Well, Kaveri will have 1GB of embedded GDDR5, yet there seems to be much contention about that. We now know it has unified memory controllers as well, so it no longer leeches system resources for dedicated usage. The leverage Intel has is being able to drive the graphics clock up to ridiculous levels, but that has a downside. Most of us are expecting DT GT3 to be more or less on par with the top-end Trinity part; which one you choose to base a system on comes down to money. The APU is much cheaper, but the Intel platform will be faster.

Haswell will need to be good, as it will likely have three generations of APUs competing with it: Trinity, which will be EOL soon; Richland, which is looking really good; but the major step up is Kaveri, which is probably going to be the first graphics-on-chip solution to play at a mainstream level.



"how effective any piece of hardware is directly related to the software tied into it. how irrelevent it makes itself and empowers the software around it."

Right now software writers are still pushing single-thread optimisations, leaving the world of multiprocessing and heterogeneous system processing, and the enormous potential it offers, to drive x86 into bedrock. Well, we have seen a number of game developers say in the last month or two that, with consoles unifying the market, they now have something new with which to maximise performance in a non-traditional manner. While specialized silicon was acceptable and understandable for older consoles, it is no longer pertinent to today's climate. Sony and AMD had been in discussion for longer than most will allow themselves to accept. This wasn't a case of "AMD is the only player with an all-in-one, so let's just take it because it's cheap"; that is blatant naivety about what AMD and Sony have produced. Those who have had the chance to see the PS4 all say it's nothing like they have seen before, but the effectiveness and the true beauty is not the specs of the PS4, it's how hardware, firmware and software have created a fully heterogeneous system capable of seamless, efficient execution across all its resources.



 

yeah, it's just one month away. richland is a stopgap lineup since kaveri was the one supposed to launch in summer (llano, trinity all launched in summer...may-june). this way amd keeps up with their promised 'new apu every year' and 'one more apu for socket fm2' mottos.
 

Cazalan

Distinguished
Sep 4, 2011


HP is offering said cheap disposable servers with their Project Moonshot. It's currently Intel based, but they've added ARM to the mix. Would HP do that if they didn't also have the software stacks, services, management, and expertise to offer those solutions with a competitive advantage? It means they've already done that work.

Maybe Intel/MS will get to maintain key aspects but the growth segments are what matters most.

Record numbers of smartphones/tablets and soon watches are being added. What is servicing this new volume of clients? If the infrastructure to handle this increasing load was that locked to Intel wouldn't their sales also be increasing at a comparable rate or some fraction of that? Instead of being flat?

The numbers suggest it's not Intel, so it must be going to PPC/Sparc or ARM.

http://seekingalpha.com/instablog/3211861-awannabe/1612281-intel-s-monopoly-to-serve-computing-needs-is-gone
 

juanrga

Distinguished
BANNED
Mar 19, 2013


http://www.geek.com/chips/the-inaccurate-benchmark-conundrum-553369/

http://techreport.com/review/17732/intel-graphics-drivers-employ-questionable-3dmark-vantage-optimizations

http://semiaccurate.com/2012/01/09/intel-fakes-ivy-bridge-graphics-on-stage-at-ces/
 


The entire GPU buffer isn't reserved by the OS. I think only about 512MB or so is reserved at any one time.

Coincidentally, we probably will never need more than about 36 bits or so of address space (which works out to 64GB). Even a database of the DNA molecule doesn't need more than that. Hell, even weather simulation engines (which are MASSIVE) only need about that much RAM. So I'll be happy once that switch is done with, as I doubt we'll need a bigger address space in my lifetime.
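(Just to illustrate the arithmetic, a trivial C snippet I threw together, nothing authoritative: every extra address bit doubles the reachable memory, and 36 bits lands right at 64GB.)

#include <stdio.h>
#include <math.h>

/* Back-of-the-envelope: how much memory N address bits can cover. */
int main(void)
{
    const int bits[] = {32, 36, 48, 64};
    for (int i = 0; i < 4; i++) {
        double gib = ldexp(1.0, bits[i] - 30);   /* 2^(bits-30) GiB */
        printf("%2d address bits -> %.0f GiB\n", bits[i], gib);
    }
    return 0;   /* prints 4, 64, 262144, and ~17 billion GiB */
}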

Since tessellation is already implemented, as well as physics, these new consoles could very well use their resources, and again, if the big houses choose not to do so for money's/profit's sake, the little guys will, as well as some larger groups.

Physics is NOT implemented. People really don't get how incredibly simple most physics implementations are these days. Single-object interactions are really easy to compute; multiple-object dynamic interactions are NOT. If you want a good physics implementation, you need a massively parallel processor, because you will bog down on a serial CPU. Tessellation, meanwhile, is almost entirely GPU bound.

 
You're oversimplifying the issue in my opinion. I mean is there a massive difference in this particular workload between CPU and GPU?

Given how the workload involved the rendering of grass, and given how much grass exists, yes.

Also, on the other hand, in situations where the GPU is the bottleneck, overall throughput is increased this way. If a game is performing at ~30fps at 1080p resolution on a 7970/680, I'd wager that most people are going to be GPU bottlenecked.

And anyone not using an 8-core processor will be CPU bottlenecked. The difference is that GPUs get more powerful faster, so if you are going to overload something, overload the component that is actually improving at a faster rate.

Also, anyone with a $400+ graphics card is likely going to have a CPU strong enough to handle the load. So I think it's a safe statistical bet that the overwhelming majority of gamers are going to benefit from their design choice.

Provably false:

[Image: Crysis3-CPU.png]


With a 680 GTX, which isn't exactly a slouch. Notice the walking CPU bottleneck?

By contrast:

[Image: 1920-VH.png]


With a 3960X, you see clear evidence of a GPU bottleneck, but given the results above, that would be expected. So we can conclude a 3960X is NOT a CPU bottleneck, even with CF/SLI/Titan. But with a single 680 GTX, it's clear that you fall off very quickly as you drop down the CPU ladder. Other sites show a similar trend: a single 680 GTX is clearly bottlenecked by the CPU. [Now, I would LOVE to see a slightly slower CPU as the baseline for the GPU tests, say a 3770K rather than an SB-E. That would prove beyond a doubt what is happening.]

Right now software writers are still pushing single-thread optimisations, leaving the world of multiprocessing and heterogeneous system processing, and the enormous potential it offers, to drive x86 into bedrock.

Open up Task Manager, add the "Threads" column, and look at how many threads programs are using.

People really don't get it: program flow is sequential in nature, and everything that programs CAN parallelize is ALREADY offloaded to the GPU; that's the point of APIs like CUDA and OpenCL, and the reason why AMD is embracing HSA. Software developers like me are doing everything in our power to get parallel work OFF the CPU, because it does so poorly at it.

So on one hand, I get yelled at by paper software gurus because I don't maximize CPU potential. Then when I do, I get yelled at for not using OpenCL/CUDA/WhateverElseIsHotThisYear to offload to the GPU.

Meanwhile, thread control structures (especially mutexes and semaphores) break when exposed to a multiple-core environment, so I have to carry around a performance hit every time I use them to prevent them from crashing the OS. But hey, as long as Task Manager shows 100%, I guess you're happy, even if the program runs like crap.

So how about this: next time I write a program, I'll thread the following for CPU cores that aren't doing any work:

while (1)
    i++;

100% core loading! Exactly what you all want.

Pardon my language, I am really getting pissed at people telling me how to do my job when they have no clue how code works, how OSes handle thread and memory management, how current thread control structures break when exposed to multiple cores, how platform-independent threading APIs (pthreads) are functionally broken (no way to suspend a thread), or how there's no performance benefit to threading light-work workloads, even if they do scale well. Because Task Manager is the holy grail for performance management, and if they can't justify the purchase of their $10k CPU, they get upset and have to whine and complain at someone to make themselves feel better.

So you know what? I'm done. Have fun with the groupthink; I'll be back in a few months, going "I told you so" once again.
 

mayankleoboy1

Distinguished
Aug 11, 2010
Meanwhile, thread control structures (especially mutexes and semaphores) break when exposed to a multiple-core environment, so I have to carry around a performance hit every time I use them to prevent them from crashing the OS,

Lol yeah. People don't realise how easy it is to crash the OS/program by threading if you don't use a hell of a lot of mutexes and semaphores. But if you do use them, a lot of the time the code is doing nothing but waiting for the critical section to complete, hence nullifying a lot of the multithreading effort. So the code is a compromise between speed and correctness/stability.
And this is on the "normal" piece of code, which is not parallel in nature but has to be forcibly threaded using the above-mentioned semaphores/mutexes. These workloads don't scale well beyond 3-4 cores, and really need one big core rather than hundreds of small ones.

Most people think of a "parallel" workload as "the massively parallel one," which scales almost infinitely and doesn't require much synchronisation and mutual exclusion.
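(To make that concrete, here's a toy pthreads sketch of my own, nothing from real code: four threads all funnel through one mutex. The result is correct, but the lock serializes the work, so throughput is roughly single-core speed no matter how many cores you have.)

#include <pthread.h>
#include <stdio.h>

/* Minimal sketch: 4 threads all contend on one mutex, so the
 * "parallel" work is effectively serialized by the critical section. */
#define THREADS 4
#define ITERS   1000000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);     /* every thread waits here...      */
        shared_counter++;              /* ...for a tiny piece of work...  */
        pthread_mutex_unlock(&lock);   /* ...then hands the lock over.    */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    /* Correct answer, but barely faster than one core, because the lock
     * serializes everything. Drop the mutex and it runs faster -- and
     * produces a wrong (racy) result. */
    printf("counter = %ld\n", shared_counter);
    return 0;
}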
 

kettu

Distinguished
May 28, 2009
Given how the workload involved the rendering of grass, and given how much grass exists, yes.

You're not giving out a lot of details here.

And anyone not using an 8-core processor will be CPU bottlenecked. The difference is that GPUs get more powerful faster, so if you are going to overload something, overload the component that is actually improving at a faster rate.

Only partially true. If an 8-core is not a bottleneck, then it is irrelevant which improves faster, since an 8-core is already enough. Besides, a bottleneck doesn't necessarily mean the game is unplayable.

Provably false:

With a 680 GTX, which isn't exactly a slouch. Notice the walking CPU bottleneck?

By contrast:

With a 3960X, you see clear evidence of a GPU bottleneck, but given the results above, that would be expected. So we can conclude a 3960X is NOT a CPU bottleneck, even with CF/SLI/Titan. But with a single 680 GTX, it's clear that you fall off very quickly as you drop down the CPU ladder. Other sites show a similar trend: a single 680 GTX is clearly bottlenecked by the CPU. [Now, I would LOVE to see a slightly slower CPU as the baseline for the GPU tests, say a 3770K rather than an SB-E. That would prove beyond a doubt what is happening.]

And as I said, most people with a high-end graphics card are likely to have a good enough CPU. Those people are not negatively affected. People who have a strong enough CPU (which, by the way, doesn't cost that much compared to high-end GPUs) but a weaker GPU will benefit from the higher overall throughput. My guess is that ~$200 CPUs are much more common than ~$400 GPUs.
 


I know what you are trying to say, but like the last link I posted, the problem with current efforts at offloading parallel workloads is that there has been no chip conducive to it, until now. Currently, yes, you have your APIs and writes to offload to the GPU, but the whole process is slower because the software is not really creating a synergy between the components; the GPU's load is still sent back and forth over many cycles before it's processed. As the rather in-depth breakdown of AMD's HSA drive shows, it is about hardware and software working in synergy to maximise performance. AMD will have the first unified memory, which is pivotal to any parallel/serial workload in a pure heterogeneous system; right now current efforts are not seamless and there is too much time in between, which is bottlenecking performance.
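(Roughly, today's discrete-GPU offload looks like the sketch below. It's a simplified OpenCL fragment of my own, assuming the context, queue and kernel have already been created; the two explicit copies are exactly the back-and-forth a unified/hUMA memory model is supposed to remove.)

#include <CL/cl.h>
#include <stddef.h>

/* Sketch of the classic discrete-GPU offload round trip:
 * copy in -> launch kernel -> copy out. The copies and the blocking
 * sync points around them are the overhead being discussed.
 * Assumes ctx, queue and kernel are already set up elsewhere. */
void offload_round_trip(cl_context ctx, cl_command_queue queue,
                        cl_kernel kernel, float *host_data, size_t n)
{
    cl_int err;
    size_t bytes = n * sizeof(float);

    /* 1. Allocate a buffer in the GPU's memory space. */
    cl_mem dev_buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, bytes, NULL, &err);

    /* 2. Copy the data across the bus to the GPU (blocking). */
    clEnqueueWriteBuffer(queue, dev_buf, CL_TRUE, 0, bytes, host_data,
                         0, NULL, NULL);

    /* 3. Run the kernel on the device-side copy. */
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &dev_buf);
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, 0, NULL, NULL);

    /* 4. Copy the results back to host memory (blocking again). */
    clEnqueueReadBuffer(queue, dev_buf, CL_TRUE, 0, bytes, host_data,
                        0, NULL, NULL);

    clReleaseMemObject(dev_buf);
}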


As to the other slide, this was killed a while ago. x86 is dying at a rate of knots; newer CPUs are unable to sustain the improvements over older parts, whereas the GPU is becoming exponentially faster than prior generations, so much so that the HD 7970 embarrasses Intel's 3970X by about 9x the compute performance, and the gap is getting bigger. So yes, you back what makes a difference, and that is the unharnessed potential of the GPU's parallel computing. Many consider the graphics card a games-oriented component, but less naive users know all too well the true potential of a GPU.

And lastly, I have no idea why you are throwing a fit, because nobody is saying you are not doing your job or not doing it to the best of the resources at your disposal. But it's also incorrect to make out that software is ahead of the curve when just about every developer in recent times has come out and said they need more freedom from restrictive code. Your line of work, like hardware, evolves too; it's just further behind than the hardware. The OS itself shows that it is still better suited to older systems than to new ones. 6-8 core (up to 12 thread) processors have been out for the best part of 4 years now and Windows still struggles with thread scheduling; while there are intermediaries which try to help schedule threads accordingly, the process, as above, is not seamless.


HSA is that something new, but like all things new it is not built overnight. It and the HSA Foundation are growing; there are stories of EA, MS and further industry-leading developers joining. It is moving forward, and traditional ways will move with it. They said that in 2014/2015 we will get a number of HSA-supported applications and games, but only by 2016 will it come to true fruition.

 
@gamerk316: can you elaborate on '100% core loading'? i don't think that software will load all cpu cores to 100% and stay there - it'll adversely affect multitasking and background processing, like antimalware programs.

@intel buying amd rumor, fudzilla gets in on the fun:
"This is why Intel picking up AMD might make sense"
http://www.fudzilla.com/home/item/31270-this-is-why-intel-picking-up-amd-might-make-sense
it's kinda (sad and) amusing that this rumor has done something the piledriver (fx cpus) launch failed at doing - raise amd stock prices, and raise them this fast. iirc amd stocks went down when pd came out and kept going downwards. :lol:
both are corporations. one isn't better than the other one. imho, if the positions were reversed, i woulda said the same thing about intel. ^_^
oh, another reason could be upcoming richland and console launches. so it could easily be those two real events instead of the intel rumor. amd fanboys shouldn't panic. :D
edit: slow news day...
 

Blandge

Distinguished
Aug 25, 2011


@1: This decade-old opinion piece without an author is your leading piece of evidence? A single paragraph that is nothing but opinion, with broken links, from 2003!?

@2: Maybe you should actually read the whole article instead of just the information that supports your claim.

From the article, Intel:
We have engineered intelligence into our 4 series graphics driver such that when a workload saturates graphics engine with pixel and vertex processing, the CPU can assist with DX10 geometry processing to enhance overall performance. 3DMarkVantage is one of those workloads, as are Call of Juarez, Crysis, Lost Planet: Extreme Conditions, and Company of Heroes. We have used similar techniques with DX9 in previous products and drivers. The benefit to users is optimized performance based on best use of the hardware available in the system. Our driver is currently in the certification process with Futuremark and we fully expect it will pass their certification as did our previous DX9 drivers.

From the writer:
Intel's software-based vertex processing scheme improves in-game frame rates by nearly 50% when Crysis.exe is detected, at least in the first level of the game we used for testing. However, even 15 FPS is a long way from what we'd consider a playable frame rate. The game doesn't exactly look like Crysis Warhead when running at such low detail levels, either.

Our Warhead results do prove that Intel's optimization can improve performance in actual games, though—if only in this game and perhaps the handful of others identified in the driver INF file.

@3: This was really dumb, I agree, but Intel later proved that the game did indeed run on their system successfully.

If this was all of the evidence you have then you aren't making a very strong case...
 

Blandge

Distinguished
Aug 25, 2011


Intel annual revenue ($ billions):
2009: 35.1
2010: 43.6
2011: 54.0
2012: 53.3

Revenue growth is flat for one year after two years of double-digit growth, and x86 is dying. OK.



If single-threaded CPU performance is your only metric, then it is true that the rate of improvement has slowed, but it's not. Performance/watt is a much more important metric, among hundreds of other metrics. I'd like to see a graph of the rate of growth of network packets processed per second on Intel Xeon processors. I'd be willing to bet that it still improves linearly generation over generation, while adding features like encryption and compression.

Everybody on these tech sites has a very narrow-minded view of what can be considered "performance".
 
The problem is simple:
x amount of silicon at x amount of power, within x amount of TDP.
Before we hit those limits, we saw larger growth; adding cores, widening bandwidth and adding cache all helped.
The low-hanging fruit is gone, and with the three things I first listed at their limits, similar scaling as we go to new nodes should continue, but the huge jumps of growth we've seen are pretty much behind us, until something can affect those first three things.
 

Blandge

Distinguished
Aug 25, 2011


Except that we are still seeing huge growth in other areas besides general-purpose single-threaded performance. It's true the problem you speak of does indeed exist, and it's a big problem, but it doesn't mean that we aren't seeing massive improvement in important areas.
 

8350rocks

Distinguished


We are now where we were when the K6-II was the top AMD product. Then a revolutionary new architecture arrived in the K7. I have a feeling something else is over the horizon... growth is stagnating some... but it was toward the end of the 90s as well. Now we are just seeing the current vein of architecture draw to the end of its capability. New advancements will make it so that the extremely large jumps we saw in the space of a 12-month timeframe will happen rapidly again. I can recall when PCs were obsolete 6 months after release because the hardware advances came so quickly... no one could keep up. It grew so quickly that 64-bit architectures became a requirement because 32-bit operating systems couldn't address as much RAM as you could put into your home PC anymore.

 

juanrga

Distinguished
BANNED
Mar 19, 2013


@1 It gives info and many links. For instance, it mentions how SYSmark is a biased benchmark favouring Intel. Google how AMD, Nvidia, and VIA abandoned BAPCo due to SYSmark bias. BAPCo and its product, SYSmark, are controlled by Intel.

@2 Read the part where it says that Intel is not playing "fair" and how they violate the benchmark guidelines. Read the part just below the one that you quote. Again, explain how Intel is explicitly "detecting 3DMark Vantage and changing the behavior of its drivers in order to improve performance". Look at the graphs showing the real performance for the same code with a different executable name. That is the real performance.

Look at the list of games with fake performance. It is fake because performance does not depend on the code used by the programmer but on the name of the executable. Reviewers would then select some of the listed games in their reviews, but users would suffer a performance loss with hundreds of other games, even games using similar-quality code.

That is the same kind of lie Intel tells with its compiler and its biased CPU dispatcher.
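(For anyone wondering what a vendor-based dispatcher amounts to, here is a toy C sketch of my own, not Intel's actual code: the branch keys off who made the CPU rather than which instruction sets the CPU reports supporting.)

#include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */
#include <stdio.h>
#include <string.h>

/* Illustration only: a dispatcher built this way picks a code path
 * based on the CPU vendor string instead of the actual feature flags
 * (SSE2/SSSE3/AVX...) that CPUID also reports. */
int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    char vendor[13] = {0};

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 1;

    /* CPUID leaf 0 returns the vendor string in EBX, EDX, ECX. */
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);

    if (strcmp(vendor, "GenuineIntel") == 0)
        printf("vendor check: take the fully optimized path\n");
    else
        printf("vendor check: fall back to a generic path (%s)\n", vendor);
    return 0;
}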

@3 Intel lied. It was a recorded video. Yes, later they showed the PC playing the game, just after being caught. What if nobody had noticed the fake press release?

Intel also admits now that its compiler does not optimize for AMD's and others' chips. They do so now because they were obligated to by the FTC, after being caught in the lie.

And we know Intel continues cheating and lying today, and not only about performance...

http://www.theverge.com/2013/1/9/3856050/intel-candid-explains-misleading-7w-ivy-bridge-marketing
 


25 watts- http://www.cpu-data.info/index.php?gr=15

My point here is sacrificing perf in pure output by reducing or maintaining power.
The low-hanging fruit is pretty much used up.
100% scaling?
Last seen going from the P4 to the C2D.
And that came with lower clocks and power; since then, clocks are back close to optimal and power has been maintained, with of course a little less overall gain, top model to top model.

Incremental steps?
AMD has a lot more known ground available to them than Intel does.
 

truegenius

Distinguished
BANNED
since we are talking about parallelization of threads
i have an idea
note: this is an idea only, so it may contradict how a cpu actually works. i am a noob, so take it with a heap of salt


an amd module contains a flex fpu, which surely benefits floating-point tasks when processing 1 thread on a single core

so can we have a flex integer core?
meaning 2 integer cores would get combined during a single-core workload,
thus providing more instruction-level parallelism, thus more ipc, and it would be way better than turbo core mode
 


AMD was the subject of buyout talk when the stock hit $1.80/share in November/December 2012, yet the stock didn't gain. Whereas before, in 2012, AMD's products were only geared towards the PC, we have since seen a far more aggressive product line which is diverse across all segments of PC and electronics. AMD doesn't have the best, but they make up for it in quality and quantity. There have also been big marketing wins, which helps; add to that the fact that they have full control from entry-level to high-end GPUs in terms of performance, and it's been a little period of success. Then there is the work done by senior officials, designers and writers that has produced good products and staunched the bleeding when everyone proclaimed the end. Believe it or not, companies can grow without Intel, and it's not really hard to improve from where they were.

 

griptwister

Distinguished
Oct 7, 2012



I'm not buying it. Lol. Even if AMD does make an FX-8570, there is no doubt in my mind that this is fake.
 