AMD Piledriver rumours ... and expert conjecture

Status
Not open for further replies.
We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question relevant to the topic, or information about the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 



Steamroller has some significant IPC improvements, but there's a big catch: the SR cores are going to need more memory bandwidth to keep them fed.

They can scale the GPU by adding more shaders and clock speed but they have the same catch as the SR cores.

Both will be fighting for the same memory bandwidth, which isn't getting any faster. DDR3-2400 is about the max for FM2.

Until we see how SR is going to improve the memory controller, you're going to be in the same situation as 128-bit discrete GPU cards that really need a 256-bit interface: they're memory starved. 128-bit discrete cards try to get around that limit by overclocking the memory, which GDDR5 can do.

Intel will face similar limits on their low-end parts, but they have triple- and quad-channel DDR3 sockets available. FM2 is stuck at dual-channel DDR3. This is why Intel is starting to put memory on-die for the GPU: if they want to keep pin counts down and only support dual channel, they have to.
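To put rough numbers on the bandwidth argument above, here's a back-of-envelope sketch. The channel widths and data rates are standard DDR3/GDDR5 figures; the specific configurations compared are just illustrative, not anything AMD or Intel has published:

```python
# Theoretical peak memory bandwidth, back-of-envelope.
# DDR3: each channel is 64 bits (8 bytes) wide; rate is in MT/s.
def ddr3_bandwidth_gbs(mt_per_s, channels):
    return mt_per_s * 8 * channels / 1000  # GB/s

# GDDR5 on a discrete card: bus width in bits, effective rate in Gbps per pin.
def gddr5_bandwidth_gbs(bus_bits, gbps_per_pin):
    return bus_bits * gbps_per_pin / 8  # GB/s

print(ddr3_bandwidth_gbs(2400, channels=2))  # FM2 dual-channel DDR3-2400: 38.4 GB/s
print(ddr3_bandwidth_gbs(1600, channels=4))  # quad-channel DDR3-1600: 51.2 GB/s
print(gddr5_bandwidth_gbs(128, 6.0))         # 128-bit GDDR5 at 6 Gbps: 96.0 GB/s
```

Even an entry-level 128-bit GDDR5 card has well over twice the bandwidth of FM2's dual-channel ceiling, which both the CPU cores and the on-die GPU have to share.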
 


Which I again stress can't be done for most workloads.

Look, we did experiments like this back in the '80s. MIT built a supercomputer with several THOUSAND CPUs inside it, and they discovered that most software workloads do NOT scale well. Even those that do tend to drop off after 16-32 cores, for various reasons (I/O bottlenecks, locks, the OS scheduler, etc.).

Sure, SOME applications can scale close to infinity. SQL databases are largely independent on a per-cell basis, so you can scale at close to 100%. Rendering is another example, hence GPUs. But most workloads do NOT scale this way. A simple example: you can't do 3D sound propagation without knowing the current geometry, so sound and rendering touch, and that overlap hinders scaling. You can't do AI without inputs from the geometry and sound, so there's more overlap there. And so on. You don't have totally independent threads, and every time one thread touches another, you have a performance bottleneck.

Finally, you have the scheduler to consider. Developers do NOT use hardcoded thread logic (put thread "x" on core "y"), because there's no way to know the current and future workload on that core (e.g. yours is not the only thread running). So which cores threads land on is left up to the OS scheduler. In theory, you put each thread on the core with the least current work, dynamically adjusting every time a thread gets run. But then you also have to consider the CPU architecture (shared cache or dedicated cache per core? This matters if you share a lot of data between threads), so even that gets sticky.
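For illustration, this is what "hardcoded thread logic" looks like on Linux, sketched with Python's `os.sched_setaffinity` (a Linux-only API). Note that pinning assumes you know the machine's topology and current load in advance, which you don't, and that's exactly why production code leaves placement to the OS scheduler:

```python
# Hardcoded core placement: pin the calling process/thread to core 0.
# Fragile by design -- it ignores whatever else is already running there.
import os

if hasattr(os, "sched_setaffinity"):   # Linux-only API
    os.sched_setaffinity(0, {0})       # 0 = calling process; restrict to core 0
    print(os.sched_getaffinity(0))     # now limited to core 0
```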

My point is simple: for the majority of workloads, applications cannot, and will not, scale beyond a couple of CPU cores. No amount of coding by developers is going to change that.

I'll also note that trying to determine scaling from Task Manager's 1-second sample rate is kind of silly when you think about it, considering most threads will run, finish their work, and be replaced by some other thread LONG before Task Manager takes its next sample.
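A toy model of why that sampling hides things: Task Manager effectively reports CPU time used divided by the sample interval, so a short-lived thread's burst gets averaged away (the 50 ms burst length is just an illustrative assumption):

```python
# Toy model of utilization as reported by a 1-second sampler:
# cpu_time_used_in_interval / sample_interval.
def sampled_utilization(burst_ms, sample_interval_ms=1000):
    return min(burst_ms, sample_interval_ms) / sample_interval_ms * 100

# A thread that runs flat-out for 50 ms and then exits shows up as 5%,
# indistinguishable from a thread that idled 95% of the time -- even though
# it was 100% busy (and possibly a bottleneck) for as long as it lived.
print(sampled_utilization(50))
```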
 



We have been hearing that for the last two years now. Kinda reminds me of CoD games: this time it's going to be different. Oh wait, this time it will get better. No, no, no, this time it will definitely improve by a fat margin. All we see is the same draggy thing. :sarcastic:
 



It may very well be too little too late. It's not like Haswell is sitting idle, and 14nm is around the corner. There is no easy road ahead for AMD.

I'll give AMD credit for finally admitting and addressing Bulldozer's biggest bottleneck, the shared instruction decoder. That was a major oversight in the Bulldozer module design.
 
^ Isn't AMD supposed to make some sort of announcement next week about their future product plans? Thought I saw that somewhere in one or two of the PD reviews..

Anyway, I'd bet a dollar and a donut that Steamroller gets delayed well past their Q4 of next year target release..
 


Post-Excavator, I would very much like to see AMD plan for a unified FM3 socket with DDR4 support. Possibly an APU-only lineup.
 



Doubt it. They now seem to be on track after layoffs and resource re-routing.
 

Thing is, last I saw, Haswell was supposed to be UP TO 10% over Ivy. SR is up to 30% over BD (20% over PD?), along with a 15% improvement in power efficiency over PD.

http://www.tomshardware.com/news/AMD-Steamroller-Piledriver-Kaveri-processors,17217.html

It's much easier to speed up a new "slow" architecture than to speed up what's already fast.

 

Very unlikely AMD will delay Steamroller. It will all depend on GlobalFoundries. So far AMD has only taped out mobile chips on the 28nm process, so getting mass production up fast enough could be a problem.
 



The first Steamroller already taped out, so the APU (Kaveri) is safe. Could very well be true for the desktop SR.
 


That would make sense, but I guess we'll find out more next week..

If SR is locked into GF's 28nm process, then I'd still hold my bet 😛..
 



That could mean a lack of supply around release day, like Nvidia had with the GTX 600 series chips produced by TSMC. It didn't move the release date, and products hit shelves within a week anyway.
 
What I noticed is that the 8320 is faster than the 8150 all the time, even at a slightly lower clock speed. And the power consumption is lower (as is the turbo speed).

And at Newegg the 8150 costs more.

Also, from X-bit's review it's clear AMD improved the FPU by around 18%, which shows in games: even the 4300 is on average faster at gaming than an 8150.

 


AMD has been downplaying the enthusiast market for some years now, with the "less than 5%" mantra, etc. Given the reduction in their already limited resources, more bad news for enthusiasts seems likely..
 