AMD Piledriver rumours ... and expert conjecture

We have had several requests for a sticky on AMD's yet-to-be-released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post questions or information relevant to the topic; anything else will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 
Well, I was actually talking about the Trinity die, about how 2 modules + graphics could fit under 65W :)

Anyway, if you compare the number of FX-4100s sold at Newegg vs. the FX-8120, 8150 and 6100 combined, I'd say it's high time for AMD to make a native two-module FX ... a 125W FX-4170 is a really sad joke!!

BTW, love the i4 idea: 3 cores with HT, clocked at 3.4GHz/3.7GHz turbo with, say, 4MB of L3 cache, at around $165 it would be an instant hit 😀 😀

That i4 would have to be a K version or it won't be as much of a hit on the market. It would do a lot of good for Intel though, since they could dump partially defective samples at a low cost. Each chip, without the retail box and cooler, runs around $30-40 to make if it isn't subsidized by other models.
 
So basically, if they get Trinity right, then the odds are better that they get Piledriver right too?
But it could still end up falling short of expectations.

Trinity will overall be better than Llano, but mainly from a much better memory controller and GPU; beyond that it's all just clocks. I doubt Piledriver will be worth much talk when it arrives, besides being able to clock like a bat out of hell. At least in the mobile world it will be all the rage, as Llano was.
 
I hope it does, do not get me wrong.
I run both AMD and Intel and run my AMD more than the Intel.
So my fingers are crossed, as I already have the hardware to run it, a GA-990XA-UD3.
To be honest, my 965BE @ 3.8GHz on stock voltage has me quite content for now (it's not my gaming unit).
With a voltage bump I can run Prime95 @ 4.2GHz while holding below 55C (on air) if I wish, and that alone will smash the FX-41xx and FX-6xxx models.
Instead of dumping my AMD unit, I built an i5-2500K on a Z68.
I realize that AMD's time is about up in the CPU (desktop) arena unless something miraculous happens.
Hell, I love and wish for miracles all the time.. :??:
The FORCE is leaving them, young Jedi..

It will be years before AMD turns things around. I doubt they will go belly up, thanks to their GPU sales, but it's GF that caused most of AMD's problems, with their design teams in second place. Getting rid of GF was the best thing they've done in a long time in that area, so hopefully they can keep up process-wise. They may never be as good as Intel, but maybe one day they can move more samples thanks to TSMC.
 
There is no such thing as "CISC".

Apparently there's a conspiracy going on all over the world because there are all these books and papers about a thing called "CISC." I guess that's what happens when people believe wikipedia :/

In all seriousness, I understand what you are trying to say, and perhaps the terms RISC and CISC are thrown around too casually. But to say that CISC does not exist is incorrect.
 
The last RISC CPU made was the SB. There is no such thing as "CISC". RISC isn't a standard; it's a set of principles and philosophies that engineers are encouraged to follow to keep processors clean and efficient. There are entire books discussing the various recommendations; the biggest and most important one is that every instruction should execute within one clock cycle. That is important because it makes scheduling and prefetch easy to do: the instructions become very predictable when you don't have variable execution times. The last "CISC" CPU, if you insist on using that term, would be the Intel Pentium. The Pentium Pro used microcode and had moved on to an internal RISC uArch.

Modern CPUs are all RISC, even Intel's. The core computing components all execute a proprietary, vendor-specific machine code that is designed around RISC principles. On top of that you have the x86 decoder, which takes the x86 instructions and breaks them into their component RISC instructions for internal dispatch and execution.

An easier way to understand it is to look at how Java and the JVM work. Your program is written in Java and then compiled to Java bytecode (the analogue of x86 machine code). The JVM then takes that code and recodes it into the uArch's native language for execution (the analogue of the RISC instruction decoder on CPUs). No matter how efficient you make the JVM, there will always be a performance penalty incurred by the translation, just as, no matter how efficient you make your x86 binary, there will always be a penalty incurred from having to translate it into micro-ops for execution. The reason for Java (and other interpreted languages) is the same reason the vendors stick with x86: universal backwards compatibility. You don't have to recompile every piece of software for SB vs BD vs Phenom II vs Core vs Athlon vs Pentium 4, etc.
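
To make that translation idea concrete, here is a toy sketch in Python (purely illustrative; the instruction names and the split into micro-ops are made up and don't correspond to any real decoder):

```python
# Toy "decoder": one complex instruction that reads memory and does math
# is broken into simple, register-only micro-ops before execution.
# All names here are invented for illustration.

def decode(instruction):
    """Split a memory-operand add into load-store style micro-ops."""
    op, dest_reg, mem_addr = instruction
    assert op == "add_mem"                 # e.g. ADD r1, [0x10] in x86 terms
    return [
        ("load", "tmp", mem_addr),         # fetch the operand from memory
        ("add", dest_reg, "tmp"),          # do the math register-to-register
    ]

def run(micro_ops, regs, memory):
    for op, dst, src in micro_ops:
        if op == "load":
            regs[dst] = memory[src]
        elif op == "add":
            regs[dst] += regs[src]
    return regs

regs, memory = {"r1": 5, "tmp": 0}, {0x10: 7}
print(run(decode(("add_mem", "r1", 0x10)), regs, memory))   # r1 ends up as 12
```

The point is only that one memory-touching instruction becomes a handful of simpler register-only steps before anything actually executes.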

If a processor vendor (AMD / Intel / VIA) decides they want to redo their internal instruction set, they can do so without fear of breaking backwards compatibility. Again, the downside is that x86 isn't particularly good at multiprocessing / scaling.

Congrats, you finally jumped overboard.

Yes, x86 has over the years adopted a LOT of RISC principles, but it's still a CISC CPU at its heart.
 
If a processor vendor (AMD / Intel / VIA) decides they want to redo their internal instruction set, they can do so without fear of breaking backwards compatibility. Again, the downside is that x86 isn't particularly good at multiprocessing / scaling.

Yet there are many well-coded applications that scale nearly 100% linearly on multi-core and multi-CPU systems.

Compilers, languages and APIs lag years behind CPU technology. Making a new ISA is far from trivial; the last one Intel tried (Itanium) cost them billions and is now practically EOL (end of life).

ARM has taken nearly 30 years to reach the success they have today, and they're now working on their 8th version of the ISA to add 64-bit.
 
Apparently there's a conspiracy going on all over the world because there are all these books and papers about a thing called "CISC." I guess that's what happens when people believe wikipedia :/

In all seriousness, I understand what you are trying to say, and perhaps the terms RISC and CISC are thrown around too casually. But to say that CISC does not exist is incorrect.


Go look again; "CISC" was never a defined term. Back in the '70s, when the earlier CPUs were being made, every manufacturer did their own thing. Eventually a bunch of engineers got together and defined a set of principles and philosophies for designing CPUs to keep them small and simple yet powerful. These philosophies became known as RISC (Reduced Instruction Set Computing), and any CPU that followed them was known as a RISC CPU. Such a CPU is also known as a load-store CPU because no math may be done on a memory location; instead you must first issue a load instruction and, when you're finished, issue a store instruction. The term "CISC" was invented afterward, not as a set of standards or philosophies but to represent anything "not-RISC".

http://en.wikipedia.org/wiki/Reduced_instruction_set_computing

The attitude at the time was that hardware design was more mature than compiler design so this was in itself also a reason to implement parts of the functionality in hardware or microcode rather than in a memory constrained compiler (or its generated code) alone. After the advent of RISC, this philosophy became retroactively known as complex instruction set computing, or CISC.

Thus it's not RISC vs CISC but RISC vs "not-RISC".

Contrary to popular simplifications (present also in some academic texts), not all CISCs are microcoded or have "complex" instructions. As CISC became a catch-all term meaning anything that's not a load-store (RISC) architecture, it's not the number of instructions, nor the complexity of the implementation or of the instructions themselves, that define CISC, but the fact that arithmetic instructions also perform memory accesses. Compared to a small 8-bit CISC processor, a RISC floating-point instruction is complex. CISC does not even need to have complex addressing modes; 32 or 64-bit RISC processors may well have more complex addressing modes than small 8-bit CISC processors.

RISC is a design goal: you set out to design a CPU for maximum efficiency by following RISC principles; CISC is just a term meaning you didn't do that. Today there are no pure CISC CPUs; everything is RISC, if only internally.
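
To see what the load-store rule means in practice, here is a toy register machine sketch (hypothetical, not modelled on any real ISA) in which arithmetic may only name registers, so changing a value in memory takes an explicit load, an add, and a store:

```python
# Toy load-store machine: arithmetic instructions may only touch registers.
# Reaching memory requires explicit load/store steps, which is the defining
# trait of a RISC-style ISA described above. Everything here is invented.

class LoadStoreMachine:
    def __init__(self):
        self.regs = {"r0": 0, "r1": 0}
        self.mem = {0x20: 41}

    def load(self, reg, addr):          # LOAD  reg, [addr]
        self.regs[reg] = self.mem[addr]

    def store(self, addr, reg):         # STORE [addr], reg
        self.mem[addr] = self.regs[reg]

    def add(self, dst, src):            # ADD   dst, src  (registers only)
        self.regs[dst] += self.regs[src]

m = LoadStoreMachine()
m.regs["r1"] = 1        # pretend this came from an immediate load
# Incrementing mem[0x20] takes three instructions instead of one:
m.load("r0", 0x20)
m.add("r0", "r1")
m.store(0x20, "r0")
print(m.mem[0x20])      # 42
```

A non-load-store ("CISC") machine would instead let a single add instruction read and write that memory location directly.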
 
Congrats, you finally jumped overboard.

Yes, x86 has over the years adopted a LOT of RISC principles, but it's still a CISC CPU at its heart.

I figured you'd attempt to jump in. Your lack of knowledge of uArchs and ISAs is showing again.

x86 hasn't adopted a single "RISC" principle; x86 is the exact same ISA it was when Intel designed the 80386. AMD64 (EM64T) is just a 64-bit extension of the x86 ISA.

What exactly do you think the instruction decoder is on an x86 CPU? Why do you think it's so important and directly coupled with the instruction scheduler? And most importantly, why is a complex decoder not present in the SPARC / MIPS / PPC uArchs?

http://en.wikipedia.org/wiki/Classic_RISC_pipeline

x86 operations take too many cycles and are inefficient to execute directly in silicon. Due to their variable execution times and operand lengths, pipelining them is nearly impossible without high latency and frequent stalls. To create the superscalar pipeline found in the Pentium Pro, Intel had to design a miniature processor that translated the x86 instructions into smaller load / store / execute instructions. These smaller instructions have a standard length and execution time, making them easy to schedule and track across a longer pipeline. Thus the birth of the modern x86 CPU.

BTW, the two defining traits of a RISC-designed CPU are standard, single-cycle execution of instructions and the use of separate instructions to move data between memory and registers.
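
As a rough sketch of why uniform, predictable latencies matter for scheduling (the cycle counts below are invented, and the model assumes each instruction depends on the previous one's result):

```python
# Rough sketch: a serial chain of dependent instructions on a simple
# in-order pipeline. With uniform single-cycle ops the issue logic never
# stalls; with variable-latency ops it spends most cycles waiting.
# Latency numbers are invented for illustration only.

def cycles_to_run(latencies):
    clock = 0
    ready_at = 0                        # cycle when the previous result is ready
    for lat in latencies:
        issue = max(clock, ready_at)    # stall until the needed result exists
        ready_at = issue + lat
        clock = issue + 1
    return ready_at

uniform  = [1, 1, 1, 1, 1, 1]           # RISC-style: every op takes one cycle
variable = [1, 4, 1, 7, 2, 5]           # variable-latency ops force bubbles

print(cycles_to_run(uniform))           # 6
print(cycles_to_run(variable))          # 20
```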

Now go scamper back to your hole.
 
Yet there are many well-coded applications that scale nearly 100% linearly on multi-core and multi-CPU systems.

Compilers, languages and APIs lag years behind CPU technology. Making a new ISA is far from trivial; the last one Intel tried (Itanium) cost them billions and is now practically EOL (end of life).

ARM has taken nearly 30 years to reach the success they have today, and they're now working on their 8th version of the ISA to add 64-bit.

You misunderstand; you're thinking of scaling as in multiple threads, which is relatively easy and entirely up to the software / OS scheduler. I'm talking about scaling in HW processing capability. x86 as an ISA can only execute one instruction at a time, or rather expects only one instruction to be executed at a time. It was never designed to have multiple execution engines, and every CPU operates under the assumption that it's the only CPU present. It takes a combination of HW and SW magic to get them to play nicely with each other. It's just more complicated and less efficient overall.

This is versus something like PPC or SPARC, which were designed from the beginning to be expandable.
 
That i4 would have to be a K version or it won't be as much of a hit on the market. It would do a lot of good for Intel though, since they could dump partially defective samples at a low cost. Each chip, without the retail box and cooler, runs around $30-40 to make if it isn't subsidized by other models.
Intel isn't stupid enough to sell an i4 K version. It would cut their profit margins massively from not selling nearly as many i5s and i7s, which cost Intel the same money to make.
 
Thus it's not RISC vs CISC but RISC vs "not-RISC".

The industry has clearly given a label to "not-RISC" ISAs in the form of CISC.

I can go on all day saying things don't exist just because they are the lack of something.

Atheists don't exist, only not-religious people

Composite numbers don't exist, only not-prime numbers

CISC ISAs don't exist, only not-RISC ISAs

I understand the point you were trying to make, and there is merit to your argument. But there's no need to confuse people by saying there is no such thing as CISC. I think you should simply be straightforward in your explanations. The way you write makes me think that sometimes you are just stroking your own ego for being knowledgeable, and provoking people into challenging you on a subject that you know a lot about.
 
The industry has clearly given a label to "not-RISC" ISAs in the form of CISC.

I can go on all day saying things don't exist just because they are the lack of something.

Atheists don't exist, only not-religious people

Composite numbers don't exist, only not-prime numbers

CISC ISAs don't exist, only not-RISC ISAs

I understand the point you were trying to make, and there is merit to your argument. But there's no need to confuse people by saying there is no such thing as CISC. I think you should simply be straightforward in your explanations. The way you write makes me think that sometimes you are just stroking your own ego for being knowledgeable, and provoking people into challenging you on a subject that you know a lot about.


CISC was never an actual term used in processor design. That's the point I'm trying to make: engineers didn't refer to their own designs as "CISC". No processor has ever been designed as "CISC"; many have been designed as "RISC". It's become a common term only because people keep repeating it often enough; no committee, no engineering body, no company ever actually defined the term CISC.

The proper terms to use instead of RISC/CISC are load-store vs non-load-store, which indicates the actual difference between the two.

In fact, I challenge you to find the industry body that coined and defined the term "CISC". If it's a standard then someone defined it; engineers are kinda OCD that way. Find that body.
 
Let's put any future debate over RISC / CISC and the x86 line of CPUs (AMD / VIA / Intel) to rest.

x86 and its extension x86-64 is a non-load-store (CISC) instruction set. The CPUs that process that instruction set do so by first translating it into a load-store (RISC) instruction set that is implemented in microcode. These RISC-style instructions are then pipelined and executed internally, with their results returned to the program. Using this technique, all three manufacturers have been able to implement super-pipelining and out-of-order execution on an instruction set architecture that was never designed for that.
 
Let's put any future debate over RISC / CISC and the x86 line of CPUs (AMD / VIA / Intel) to rest.

x86 and its extension x86-64 is a non-load-store (CISC) instruction set. The CPUs that process that instruction set do so by first translating it into a load-store (RISC) instruction set that is implemented in microcode. These RISC-style instructions are then pipelined and executed internally, with their results returned to the program. Using this technique, all three manufacturers have been able to implement super-pipelining and out-of-order execution on an instruction set architecture that was never designed for that.

I agree, well said.
 
Intel isn't stupid enough to sell an i4 K version. It would cut their profit margins massively from not selling nearly as many i5s and i7s, which cost Intel the same money to make.

They don't have to; just a small % of any batch that they could throw out as chicken feed rather than going through the expense of recycling as many bad samples. Anyway, it has happened before, for those who remember the Celeron 300A during the P2 era.
 
You misunderstand; you're thinking of scaling as in multiple threads, which is relatively easy and entirely up to the software / OS scheduler. I'm talking about scaling in HW processing capability. x86 as an ISA can only execute one instruction at a time, or rather expects only one instruction to be executed at a time. It was never designed to have multiple execution engines, and every CPU operates under the assumption that it's the only CPU present. It takes a combination of HW and SW magic to get them to play nicely with each other. It's just more complicated and less efficient overall.

This is versus something like PPC or SPARC, which were designed from the beginning to be expandable.


I see modern RISC processors needing just as much HW and SW magic to get them to "play nice" too, at least for any that can compete at the workstation level. They require some fairly complex/expensive I/O systems and database/grid-compute software to get that performance.

IBM processors are usually packaged in giant MCMs with 100s of MB of cache/eDRAM. The MCM itself needs 800 Watts just to run it.

This is an old pic but similar is done with the latest Power7.
http://en.wikipedia.org/wiki/File:Power5.jpg

SPARC modules required similarly large and expensive amounts of cache to be useful. This is partly why they pretty much died off until Oracle breathed some life back into them. The price/performance just wasn't worth it.

As long as price/performance keeps getting better, the underlying ISA is just one aspect of one piece of the puzzle.

http://www.theregister.co.uk/2012/03/08/supercomputing_vs_home_usage/page2.html

 
They don't have to; just a small % of any batch that they could throw out as chicken feed rather than going through the expense of recycling as many bad samples. Anyway, it has happened before, for those who remember the Celeron 300A during the P2 era.
I really doubt they would sell it unlocked, as it would be the highest-demand chip in their line if priced competitively. A cheap OC-able chip would mean no more i3s being sold and no i5s being sold, because you could get pretty much the same gaming performance from the cheaper chip. An unlocked 3-core i4 would be the only gaming CPU worth the money.
 
I really doubt they would sell it unlocked, as it would be the highest-demand chip in their line if priced competitively. A cheap OC-able chip would mean no more i3s being sold and no i5s being sold, because you could get pretty much the same gaming performance from the cheaper chip. An unlocked 3-core i4 would be the only gaming CPU worth the money.

Exactly :)
Forget unlocked, at $165 I wonder if Intel would even be generous enough to keep Turbo Boost on !! 😀
 
I really doubt they would sell it unlocked, as it would be the highest-demand chip in their line if priced competitively. A cheap OC-able chip would mean no more i3s being sold and no i5s being sold, because you could get pretty much the same gaming performance from the cheaper chip. An unlocked 3-core i4 would be the only gaming CPU worth the money.

I didn't say large volumes like what they are doing with most of their current lineup, but something that would have been kept in small supply to begin with to help cut production costs. There wouldn't have been enough to have a negative impact on sales of other models, while adding more pressure on AMD in the mid-range. If it lowers production costs by harvesting samples, it will help drive down the costs slightly for the higher-end models. Intel would have earned more per wafer produced, so it wouldn't have hurt the sales of the i5 and i7 much at all. Each wafer runs a few thousand USD before they even find out how many functional and semi-functional samples they are going to get.
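
Just to put rough numbers on that point (every figure below is invented for illustration; actual wafer costs, die counts and yields aren't public):

```python
# Back-of-the-envelope die-harvesting arithmetic. All numbers are assumed
# for illustration only -- real wafer costs and yields are not known here.

wafer_cost     = 5000    # assumed: "a few thousand USD" per processed wafer
fully_good     = 140     # assumed dies with all cores working (i5/i7 material)
partly_good    = 30      # assumed dies with one dead core (hypothetical i4 material)

cost_only_full_dies = wafer_cost / fully_good
print(round(cost_only_full_dies, 2))      # 35.71 -- cost spread over full dies only

cost_with_harvesting = wafer_cost / (fully_good + partly_good)
print(round(cost_with_harvesting, 2))     # 29.41 -- salvaged dies spread the wafer cost
```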
 
I really doubt they would sell it unlocked, as it would be the highest-demand chip in their line if priced competitively. A cheap OC-able chip would mean no more i3s being sold and no i5s being sold, because you could get pretty much the same gaming performance from the cheaper chip. An unlocked 3-core i4 would be the only gaming CPU worth the money.

Not quite. The 2500K would still hold a lot of value since it's a quad-core, not a 3-core.

My main point is that's why Intel's CPUs are designed the way they are: to be able to add or remove cores as needed. I am sure they will go that route with the IGP at some point too, to make it easier.

It's much like the Terascale CPU, where you can add or remove any type of core in the design. The 80 cores was just an arbitrary number they decided on; it could have been 50 cores or 100 cores.
 
For gamers and high-end enthusiasts, AMD just needs to compete, not actually be better; right now it doesn't even do that.

A lot of conjecture is created by benchmark numbers, but I consider them inconclusive. I have my 2500K clocked at 5GHz with improved RAM timings, and in one run of SuperPi I got 1m 24.256s; in the run immediately after, the time slipped to 1m 25.086s. Eight tenths of a second lost in a matter of seconds, which is why benchmarks are a grossly unreliable guide to real-world performance. BD is not as strong as SB per core, but the overall gulf in performance is not as wide as it's made out to be: my 8120 with dual 7950s is within 5% FPS of my 2500K with the same cards. Sure, there are games with the odd micro-stutter, but they are few and far between; for the average end user, BD is more than adequate and cheaper.

Right now AMD cannot compete at the highest end, but at least I hope PD is competitive.

The Bulldozer architecture has issues. However, the fact that AMD appears to be giving up on the performance desktop market (and, therefore, the workstation market) is even more of an issue. We all know that AMD is capable of making a good architecture. And as has been said, the next implementation of Bulldozer does not even have to be better than the competition, just better than what came before, and of course it needs to offer good value.

Benchmarks are real-world usage, in a sense. They are an attempt to show performance, and they work quite well for what they are designed to do. If you use enough of them, it's hard to get around them. They show the performance; they are not irrelevant. Are other things going on within your operating system while benchmarks are being run? Of course there are. Finally, Bulldozer is not cheaper; that just isn't so.

With all that, I am hoping for a nice surprise with Piledriver, both in the APUs and in the desktop versions.
 
Not quite. The 2500K would still hold a lot of value since it's a quad-core, not a 3-core.

My main point is that's why Intel's CPUs are designed the way they are: to be able to add or remove cores as needed. I am sure they will go that route with the IGP at some point too, to make it easier.

It's much like the Terascale CPU, where you can add or remove any type of core in the design. The 80 cores was just an arbitrary number they decided on; it could have been 50 cores or 100 cores.
3 cores with Hyper-Threading is almost as good as 4 cores, so the i5 would be a very hard sell if they did it like that.
 
3 cores with Hyper-Threading is almost as good as 4 cores, so the i5 would be a very hard sell if they did it like that.

Not at all; the i5 would remain the superior CPU and people would pay for it, just as those value seekers would pay for your imaginary 3-core. Myself, I went with the i7-2600K because it is superior to the i5 (sure, someone will argue about that) and in line with my budget. Remember, different people have different budgets; we aren't all making those hard business decisions.
 
Not at all; the i5 would remain the superior CPU and people would pay for it, just as those value seekers would pay for your imaginary 3-core. Myself, I went with the i7-2600K because it is superior to the i5 (sure, someone will argue about that) and in line with my budget. Remember, different people have different budgets; we aren't all making those hard business decisions.
Hyper-Threaded cores are like 1.2 of a core: 3 x 1.2 = 3.6, which is pretty much 4 cores. Games don't even use 4 cores most of the time, so an unlocked 3-core/6-thread i4 would pretty much make the i5s moot. It would probably OC higher than the i5 because of less heat. There wouldn't be a better gaming CPU for the price. I would buy the i4 if they made one, and I'm sure many people would do the same over the i5 to save money.
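
Writing that back-of-the-envelope math out (the 1.2x Hyper-Threading uplift is just the rule of thumb used above, not a measured figure):

```python
# Back-of-the-envelope comparison from the post above. The 1.2x uplift per
# Hyper-Threaded core is an assumed rule of thumb, not a measured number.

HT_UPLIFT = 1.2                    # assumed throughput of one HT core vs. one plain core

i4_effective = 3 * HT_UPLIFT       # hypothetical 3-core / 6-thread "i4"
i5_effective = 4 * 1.0             # i5: four plain cores, no HT

print(round(i4_effective, 2))                  # 3.6
print(round(i5_effective, 2))                  # 4.0
print(round(i4_effective / i5_effective, 2))   # 0.9 -- within ~10% of a true quad
```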
 