AMD CPU speculation... and expert conjecture

Status
Not open for further replies.


Interesting perspective on it. I find that while the hardware all shares the same ISA and architecture, the similarities, for the most part, end there-- kind of. They use different OSes, different APIs, and really, anything to do with the software is entirely different. The move to x86 was definitely for the reason you stated (as well as cost, of course), and porting is certainly easier with the same ISA. Though porting isn't as easy as changing the label; I'm sure you already knew that.

Personally though, I think the move to x86 was a step backwards. CISC is a no-no. RISC is a yes-yes. I hope ZISC becomes widespread before I die.

Anyways, Jaguar and gaming consoles are sort of irrelevant to the topic.

On another note, thanks for giving me that article on Mantle. I see some possible issues with Mantle, but it probably won't be as bad as I imagine, and some good programming will eliminate most of the issues I foresee.

Either way, my point stands: I dislike the FX module design (finally, something OT.)
 
Hum! Two days and I can only read posts about ARM, consoles, who said what, MANTLE, RISC vs CISC, help me with my GPU... The hypothesis that only one person here was posting info about Steamroller and Kaveri seems confirmed.

Besides real benchmarks (not simulations) of Kaveri against a high-performance Intel chip, I also got the AMD 2015 desktop roadmap. The same people will be deluded again. But since almost everyone here has a friend working at AMD, you already know what dCPU is coming in 2015, right?
 
Is there any info on what the yields were like at Bulldozer launch, compared to end-of-2013 yields? It would be interesting to see a graph, it might give an idea on how much more AMD can compete by driving down the FX prices further.

Regarding the quote pyramids, I'm on another forum that prevents that by blocking posts which are 90% quoted material. I think that would improve this thread 🙂
 


juanrga,
if there is a scale where 0 is utter disappointment, 50 is neutral, and 100 is absolute delight, where do you put the results of the above comparison you mention?
 
I can say that the results outperform my expectations/predictions about Steamroller and that AMD has a clear winner here.

I am going to write an update to my Kaveri article, but cannot post detailed info and benchmarks here, because this thread is only about ARM, consoles, who said what, MANTLE, RISC vs CISC, help me with my GPU, Bulldozer yields... :sarcastic:
 
Roadmap through 2015.

[Image: AMD desktop roadmap through 2015]



Edit: Source http://hardforum.com/showthread.php?p=1040422648
 


I don't like the module design either. It has trade-offs made for server workloads at the time it was being developed.

They should scale up Jaguar for higher performance.
 

if this is not a fake (weird how it came out so soon, with entirely different color scheme from recent officially released amd roadmaps), a few things become apparent:
amd might not implement ddr4 in 2015.
carrizo might be drop-in upgrade from kaveri.
excavator will debut in carrizo in 2015, a year after steamroller debuts in kaveri.
no mention of exc's fab. process.
beema might not come in desktop.
the roadmap will likely change in q3 2014.
amd is severely bound to fabrications, for now.

amd is being quite bold claiming to launch excavator in 2015, considering how they fumbled(ing) around to launch steamroller. i am more interested in what comes after puma... or if there's something after puma. seems like amd is thinking about giving up ulp x86 socs until they get access to a smaller process.


imo it takes up too many transistors for the performance it delivers, in the name of modularity. +1 for jaguar empowerment! been saying it for a while.
 


Childish statements. People who have friends at AMD wouldn't publicly out information they don't want outed. As employees of AMD they are under a non-disclosure agreement and releasing such information is grounds for a firing at best, and a lawsuit at worst. Who would out their friends on a public forum merely for entertainment?
 

There is this thread, moar food4thought.

 


If anything, I think servers are one of the only workloads where it works out fairly OK. Basically, anything that doesn't need floating-point calculations at all and can tolerate high latencies and a high power envelope.

And I dislike those restrictions. I actually prefer Jaguar.

Jaguar isn't a design you can reasonably scale up to high power envelopes, but it's okay.
 


No problem with switching from x86 to ARM for Microsoft?

How about the billion+ dollar failure of Surface RT?
How about their prior ARM failure, Zune?

These are not trivial things. Maybe Apple makes it look that way, but they're a $500 billion company, and their OS (based on NeXTSTEP) was designed from the beginning to run on x86.


Your statements about NVidia and Tegra are simply false. No one tells their managers they plan to be profitable by the 5th generation. They would be laughed out of the board room. It wasn't just an R&D project. They needed a product to replace their nForce chipset unit that ended. They intended to be profitable from the beginning. Things just didn't work out that way. They got some design wins over the years but the competition is incredibly stiff.

 


Let him have fun thinking AMD cares more about ARM than their own x86 design; it's funnier that way.
 


With how small the Jaguar cores are there are lots of "levers" they can pull to increase performance.

AMD slides show a 20-25% gain in performance with a 40% power reduction (Kabini -> Beema).
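Taking the slide's numbers at face value (and assuming the two figures compose multiplicatively, which marketing slides rarely guarantee), the perf-per-watt math works out to roughly 2x:

```python
def perf_per_watt_gain(perf_gain, power_reduction):
    """Relative perf/W of the new part vs. the old, from fractional claims."""
    return (1 + perf_gain) / (1 - power_reduction)

low  = perf_per_watt_gain(0.20, 0.40)   # 1.20 / 0.60 = 2.0x
high = perf_per_watt_gain(0.25, 0.40)   # 1.25 / 0.60 ~ 2.08x
print(f"perf/W gain: {low:.2f}x to {high:.2f}x")
```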

 
http://wccftech.com/amd-carrizo-api-excavator-core-gcn-graphics-2015/

Those of us on Vishera have bought into a loooong lasting platform. FX is pretty much officially dead seeing as AMD has no plans for dCPU from now through 2015.

Here's to hoping that enthusiast class APUs are on the radar.
 


That's not what I meant, essentially.

The reason the design is so small is its environment. It tops out at a 25W TDP, and to me even that stretches the design a little beyond its strengths; pitting it right against Intel's ULV Core series is way, way outside them. Jaguars are comfortable as Silvermont killers; staying away from Haswell is much recommended.

Think about it for a moment: it has two 64-bit ALUs, one of which is also stacked with the mul/div abilities. That's it in terms of integer power-- pretty damn similar to Silvermont so far. The fact it only has 2 thin ALUs makes its integer pipeline unsuitable for a high TDP against competitors. Then, on the LSU side, it has one load unit and one store unit, both 128 bits wide. That's not bad-- pretty decent actually-- but comparing it to other 25W+ designs makes it anemic. Then on the FP and vector side, all of its units are 128 bits wide. Respectable; and it has 2 FPUs and 2 vector units-- again, very respectable and almost equivalent to Piledriver. I'd say here it's fairly fine for a higher TDP, but then you go back to that anemic LSU situation and it just doesn't scale that high.

Don't even get me started on the lack of queues and buffers, or the fact that a single AVX instruction will tie up the floating-point pipeline for quite a while on its own. It just won't scale well above 25W. It's a small chip, and packing 8 of them for use below 25W is fine; but going up against even the FX design would need serious clock rates (easily in excess of 5.0 GHz, because of Piledriver's gigantic integer execution abilities), and it's a lost battle on the floating-point side against any Haswell design.

It just won't scale well that high. That's all there is to it.
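For a rough sense of the AVX point above, here's a toy throughput model (my own simplification with made-up instruction counts, not AMD data): a 128-bit FPU like Jaguar's has to crack each 256-bit AVX instruction into two halves, so peak AVX throughput is half that of a native 256-bit design at equal clocks.

```python
def avx_issue_cycles(n_instr, datapath_bits, units=2):
    """Cycles to issue n_instr 256-bit AVX instructions: each cracks into
    256/datapath_bits micro-ops, spread over `units` symmetric FP pipes."""
    uops_per_instr = 256 // datapath_bits
    return n_instr * uops_per_instr / units

jaguar_like  = avx_issue_cycles(1000, 128)  # 128-bit pipes: 1000.0 cycles
haswell_like = avx_issue_cycles(1000, 256)  # native 256-bit: 500.0 cycles
```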
 
http://techreport.com/news/25707/all-signs-point-to-kaveri-being-an-evolutionary-upgrade

Interesting article on techreport about steamroller. I'm quoting the part I found most relevant below:

"To begin with, Adam Kozak, AMD's marketing chief for client processors, told folks during a press briefing that Kaveri will be competitive with Intel's Core i5-4670K processor. (That's a $225 offering and the cheapest quad-core Haswell CPU with an unlocked upper multiplier.) When pressed for details after the briefing, however, Kozak clarified that Kaveri should only be equivalent in terms of combined CPU and GPU compute power. If one measures x86 performance on its own, Kozak said, "we'll lose." However, Kozak expects Kaveri's integrated graphics, bolstered with Mantle support, to be better than the latest version of Intel's HD Graphics."

Given how much more powerful the APU's graphics are expected to be, doesn't that imply that the CPU isn't going to be that great?
 


Take, for example, the A10-6800K. In combination, it probably could fight the i5-3570K in GPU+CPU.

I'm sure they're just saying the same thing, but now with the new Mantle API the GPU difference becomes much larger and they can stake out that claim more definitively.

Of course it'll still lose on the CPU side. It's a Piledriver derivative, after all.
 
http://techreport.com/news/25707/all-signs-point-to-kaveri-being-an-evolutionary-upgrade
read this earlier. seems like amd couldn't get competitive performance (compared to haswell) from steamroller. i assume bd was supposed to be 32nm, pd 22-28nm and sr @20-14nm when amd started designing them (glofo!!). kaveri powered-by-mantle is obviously superior to intel's cpus for gaming... but that was not necessary considering how weak intel's igpus are.

amd has a huge uphill battle if they want to make mantle ubiquitous (and to make their gpus and apus seem to perform "better"):
http://www.techpowerup.com/194979/graphics-card-market-up-sequentially-in-q3-nvidia-gains-as-amd-slips.html
they don't have anywhere near enough gcn market share for that. amd needs to add older vliw gpus to mantle optimization asap... if that's somehow possible.

tr's pricing speculation seems a bit lower than mine; i was expecting kaveri a10s to sell in the $140-150 range at launch.
.....
i really wish kaveri isn't just an evolutionary upgrade over (blegh..) richland.
 






Technically, DX11 already supports multithreaded rendering, though it's a PITA to set up well. BF3 and Crysis 3 use it (so I assume all CryEngine 3/Frostbite 2 titles do as well), using about a dozen worker threads to help split the workload off the main render thread [via GPUView analysis of those two titles]. That being said, the two main heavy threads (main program + main render) were still doing ~85% of the workload for those two titles, hence why Intel still won on performance. [AMD had better core usage, in that the threads lowered the workload of the cores on average, but Intel won on performance metrics, as no individual core was bottlenecked, so individual core power was more important than the number of cores.]
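The "85% on two threads" observation can be turned into a back-of-envelope Amdahl-style bound. The ~85% combined figure is from the GPUView analysis above; the roughly even split between the two heavy threads and the perfect spreading of the rest are my own assumptions:

```python
def min_frame_time(heaviest_thread_share, core_speed):
    """Lower bound on frame time: the heaviest single thread pins one core,
    so extra cores stop helping once that core saturates."""
    return heaviest_thread_share / core_speed

# Two main threads at ~85% combined => the bigger one carries ~45% alone.
fast_cores = min_frame_time(0.45, core_speed=1.4)  # fewer, faster cores
many_cores = min_frame_time(0.45, core_speed=1.0)  # more, slower cores
```

Under this bound, per-core speed lowers the floor and core count does not, which is exactly the Intel-vs-AMD result described above.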

What Mantle can do, according to AMD, is basically take away some of the CPU overhead involved in running DX/OGL. Remember: the game thread hands work to the API, and the CPU then has to submit it to the GPU. There is a gating factor there.

Me? I don't view it as much of a game changer, since the overhead in DX/OGL isn't that large to begin with (anymore). I'm more worried about NVIDIA allowing its internal low-level API to be used, and starting a major API war that kills the entire market.

http://graphics.stanford.edu/~mdfisher/GPUView.html

For the people who want a more in-depth analysis to what goes on under the hood.
 


Actually, there are significant differences in the hardware design of the XB1/PS4 that require significant code changes between the two. In particular, the XB1's ESRAM will need to be fine-tuned per application to get the same performance as the PS4's GDDR5 RAM. Likewise, both will need to manage their shared RAM pools very carefully, to prevent the GPU from hogging too many resources.

In short: The memory subsystems for the PC, PS4, and XB1 are totally different, and will need to be re-written for each system.

There really isn't anything to gain porting-wise by going to x86; it's not like anyone hardcodes low-level CPU opcodes. In most cases, replacing a few library functions (math libraries mostly) and a recompile is all you need to do for a PPC-to-x86 port; the OS differences are actually much more taxing to handle. So the porting argument was never really true, just marketing fluff.
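One concrete example of the kind of library-level change a PPC-to-x86 port does require is byte order: the PPC-era consoles were big-endian, x86 is little-endian, so binary assets written by one read back scrambled on the other unless the I/O layer is explicit about endianness. A minimal sketch (the asset value is made up):

```python
import struct

raw = struct.pack(">I", 0x12345678)   # asset as a big-endian (PPC) build wrote it
naive = struct.unpack("<I", raw)[0]   # x86 build reading it natively: scrambled
fixed = struct.unpack(">I", raw)[0]   # explicit big-endian read: correct

print(hex(naive))  # 0x78563412
print(hex(fixed))  # 0x12345678
```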
 


Juan is just using the forum to promote his blog. Yes, Kaveri will be competitive with Hasbeen on the IGP side; the CPU side will not be.

IMO the problem with DT is the lack of a decent fabrication process. GF always talks about their 28nm this and 28nm that, FD-SOI, but we haven't even seen any confirmed products. Just a drop in clock speed when Kaveri hits.

AMD has no choice but to try and promote the best side of Kaveri. It's not because Kaveri is "just so friggin awesome that AMD needs no other products".
 


It has tradeoffs. Compare to Intel HTT: HTT is REALLY cheap to add to the chip (like 10% extra die space), for a decent performance gain in some subset of tasks. It's basically free for Intel to throw in and charge a $50 premium.

AMD's CMT is a lot more powerful, but a LOT more expensive, since you are basically duplicating everything but the CPU dispatchers and schedulers. It's ALMOST like adding a full core. The downside, however, is that AMD made each individual core very weak.

What AMD failed to recognize (and I called it) is that a four-core chip can be faster than an 8-core chip, if the cores of the 4-core chip are faster and NO INDIVIDUAL CORE IS BOTTLENECKED AT ANY POINT IN TIME. That's where Intel has positioned itself going forward; it doesn't need 20% CPU gains per generation. As long as they keep the CPU fast enough that no individual core is bottlenecked, they will hold the performance crown, regardless of how many cores AMD slaps on.

By contrast, AMD has the problem where because it has weaker cores, it is MUCH easier for any single one to bottleneck, and kill performance of the entire CPU. Hence why they are reliant on programs being heavily threaded, and why BD/PD stinks at any single-threaded benchmark. This also explains why the FX-4xxx series does so badly compared to i5's.
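Rough area-vs-throughput arithmetic for the SMT/CMT tradeoff described above. The ~10% SMT area figure is from the post; the SMT throughput gain and both CMT figures are my own illustrative assumptions, not measured numbers:

```python
def throughput_per_area(extra_area, extra_throughput):
    """Relative multithreaded throughput per unit die area, vs. a plain core."""
    return (1 + extra_throughput) / (1 + extra_area)

smt = throughput_per_area(0.10, 0.25)  # ~1.14: cheap to add, modest gain
cmt = throughput_per_area(0.60, 0.80)  # ~1.13: big gain, but near-core cost
```

On these (assumed) numbers the two come out close on area efficiency; the real difference is the one the post describes: CMT buys its throughput by accepting weaker individual cores.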
 


DDR4 probably will not be ready in 2015. By "not ready" I mean that it will be expensive and 'slow' (it starts at lower speeds than current DDR3).
That Carrizo comes in 2015 has been known since June or so. Why the surprise?
The desktop roadmap does mention Beema. Can't you read a roadmap?



Ridiculous post. On the previous page you can find someone telling us what his friend at AMD told him. This thread is full of "my friend at AMD said so", despite what they post here contradicting what AMD says officially.
 


The death of the FX platform has been suspected since the beginning of this year and was confirmed later. It has been explained here for months that AMD is replacing FX and Opteron CPUs with APUs. I don't understand people's surprise.




If you read the last part of the article, it mentions 20% faster than Richland on the CPU side (in my BSN* article I predicted ~17%). This means ~30% IPC gain over Piledriver, which means Steamroller will be at the i5-2500K level of performance, which means the Haswell i5 CPUs will be faster... as expected. There is nothing new under the Sun.
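For what it's worth, the 20%-overall to ~30%-IPC step follows from perf = IPC x clock if you assume a clock regression of roughly 8% from Richland to Kaveri (the clock figure is my assumption, not from the article):

```python
def ipc_gain(overall_perf_gain, clock_ratio):
    # perf = IPC * clock  =>  IPC_new / IPC_old = perf_ratio / clock_ratio
    return (1 + overall_perf_gain) / clock_ratio - 1

print(f"{ipc_gain(0.20, 0.92):.0%}")  # ~30% IPC gain at a ~8% lower clock
```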
 