AMD CPU speculation... and expert conjecture

Page 458
Status
Not open for further replies.


If it didn't scale well, just piling on megahertz would not work out in the end. I agree with palladin that K10.5 was a dead end for AMD, but that doesn't justify going with BD just because K10.5 was the top of the Barcelona uArch. Making it smaller would not change the fact that the core design is old (or showing its age) with no deep tweaking/redesign.

Anyway, that being said, for AMD to let K10.5 make a comeback they would have to re-design most of it, so they might as well call it something else (like P3 to Core 2, haha).

And to hell with price and complexity, I just want a dual APU board for bragging rights! Besides, AMD already has IOMMU, so they're not adding an unknown layer of complexity to a possible dual APU board IMO. Memory management won't be worlds apart from the dual Opteron boards, especially since they won't need ECC RAM in them at all, so they can get rid of all the deep certification and server backstage tricks/needs for PCIe and stuff.

This ideal board to me would have just 1 PCIe slot (x16) at the bottom (or even none, let it be a fully integrated MoBo), be microATX (or ATX, depending on the layout of the RAM slots), with 2 RAM slots for each APU (to keep size in check) and a plethora of USB3/2 connectivity. I wonder if uATX can house 2 APUs. If not, just let it be EATX, haha. It would really make a fine heater in winter!

Cheers! :p
 
*imagination goes wild* amd should make a 13.4"x6.7" motherboard with 8 so-dimm slots for a dual apu system. then orient it vertically so that 8x usb 3.0 and other ports be at the top of the enclosure. i know cases can do that but amd should make it default for the dual apu motherboards. the single pcie 3.0 slot should be at the bottom under the so-dimm slots. power supply connectors should be in the middle. the sata ports and internal headers should be near the top.

or. a dual die cpu with 4M-8M cpu (with u.n.b.) on one die and the gcn gpu in another die.
 

Cazalan




Yes AMD chose that compromise for their own HSA objective. The density of GF 28nm process is actually higher than Intel's 22nm process. The clock speeds offered are also quite similar to Intel's process.

 

juggernautxtr



A dual die 7850K on a g32 socket would be a sexy beast. 32 gigs of 2133+ RAM, wouldn't really need a graphics card, but a couple of PCIe slots on an EATX board, wouldn't need more than maybe a 500W PSU... nerdgasm time.
 

blackkstar



I am concerned about it too. It feels like as long as we have been doing this whole GPGPU/OPENCL/CUDA/ZOMGSOFASTCOMPARED2CPU thing that every step of the way the GPGPU versions have been hobbled in some way, if you're lucky enough to get one.

At OCN I was trying to help someone with a rendering issue and increasing performance. I mentioned VRAY-RT instead of a new CPU, as the performance would be a lot better. They went to play with it and found the quality was not as good and it was missing features from the CPU version.

This is what I am personally worried about by AMD potentially not giving us a big CPU anymore. I don't recall a GPGPU version of anything offering more features or higher quality features than the CPU only version. My entire experience with GPGPU accelerated anything has been some graphs showing how much faster it is and then it not being usable because it's missing features and not as high of quality as the CPU version.

Even if HSA catches on and is massively popular, imagine how much older software that isn't HSA accelerated you're still going to have to deal with.

For me, the most time consuming things I do are compiling in GCC and rendering/texture baking in Blender. I don't see any of those getting HSA support any time soon. Blender's new renderer, Cycles, still doesn't support texture bake and it's been around for years.

I'm personally opposed to APU only future and this running from big dCPUs because it feels a lot to me like how OEMs love to push the whole "SO QUIET SO SMALL" thing on us for things where it doesn't matter most of the time (like desktops and people insisting on buying mATX and then getting massive monitors).

And then don't get me started on ARM, it's the end of x86 applications working. No skype/adobe/etc.
 

juanrga



Only the integer and memory clusters of the original Silvermont core are out-of-order; the whole core is not OoO, but this is not the point. The point is that the custom core in Knights Landing is derived from Silvermont, but with substantial modifications, because the KL cores are optimized for throughput. Throughput cores cannot form a LCU.

IBM BG/Q is a homogeneous design. It cannot compete against heterogeneous systems from Intel, AMD, or Nvidia. This is why IBM has just joined Nvidia to develop heterogeneous architectures: IBM will make the LCU and Nvidia the TCU.

http://venturebeat.com/2013/11/18/nvidia-and-ibm-team-up-to-build-next-generation-supercomputers/



GF 28nm SHP process is not FinFET but planar. The next 20nm process (likely for Carrizo) will be planar as well. The next 14nm process will use FinFETs.
 

8350rocks



Exactly, all my predictions came true about clock speed reductions negating any positive growth in effective IPC from architecture improvements.

Additionally, it is not that bulk is bad, I would wager that roughly 80-85% of the consumer parts in the world are made on substrate that starts life as bulk wafer. However, bulk is designed for certain power envelopes and certain properties of a particular processor. HPC/HEDT applications demand more performance. Intel went to using FinFET on bulk substrate to try to squeeze more money out of less materials. Ultimately it probably costs them just as much as if they had gone with FD-SOI and done a simpler configuration with back biasing potential and all the other good stuff that comes with FD-SOI. Additionally, node shrinks for FinFET are becoming harder and harder for Intel to get right with each step. They have even conceded themselves that they will have to go to FD-SOI past ~10nm to make the shrink worthwhile at all.

You see, FD-SOI UTBB for lower node sizes offers a lot of things that are beneficial for higher clock rates and better thermal properties. That's the only reason AMD could run a 5.0 GHz stock turbo frequency on the 9590; had it been on bulk substrate, a 4.0 GHz stock turbo would likely have been all you would have seen.

In recent times, AMD has been able to somewhat offset Intel's process advantage by using better materials to achieve higher clockrates for CPUs. Now, they are using bulk substrate which negates the previous advantage they had in clock speed. Thus their architecture, which runs better at higher clock rates, is now knee capped by a decision to use bulk substrate. This was likely something they were boxed into by fabs who could not master the 28nm FD-SOI node, even though IBM is already producing everything they make except ULP applications on FD-SOI @ 22nm.

The fact that IBM is selling this fab with the licensed technology from STMicro gives that candle you were discussing a bit of a flicker in the flame, with a slight uptick in hope that we might yet see an enthusiast/HEDT class platform from AMD, as their access to fabs clearly killed the shot they had at doing it with this last go around.
 

Cazalan




Then by your definition Intel is doing it wrong too. The substantial modifications are making it quad threaded and expanding from the 64bit SSE3 unit to a 512bit AVX3 unit. So they are homogeneous. There are not separate cores like AMD does with SR+GCN or Jaguar+GCN, or even like what Intel is doing in their mainstream product with Haswell+EU cores. Phi cores actually run Linux. You can SSH into each individual core by IP address.

IBM BG/Q is 18 cores, quad threaded with SIMD vector units.
Intel Phi is ~72 cores, quad threaded with SIMD (AVX3) vector units.

The difference is 45nm (IBM) vs 14nm (Intel) so they're getting a lot more cores in there.
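To put the two lines above side by side, here is a minimal sketch that just multiplies out the hardware thread counts. The core and threads-per-core figures are the ones quoted in the post (the "~72" for Phi is approximate); the dictionary layout and the function name are made up for illustration.

```python
# Hardware threads per chip = cores * threads per core.
# Figures are the ones quoted in the post; "~72" cores for Phi is approximate.
chips = {
    "IBM BG/Q (45nm)": {"cores": 18, "threads_per_core": 4},
    "Intel Phi (14nm)": {"cores": 72, "threads_per_core": 4},
}


def hardware_threads(chip):
    """Total hardware threads exposed by one chip."""
    return chip["cores"] * chip["threads_per_core"]


totals = {name: hardware_threads(spec) for name, spec in chips.items()}
for name, total in totals.items():
    print(f"{name}: {total} hardware threads")

# How many times more threads per chip the Phi figure gives.
ratio = totals["Intel Phi (14nm)"] / totals["IBM BG/Q (45nm)"]
print(f"Phi / BG/Q thread ratio: {ratio:.1f}")
```

Both designs are quad threaded, so the per-chip difference comes entirely from the core count.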
 
I think perhaps you're missing the main point, so I'll say it again: I never even saw the Athlon II X3, which can easily be overclocked to a good 3.6GHz, for more than $70 during that time, and I could always find a 6570/5670 for around $70 on Newegg. You are using listed pricing; I'm using the actual prices that I could always get for them. Throw in the need for cheaper RAM and you can get even more performance per dollar.

You're comparing two products at sale prices vs one product at MSRP. You didn't even read my statement, as I explicitly stated that I use MSRP because sale pricing can be found for everything. You could also find Llano for $120~130 USD and not $140, but that wouldn't be a realistic comparison. I've dismantled your entire statement of "you could buy an Athlon II X3 + 6570 and do more for less than a Llano". I did that because I follow the pricing of products very closely and don't make broad, emotionally generated judgements about products. APUs have their place and it's in an arena that enthusiasts rarely consider. Physics will always limit their maximum potential performance, so we'll never see discrete components go away entirely. Instead they fit the budget and value sectors, which each rely on using low power, low cost components to create a product that is "good enough for government work". From a pricing point of view, that seems to hover right around the $140~150 USD MSRP line.
 

juggernautxtr



"Physics will always limit their maximum potential performance so we'll never see discrete components go away entirely."

But the dedicated components will most likely be priced so high we can hardly touch them if HSA takes off, which it most likely will, because Adobe is already writing to the PPU/GPU as they can get significant gains from it.
The APU will become the better processor even at the enthusiast level, as the GPU/PPU takes on more of the actual compute.
FX most likely won't last more than another 3-4 years, as the APU will just keep getting better as time goes on, unless you're building, say, a $1200+ machine, the way the APUs are going.
The only thing I see staying around is the dGPU, for heavy gaming.
 


That was a total joke. The titles were picked specifically for single threaded performance on single player games. You can tell by comparing the performance between the i3 and i5. The i5 is, quite literally, double the CPU that the i3 is. They also put APU's on there for good measure, but again using a dGPU with an APU defeats the entire purpose.

I didn't like that article as it's way too selective and cherry picking; it gives off the impression that you should pair something like a 780 with an i3 cause "you only need a dual core yo". In actual practice it'll fail hard. If you're going to use a dGPU, go with an i5 or FX-6 or higher.

On the other hand it does show the A8-3870K whopping the sh!t out of the Athlon II X3 450 like I said it would. Another anomaly I noticed is that the 8350 performs a whole helluva lot better than what they have it listed as. The 8350 + 780s I have put up better numbers, so I'm thinking that had to do with a weird ATI driver issue.
 
Bingo. And the 651 is even faster without the GPU. $74.99 at Directron.

It's the exact same CPU. What people gotta remember is that the on-die p-state programming of APUs is done with the iGPU component in mind, so it's very conservative compared to what a dCPU would have. If you're not using the iGPU then it will sit idle and only sip a few watts of power, even less with the GCN ones. You can use K10stat to manually program the p-states (multiplier and voltage) and undervolt + overclock it. My 3550MX is set to run at 2.7GHz whenever TDP allows (it's 2.0GHz base). I could actually get more aggressive but the notebook's cooling can only do so much. Desktops would have better cooling solutions and you can do some really crazy things, like manually configuring two cores to run much higher than they're rated for while forcing the other two to downclock to an undervolted 800MHz to free up power.
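As a rough illustration of why downclocking two cores frees up headroom for the other two, here is a back-of-the-envelope sketch using the common rule of thumb that dynamic power scales with frequency times voltage squared. The frequencies (2.0, 2.7 and 0.8 GHz) come from the post; the voltages are hypothetical placeholders, not measured values for a 3550MX or any other chip, and leakage and the iGPU are ignored.

```python
def relative_dynamic_power(freq_ghz, volts):
    """Relative dynamic power of one core, using the P ~ f * V^2 rule of thumb."""
    return freq_ghz * volts ** 2


def package_power(pstates):
    """Sum relative power over a list of (freq_ghz, volts) pairs, one per core."""
    return sum(relative_dynamic_power(f, v) for f, v in pstates)


# Voltages below are HYPOTHETICAL, chosen only to show the shape of the trade-off.
stock = [(2.0, 1.00)] * 4                       # four cores at base clock
custom = [(2.7, 1.10)] * 2 + [(0.8, 0.80)] * 2  # two boosted, two undervolted

stock_power = package_power(stock)
custom_power = package_power(custom)

print(f"stock:  {stock_power:.3f}")
print(f"custom: {custom_power:.3f}")
print("fits in the stock budget" if custom_power <= stock_power else "over budget")
```

With these made-up voltages the two boosted cores plus the two parked cores land slightly under the all-cores-at-base figure, which is the kind of trade the post describes. Real numbers depend on the actual voltage each P-state needs.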
 

abitoms



In the various reviews, are the CPUs being tested with all services turned off and no AV running in the background? My PC, for example, has an AV running along with a firewall. My assumption is that, all else being equal, a CPU with more cores should better handle the extra load of services and AV/firewall running in the background. So I assume and expect that an 8350 should handle a game better than an i3 when services and AV/firewall run in the background. I think that is how (with AV, firewall, services) most of us use our systems in the real world.

I remember Toms did an article on something like this.
I did a search just now...but could not locate that article...could have been from 3 or 4 years back.

Ok, found something similar but not the in-depth entire article that I am searching for.:
http://www.tomshardware.com/reviews/athlon-ii-x3,2452-11.html
 

truegenius



WOW, that's great, now we have a CPU from AMD whose single core performance is like Phenom, clocks like Phenom, TDP like Phenom, so basically we are back to Phenom. Now they only need to eliminate the multi core performance loss
great performance improvement by AMD :sarcastic:
 

underline mine. that's not correct. kaveri shares its tdp with the big igpu. it delivers better performance despite having to concede tdp and die area to the igpu. and when a dgpu is used as primary, the igpu can act like a heatspreader.
 

juanrga



We can see how the entire industry is massively migrating to bulk and FinFET. Even Glofo is migrating from 32nm SOI to 28/20nm bulk and then to 14nm FinFET bulk.

AMD chose a bulk process for Kaveri because SOI couldn't provide enough density for the iGPU. A hypothetical Kaveri APU on SOI couldn't be as powerful as the actual Kaveri is. Moreover, Glofo has shown that bulk can achieve higher frequencies. Kaveri crosses the 4.5GHz barrier on air, and it is worth recalling that months ago people on forums were predicting Kaveri couldn't hit 4GHz. One particular 'expert' even predicted Kaveri would be clocked under 3GHz. He was plain wrong.

Several experts agree on the difficulties of scaling FDSOI to 7nm. This is why the expected market share for FDSOI at 7nm is an almost nonexistent 1.5%, whereas FinFET on bulk will account for 96.9% of the market.

https://www.semiwiki.com/forum/content/3128-soi-future-flop.html

FinFET on bulk wins as I predicted months ago.

It is worth mentioning that the i5-2500k average OC freq. on air is 5037MHz, whereas it is 4944MHz for the A10-6800k and 4609MHz for the FX-6350. Thus SOI at 32nm doesn't offer a giant advantage in average OC capacity on air over 32nm bulk silicon; on the contrary, the i5 overclocks better. SOI is good for filling the news with broken world records using exotic cooling: LN2 and the like.
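For anyone who wants the gap in percent rather than raw MHz, here is a quick sketch that computes the i5's lead over each AMD part from the three averages quoted above. The MHz figures are the ones in the post; the process labels and variable names are added for illustration.

```python
# Average overclock on air, in MHz, as quoted in the post.
avg_oc_mhz = {
    "i5-2500K (32nm bulk)": 5037,
    "A10-6800K (32nm SOI)": 4944,
    "FX-6350 (32nm SOI)": 4609,
}

baseline = "i5-2500K (32nm bulk)"


def lead_percent(reference_mhz, other_mhz):
    """How much higher reference_mhz is than other_mhz, in percent."""
    return (reference_mhz - other_mhz) / other_mhz * 100.0


# Lead of the i5 average over each of the other chips.
leads = {
    name: lead_percent(avg_oc_mhz[baseline], mhz)
    for name, mhz in avg_oc_mhz.items()
    if name != baseline
}

for name, pct in leads.items():
    print(f"{baseline} leads {name} by {pct:.1f}%")
```

On these figures the i5's lead is small over the A10-6800k and larger over the FX-6350.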




I bolded the relevant part above. Intel considers KL to belong to heterogeneous computing:

Intel_Xeon_Phi_Knights_Landing.jpg

Intel_Xeon_Phi_Knights_Landing_2.jpg


http://news.softpedia.com/news/Intel-Launches-Knights-Landing-Xeon-Phi-PCIe-Coprocessors-361583.shtml

http://www.v3.co.uk/v3-uk/news/2275261/intel-announces-14nm-knights-landing-processor-in-xeon-phi-update

 

ColinAP

This whole thing about IBM's foundry is wishful thinking at its very best.

1) It hasn't been sold yet, and we don't know who the buyer might be; it could be a rival of AMD for all we know - there are more of them than just Intel.

2) We don't know if AMD would even be interested in fabbing chips on that process, as their focus seems to be elsewhere now

3) If they are interested, then it would need to be roadmapped. The lead time in designing a chip for a new process to getting it to retail is - what? - at least 2-3 years?

What we really need to know about is what comes after Excavator. We've known about the full BD - Excavator roadmap since the latter days of the Phenom II, i.e. current gen + 4. Now all we know is current gen + 1, i.e. still Excavator.

You don't bring on board all that chip designing royalty that AMD have without having a plan that extends years into the future. What is it?

They will be designing that next generation from the ground up to be on a bulk process, SOI is DOA for AMD from now on.
 


But keep in mind, as of Vista, background userland processes, such as AV programs and firewalls, are scheduled less frequently than they were on XP and previous OSes. Combine that with the fact that most of them run in "game mode" now, and you get VERY light loading which shouldn't produce a measurable effect on FPS.

Even if your AV does go off during gaming, it's more likely the load on the HDD, rather than the CPU, will be the cause of performance loss.
 


According to the rumor, TSMC and GloFo are the two most likely bidders; who else is in the foundry business these days? Of course, GloFo probably can't even afford IBM's foundries, so that leaves TSMC as the only possible buyer.
 

Cazalan



Yeah, but imagine how much cheaper and smaller an FX-4x00 with 4 SR cores and no L3 (doesn't really need it) would be. About half the die size and less than half the cost. I'm sure there will eventually be Athlon 750K style die harvested 7850Ks with the GPU fused off, but the yields have to be pretty bad to make those worthwhile.

They will be great overclockers but won't earn AMD much money. Intel still makes good money on their i3s because they're not die harvested. They're simply smaller dies.
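The "half the die size, less than half the cost" point can be sketched with a simple Poisson yield model, where yield falls off exponentially with die area. Everything numeric below is a hypothetical placeholder (die areas, defect density, wafer cost), not real AMD or GloFo data, and wafer edge loss is ignored.

```python
import math

# Usable area of a 300 mm wafer in mm^2 (edge loss ignored for simplicity).
WAFER_AREA_MM2 = math.pi * 150.0 ** 2


def cost_per_good_die(die_area_mm2, wafer_cost, defects_per_mm2):
    """Wafer cost divided by the number of working dies, Poisson yield model."""
    dies_per_wafer = WAFER_AREA_MM2 / die_area_mm2
    yield_fraction = math.exp(-die_area_mm2 * defects_per_mm2)
    return wafer_cost / (dies_per_wafer * yield_fraction)


# HYPOTHETICAL inputs, chosen only to illustrate the trend.
WAFER_COST = 5000.0       # arbitrary currency units per wafer
DEFECT_DENSITY = 0.002    # defects per mm^2
BIG_DIE_MM2 = 240.0       # stand-in for an APU-sized die
SMALL_DIE_MM2 = 120.0     # stand-in for a CPU-only die at half the area

big_cost = cost_per_good_die(BIG_DIE_MM2, WAFER_COST, DEFECT_DENSITY)
small_cost = cost_per_good_die(SMALL_DIE_MM2, WAFER_COST, DEFECT_DENSITY)
cost_ratio = small_cost / big_cost

print(f"big die:   {big_cost:.2f} per good die")
print(f"small die: {small_cost:.2f} per good die")
print(f"small / big cost ratio: {cost_ratio:.2f}")
```

Under this model the half-size die costs less than half as much per good die, because it gets twice the candidates per wafer and a better yield on each. How much less depends entirely on the real defect density.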
 

Cazalan



It's always going to be heterogeneous as the primary model is Xeon + Xeon Phi. If you're just working on Xeon Phi then that is homogeneous.

By making it stand alone and removing the PCIe dependency they make it easier to deploy on cards with just a high speed ethernet port. It becomes much more scalable than the 4/8 or so PCIe cards you can fit in on a motherboard.
 