AMD CPU speculation... and expert conjecture

Status
Not open for further replies.

noob2222

Distinguished
Nov 19, 2007


so you found 1 game where a faster-clocked a10-5800k was almost as fast as the 8150. here, check this one out.

http://www.anandtech.com/bench/product/675?vs=434

how many wins is that for the 5800k? ... 2, barely ... out of 8 games, and it loses miserably in 5 of the other 6.

another 8 games to go with that.

http://www.tomshardware.com/reviews/piledriver-k10-cpu-overclocking,3584.html

the 750k, aka the trinity cpu, is found 3rd from the bottom


3 wins out of 16 different games for trinity means kaveri is going to be better than the 8150 ...

how about the other aspect?
try looking at some multithreaded "ordinary" productivity instead of some single-threaded benchmarks where you only use 1/8 of the 8150.
already shown anand's benchmark, so here is tom's.
AMD-Applications-Performance.png


** note: this may not be the 8150, but in multithreaded workloads we all know the 8150 is faster than the 6350.
good luck getting close to the 8150 on that one. trinity is starting dead last in "productivity".
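The "1/8 of the 8150" point is just thread count: a single-threaded benchmark keeps one core busy while the other seven idle. A toy sketch of the difference (the chunk sizes and the eight-worker pool are made-up illustration, not any benchmark above):

```python
from multiprocessing import Pool

def burn(n):
    """CPU-bound busywork: the sum of squares below n."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    chunks = [200_000] * 8                 # one chunk of work per core
    serial = [burn(n) for n in chunks]     # a "single-threaded benchmark": one core busy
    with Pool(processes=8) as pool:        # the same work spread over eight workers
        parallel = pool.map(burn, chunks)
    assert serial == parallel              # same answers, ~8x the hardware in use
```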

but hey, in a few single-threaded apps, that kaveri is going to win, depending on how wonderful GF's bulk 28nm process is.

Who was it that said the 8100 might win "some" ....

anyway, I'll leave you to your opinion that a 4-core kaveri apu is going to own all and be the greatest thing ever created. We will see who is right in 3-4 months.
 

they won't get it right at first.

still.. iirc the ps4 dev kits were powered by amd fx cpus and radeon gfx cards - while the bf4 pc demos were run on intel cpus. multiplayer might look better on the intel ones (or on overclocked fx, if the demo pcs had those).
not much to read into until ps4 and bf4 come out...
 

Cazalan

Distinguished
Sep 4, 2011


It really depends on how profitable the platform is for AMD. I would hope they don't abandon it but if they're sitting on huge inventory the platform just isn't working at the moment for them from a business perspective.

Clearly the MOAR CORES type marketing hasn't worked so well. Focusing on gaming is doing better.
 

hcl123

Honorable
Mar 18, 2013


Hypertransport (HTX) doesn't break hUMA. Besides, hUMA is not yet part of the HSA standards; there is really nothing to specify beyond the IOMMU and cache coherency. Just read

http://hsafoundation.com/standards/

OTOH IOMMU v2 (2.5) is an "official HSA standard". That is the part that delivers what is specified as "same *virtual memory*". And since this IOMMU specification is based on some Hypertransport internals (JUST READ THE PDF), I think cache coherency is facilitated, and it was always AMD's intention to push for cache coherency. So any cache-coherency reference implementation only has to follow what is already in the IOMMU spec; even if no other HSA member adopts the Hypertransport protocol, it's facilitated. Nor will there be any *hardware* cache-coherency standard or coherency-protocol standard (I think), and there is no need for one: others only have to adapt their *CC hardware implementations* to work with the IOMMU (a minor tweak), and even if they don't have CC, they only need the IOMMU; the rest can be implemented in software.

So for now hUMA is an AMD thing, which doesn't prevent other HSA followers from having something identical, even with different hardware CC implementations.

In AMD's case everything is more facilitated, since *EVERY* AMD CPU and APU has, as the Xbar/northbridge inside those chips, a *Hypertransport switch*... yes, since the Athlon64, every CPU and now APU is the same for those internals (system request interface/Xbar/northbridge/possibly several I/O HT links)... it's a masterwork of Jim Keller among others... even if there is an additional onboard PCIe controller for PCIe links, one doesn't invalidate the other; inside the APU, *core to core* communication, including the GPU, is not PCIe but HT-protocol based (the reason for the IOMMU). Besides, everything is tremendously facilitated for integrating PCIe, since HT's data-transmission pin mapping is the same as PCIe's but in reverse: one runs from pin 1 to 20 (example), while on the other pin 20 is pin 1... understand? ... a packet only has to have its order reversed and that is it (which is fast and furious).. lol

That is what makes it possible for the same slot to carry PCIe and HTX: the link PHY only has to detect which type it is and reverse the order. The catch is that HTX needs an additional sideband interface for control, and with that: PCIe/HTX ____________ + sideband_____ (same row) you have a "combo HTX + PCIe" slot that could function with boards of both kinds (one at a time, of course)... understand?
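A toy sketch of that "detect the type and reverse the order" idea (the slot/PHY model below is entirely hypothetical, just illustrating the reversed pin mapping described above):

```python
# Hypothetical model of the combo-slot PHY described above: the same
# physical lanes carry PCIe in one pin order and HT in the reverse
# order, so the PHY only has to flip the lane order for HT mode.
def route_lanes(lanes, mode):
    """Return the lane order presented to the link layer."""
    if mode == "pcie":
        return list(lanes)            # pins 1..n map straight through
    if mode == "htx":
        return list(reversed(lanes))  # pin n becomes pin 1, and so on
    raise ValueError("unknown link mode: " + mode)

lanes = list(range(1, 21))            # the 20-pin example from the post
assert route_lanes(lanes, "pcie")[0] == 1
assert route_lanes(lanes, "htx")[0] == 20
```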

So HTX comes very naturally, even for hUMA; better yet, it is needed for hUMA on discrete adapters, since PCIe won't do: it doesn't have cache coherency. *IF* AMD will ever implement this is another story.



I think that is a misunderstanding. PS4 is different, and in terms of cost it never made any sense to have only GDDR5 for a mainstream APU; a DIMM format for it would drive memory prices through the roof, very different from having GDDR5 devices soldered to a PCB... and the added latency of a DIMM approach would hurt the CPU part noticeably. (Though it could have it, GDDR5 simply was not designed with DIMMs in mind...)

So to me, it was always going to be identical to intel: a DDR3 pool for the CPU and a GDDR5 pool for the GPU, but "interposed on the socket substrate", like some mobile GPU parts of AMD have

http://www.techpowerup.com/img/11-05-03/17a.jpg

Replace that GPU with an APU, and put connections for DDR3 channels there.

It could still get GDDR5 in a Kaveri revision... but since it is only 512 SPs, I think ESRAM is a better bet; even the Wii U has it, it surely will be cheaper, and it provides the GPU with a whole lot more bandwidth than the DDR3 channels do. In the XBone it supports 768 SPs, so for 512 SPs the effect would be even greater... yet the DDR3 channels could still be hUMA (which in the XBone they aren't).

 

hcl123

Honorable
Mar 18, 2013


That convergence of performance will never happen; it's simply hogwash. Today an entry-level desktop costing 5 to 7x less than a laptop can outperform it. This whole "mobile" push is very artificial; most people don't need to be mobile at all... and it is a tremendous cash cow for the whole industry, because consumers are so damn dumb...

If you *really* need to be mobile, that's a different story... but even so, most of those IDMs and retailers like mobile more; the profits are just so much better... while the performance is so much worse, and always will be, because mobile is considerably power-restricted... LOL

So the power mantra: "go half the power or less"... even if with that you save perhaps not even $5 a month (it depends on workload intensity), with the *price difference* of a laptop you could have a top chip, overclocked, for free for 5 or more years, and enjoy >5x the performance for free (edt)... big LOL
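To put rough numbers on that "not even $5 a month" claim (every figure below is an illustrative assumption, not a measurement):

```python
# Illustrative only: the wattage saved, daily usage, electricity price
# and the desktop/laptop price gap are all assumed numbers.
watts_saved   = 65        # assumed power difference under load
hours_per_day = 8
price_per_kwh = 0.12      # USD, assumed

kwh_per_month   = watts_saved * hours_per_day * 30 / 1000   # 15.6 kWh
saved_per_month = kwh_per_month * price_per_kwh             # ~$1.87/month

price_gap      = 700.0    # assumed laptop-vs-desktop price difference
payback_months = price_gap / saved_per_month                # ~374 months, ~31 years
```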

Fashions... not that mobile is not needed, or that the search for better power efficiency is not needed... both are... but fashions..

 

hcl123

Honorable
Mar 18, 2013


Yes, all that is possible... but then you are barking up the wrong tree... it's not the fault of the octo-core, it's the fault of the "SOFTWARE"... when a quad core outperforms a similar octo-core, there is something terribly wrong and inadequate with the "SOFTWARE"...

Yet people will even pay a lot of money for crap software lol... you have to read the thread from the beginning to understand that it is "SOFTWARE" above all that commands performance, and it is software tweaked this way or that way that commands benchmarks... it's all a big psyops marketing campaign... just don't run or pay for crap software (games included), boycott whoever pushes it, and everything will fall onto "sane" tracks.

 

hcl123

Honorable
Mar 18, 2013


What you don't realize is that the very large majority of monitors top out at 60Hz... so it's 60 frames... FPS is another big artificial thing... a psyops... a tempest in a teacup meaning nothing.
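The 60 Hz arithmetic behind that point (the 120 FPS render rate is just an assumed example):

```python
refresh_hz = 60
frame_time_ms = 1000 / refresh_hz          # ~16.7 ms per refresh

# A fixed 60 Hz panel displays at most 60 distinct frames per second,
# whatever the GPU renders; the extra frames never reach the screen.
rendered_fps  = 120                        # assumed render rate
displayed_fps = min(rendered_fps, refresh_hz)
```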

And those game studios committing to "software optimization" for those consoles speaks volumes more than any top hardware. Nothing is absolute, of course, but software optimization is orders of magnitude better than top hardware, of that I'm almost 100% sure. That is why they chose relatively low specs; they might have overestimated in some cases, but the "option" is quite valid nonetheless.

 

hcl123

Honorable
Mar 18, 2013


Sure. Those chips for PS4 and XBone already have HT in the Xbar. Everything is integrated so almost no expansion is needed; in the PS4 even the SSD (flash devices) is soldered to the same mobo PCB, I think... so there will be no "multi-GPU" unless it's soldered to the mobo, and even then, I don't know if it's possible, whether there are any more PCIe or HT or any other kind of links available... which btw would defeat the purpose and the price targets.

This is to play well-optimized games at good enough quality for a low price (PS4 ~$400)... not to engage in pointless e-penis contests. Just remember that a top GPU card *ALONE* can cost almost double a PS4, and you won't get double the performance, not even near it, and that's just a GPU card...

Of course you can't expect pointless Guinness-record breaking... but what else in the PC world do you have to compare for $400?

 

juanrga

Distinguished
BANNED
Mar 19, 2013


The productivity benchmark that I mentioned is multithreaded and puts Kaveri near an FX-8100.

The above graphic is very interesting. Steamroller is expected to be 15-20% faster than Piledriver, which would put the Steamroller quad just behind the six-core FX-6350.
 
i meant in PCs, not consoles. since i mentioned multi-gpu, i thought that was understood. i was thinking: what if the amd cpu/apu used HT instead of pcie to communicate with a couple of gfx cards, e.g. 2x 7950, for pc gaming.
 

8350rocks

Distinguished


Speed of data transfer is the reason. HTX is vastly faster than PCIe, and could be made even faster with some adjustments. Jim Keller was a genius when he put it together... that technology is K8-era and still runs effectively now... think about that. PCIe has seen 3 revisions since then and is still less effective than the last revision of HTX.
 

8350rocks

Distinguished


Consoles typically run at about 30-35 FPS. 60 FPS on a console is most likely entirely out of the realm of reason.

Additionally, the i3 is done: Broadwell will not offer i3s, and neither will Skylake, outside of BGA OEM configurations for low-end business situations where a thin client would actually be in place and a server would do the heavy lifting. So get off of that...

Additionally... multiple threads are the way of the future; it is already happening now. The newest MMOs, which typically have low-end requirements, use 4+ threads. Look at MechWarrior Online... it runs 4 threads and will overrun an i3. The target audience is shifting: more people own quad cores now than own dual cores, and the migration away from dual cores will continue.

The days of the i3 are done. Dual core desktop CPUs have terminal, inoperable cancer, and are on life support until the plug gets pulled. Their day is done, over, finished. Kiss them goodbye, and stop giving gamers recommendations for them in threads where they ask about budget CPUs, because you're not helping anyone.

 

hcl123

Honorable
Mar 18, 2013


Only if and when they have everything set for those combo "HTX + PCIe" slots.

Intel will never have HTX; they could, but they will only implement what they can control to the bone... not even a free, royalty-free standard will do if they don't control it.

With this, having GPGPU designs, some with PCIe and others with HTX, is simply not an option for AMD now; it's just too expensive, and the intel "game" market, a lot of it due to AMD's own fault, will be much bigger than any new AMD game market for the time being... AMD needs PCIe graphics (ironic, since intel never had a GPU of any worth).

So with HTX+PCIe slots ready, and PCIe+HTX interfaces in those GPGPUs and CPUs/APUs, everything is possible. And multi-GPU then is so sweet, because with "Lightweight Notification" there is practically no need for xfire drivers to get a "multi" configuration; everything will be pretty much identical to having multiple CPU chips on a mobo.

And that is a route AMD will take, I think... if people remember the "fusion" slides, their GPGPUs will have preemption/context and interrupt handling identical to a CPU's; they will be like CPUs, with full processor capabilities, and so HT/HTX just makes so damn much sense... multi-CPU + multi-GPU will be possible, seamlessly, in a lot of possible configs.

But if the *IF* is possible and can be likely... the *WHEN* is much more difficult to guess.
(EDITED)



Yes, something terribly wrong with the software. Probably it will not go above 4 threads.

The same applies there: you could do an identical comparison of intel's 4C or 4C/8T against the HEDT 6C/12T... the smaller, weaker chips can win against chips that cost more than double...

Things are terribly berserk with the software for intel too... perhaps next time they will remember not to campaign so heavily for single-thread mantras.
(EDITED)
 

eh..? is that what he did... then why didn't amd replace pcie in their enthusiast lineup sooner with a pcie-ht hybrid slot configuration? i think at least one high end enthusiast motherboard shoulda had a configuration like that, e.g. 990fx-HT or something (edit: seems a perfect config for HEDT but with far, far cheaper cpu choices, even if the mobo's initial cost is relatively high). people woulda scooped those up in no time. imo users woulda preferred a cheaper amd platform like that to intel's x79 platform. why amd didn't do it - doesn't make sense to me. :S

 

8350rocks

Distinguished


Because PCIe is so firmly cemented in among vendors. Unless a new HTX slot also took PCIe cards, it would be an AMD-exclusive or proprietary system, even though Intel could do it. The issue was pointed out above: Intel wants to see nothing from AMD succeed, and it would ultimately fail without large-scale support, even if it were a better system.

Now, as was somewhat discussed earlier, AMD could do a PCIe + HTX type slot on their MBs with the right engineering. That would mean you could feasibly put an HTX card into that slot on AMD MBs and get much better performance. If AMD-specific vendors were willing to do such a thing, it would be an interesting situation.

The GPU would have to have PCIe compatibility on top of HTX for us to see them in any large quantity. If that were possible, it would definitely give AMD an edge that Intel wouldn't have seen coming at all.

EDIT: Or you'd have a situation where there were low numbers of HTX-compatible cards that fit a PCIe slot, and they would be a bit more expensive because of low volume. That would, of course, be if they were not compatible with both.
 
^^ that seems more like management-level incompetence than intel's tactics, since the engineering seems to exist already. amd coulda done a pcie-HT hybrid for the niche hedt sector as much as they did centurion, instead of dragging on supporting pcie 2.0 after 2012. they coulda done it in 2012 and 2013.
edit: hopefully amd has something interesting to counter ivb-e. they publicly said they gave up the performance race....
yet they released centurion.
imo pcie/ht radeons and motherboards woulda been more interesting than centurion.
 

8350rocks

Distinguished


It would likely take a bit of Jim Keller brilliance to get a PCIe + HTX hybrid slot...though it isn't outside the realm of possibility. AMD have already applied for patents for such.
 

hcl123

Honorable
Mar 18, 2013


No, the speed of data transfer *now*, comparing the 5-year-old HT v3.1 with PCIe v3, is not that different. PCIe v3 reduced latency and can have more bandwidth, but it will always be worse in latency because it needs a SERDES while HT doesn't.

But HT v4.0 will be 16 GT/s... that is, it will run at the same speed as PCIe v3, or up to 8 GHz (v3.1 is 3.2 GHz double data rate), and it can have quite enhanced power-management features.

That will put HT in a league of its own, high in the sky.

But the main reason for HTX is exactly that AMD doesn't need to implement or add any other controller to their CPU and APU designs for it. The HT controller has been there for a long time... they only need to tweak what is there. OTOH adding a PCIe controller can be space-consuming... it's bulky... that is probably why the PCIe controller will be on the chipset, and for PCIe slots hanging off the CPU socket (along with HTX for those combos) AMD will implement a PCIe bridge and route PCIe to the chipset over Hypertransport, which has been possible since version 2 of the standard.
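Rough peak-bandwidth arithmetic behind that comparison (the transfer rates are the ones quoted in the post; the 32-bit HT link width and ignoring HT protocol overhead are my assumptions):

```python
# Peak one-way bandwidth in GB/s, under the assumptions stated above.
def ht_bandwidth(gt_per_s, width_bits):
    return gt_per_s * width_bits / 8

def pcie3_bandwidth(lanes):
    # PCIe 3.0: 8 GT/s per lane with 128b/130b encoding
    return 8.0 * lanes * (128 / 130) / 8

ht31   = ht_bandwidth(6.4, 32)    # HT 3.1: 3.2 GHz DDR -> 6.4 GT/s -> 25.6 GB/s
ht40   = ht_bandwidth(16.0, 32)   # the post's HT 4.0 figure -> 64 GB/s
pcie16 = pcie3_bandwidth(16)      # PCIe 3.0 x16 -> ~15.75 GB/s
```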

I can see quite a lot of advantages.

 

hcl123

Honorable
Mar 18, 2013


Only now it makes real sense.

HTX has been implemented since a looong time ago in server/HPC boards... but never for GPUs. And that was a sensible choice, somehow: GPUs were and still are pure devices that need quite bloated drivers independent of the interconnect, and independent of the interconnect the performance-driving feature is exactly the drivers (SOFTWARE), not the interconnect (CPUs need no drivers, so the opposite can kind of apply). Having an HTX interface that only AMD could use would have served only as a lock-in tactic; it would have been smart in the winning phase of the Athlon64, but there wasn't yet an AMD-ATi then... and it would have been an expensive lock-in tactic: after the loss of the Athlon64 mojo, that fact would have made the large majority of GPU cards sold PCIe, without any notable performance difference for HTX (the lack of hUMA or NUMA or similar processor features would have dictated this). (EDT)

Now, with hUMA and with GPUs transformed into pure processors, HTX for GPUs makes terribly more sense than PCIe. To alleviate the isolation against intel and to reduce costs, HTX+PCIe slots were only now invented; the patents (some of them) are still in the application phase, not yet granted... so it will take some more time.

 

hcl123

Honorable
Mar 18, 2013


Yes that is exactly the big twist to reduce costs. The GPU cards will have

PCIe+HTX______________ + Sideband_______ (same row) can you imagine ?

Go for an intel platform, which is PCIe-only ______________ : only the first part of the interface will be slotted, and so the card would function as a normal PCIe card... if both parts of the interface are slotted, then the card functions as HTX. But this will be only on AMD platforms for sure (it would be a jaw-dropping surprise to me if intel implements HTX).

If this is not possible, there will never be HTX GPU cards; simple economics dictates failure. That is why HTX+PCIe slots are only half of the story; inside those GPUs there must be interfaces identical to those of today's CPUs/APUs. That's why I'm so curious about Volcanic Islands, to see if they have started that move... stupid e-penis perf bickering is so damn uninteresting. To me VI could use HT, even if not yet ready for those combos. With this AMD could simply make a single-chip GPU out of 2 parts in an MCM, and so beating Titan to a ridiculous Titanic wouldn't require more SPs or this or that or crap... it would only need this... and it would be single chip vs single chip (and yes, they could still have an X2-like card; don't AMD servers have 2 sockets of MCM parts?)

That is the part of the ATi legacy I'm curious to see whether Jim Keller and Gustafson can change... Nvidia's approach is simply very stupid, even more so when silicon prices are escalating... you don't need bigger chips, more SPs, etc... you need several chips working seamlessly together, and HT can provide that. For starters it makes yields several times better, and so no GPU card should be above $500.

[ UPDATE : "then it would definitely give AMD an edge that Intel wouldn't have seen coming at all." ... oh yes, they have suspected... why do you think Broadwell is mostly BGA? ... they will lose something with this, and they will give hard times to their Nvidia partners, but they will not give any leverage for anything like this. ]
 

8350rocks

Distinguished
Thought I would share this...

http://www.hotchips.org/

3:20 PM

Processors 2

Hardware-level Thread Migration in a 110-core Shared-Memory Processor

Mieszko Lis, Keun Sup Shim, Brandon Cho, Ilia Lebedev and Srinivas Devadas

MIT

Whoever said 100-core CPUs were way off in the distance may have been entirely wrong...

Seems MIT is working on such a threading solution for large-scale many-core CPUs. Wouldn't that be interesting? The coding would be nightmarish at first, though once the software industry takes a step back and begins to think about how to solve problems in parallel, it may eventually* facilitate such solutions.

*Note: I am not saying this is happening now, or tomorrow, or even this year. Merely that we are clearly heading in that direction at the limit of current technology's capabilities. Barring the discovery of room-temperature superconductors, there would really be no other obvious way to get the x86 architecture to do more work beyond adding the capability to process more threads simultaneously. Obviously it takes a great deal of work to even get to a point where a program could sufficiently use such a massive number of threads. However, I could see something like an HPC system utilizing such a solution for flow dynamics or protein synthesis, or some other massively parallel task that could use such an absurd number of resources now...
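A minimal sketch of the kind of embarrassingly parallel workload that could soak up that many cores (the Monte Carlo example and the worker/task counts are my illustration, nothing to do with MIT's design):

```python
from multiprocessing import Pool
import random

def estimate_pi(seed, samples=100_000):
    """One worker's Monte Carlo estimate of pi (a stand-in parallel task)."""
    rng = random.Random(seed)                 # seeded for repeatability
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
               for _ in range(samples))
    return 4.0 * hits / samples

if __name__ == "__main__":
    # Splitting the problem is the hard part; after that, the worker
    # count is just a number, whether it is 8 or 110.
    with Pool(processes=8) as pool:
        estimates = pool.map(estimate_pi, range(32))
    pi_estimate = sum(estimates) / len(estimates)
    assert 3.10 < pi_estimate < 3.18
```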
 

Ags1

Honorable
Apr 26, 2012
100 threads nightmarish? I'm not so sure. Once you've got 4, 6 or 8 threads working correctly, scaling up to 100 threads is not such a big deal - all the technical issues have been solved, you're just doing more of the same.
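That point in code form: once the work is split correctly, the worker count is just a parameter (the word-counting task below is an invented example):

```python
from concurrent.futures import ThreadPoolExecutor

def process(chunk):
    """Per-worker unit of work: count the words in one chunk of text."""
    return len(chunk.split())

def total_words(chunks, workers):
    # Correct at 4 workers and at 100: nothing changes but the number.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process, chunks))

chunks = ["the quick brown fox"] * 50
assert total_words(chunks, 4) == total_words(chunks, 100) == 200
```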
 