Discussion: AMD Ryzen

sarwar_r87 · Jan 21, 2017

Respectfully, a 1060 performs better than a 390; does that make it high end? You have to compare the RX 480 with the 1000 series, as they were released within months of each other.

You do not have to take my word for it:
http://www.tomshardware.co.uk/amd-polaris-10-mainstream-gpu,news-53093.html

sarinaide · Jan 21, 2017

I agree with the reasoning of posters like Catmerc. Some 2.5 to 3 years ago AMD targeted 40%, but as the silicon evolved and the uArch took shape, AMD were surprised by the gains. Apparently the F4 samples are showing extreme potential with clock speed, and AMD themselves may be taken aback. People I know who are privy to samples are saying that Ryzen is very relevant, so relevant that Intel have taken strong notice of it. This would be consistent with the release of a K-series i3 and HTT on Pentiums, as Intel look to hang onto the lower markets against AMD's impending threat from 4C/4T Ryzen parts and Raven Ridge APUs, which should be quite strong given L3 advances and a far stronger uArch to build on. I wouldn't be surprised if the iGPU practically doubles the performance of the 7890K's iGPU. It will be a trench fight for sure.

simon12 · Jan 21, 2017

Do you all think AMD has finalised the clock speeds 100% or could the full production chips be a bit better or worse than they are expecting still? Are all price guesses pure speculation at the moment or is there anything official or leaked?

sarinaide · Jan 21, 2017

sarwar_r87 :

A 1050 Ti is close to the performance of a GTX 670, and in 2012 the GTX 670 was the best value-for-money card on the market. The 390 is old like the 670 is to the 1050ti. The market was the 900 series and that is what the 480 did well. It is for the most part a very good graphics card, if just a little underpowered for modern 4K standards.

sarinaide · Jan 21, 2017

simon12 :

It will be finalised soon but not yet, F4 is out so next is QS which is likely what will retail.

sarwar_r87 · Jan 21, 2017

sarinaide :

AMD Radeon 400 29 June 2016
Radeon R9 390 18 June 2015 or 11/05/13 if you consider it to be an overclocked r9 290x
AMD Fury X 06/24/15

GTX 1060 06/10/2016
GeForce GTX 980 September 18, 2014
GeForce 600 March 22, 2012

1. "390 is old like the 670 is to the 1050ti". No, it's not.
2. "900 series and that is what the 480 did well". The 1070 was released weeks before the RX 480, so clearly it was not.

3. The GTX 980 was launched before the 390 and the Fury X, and 1.5 years before the 480.

sarwar_r87 · Jan 21, 2017

Rogue Leader :

Just FYI there is no rule against speculation or opinions specific to Ryzen; that's what this thread is for, discussion of such things. The rule is against attacking people personally for their opinions or speculation (or facts for that matter), as well as taking said opinions and speculation down a rabbit hole away from the topic of the thread. Please do not submit alerts because you don't like other people's opinions, unless said opinion personally attacks or insults you or someone else.

If you do not understand this very simple concept please PM me. If you do not feel you can adhere to it I suggest leaving the conversation. If you do not adhere to said rule we will exit you from the conversation, permanently.

Do we have rules against misinformation/misrepresentation of facts, though?

Rogue Leader · Jan 21, 2017

sarwar_r87 :

I cover that under "taking said opinions and speculation down a rabbit hole". Should misinformation be posted, we ask and expect that everyone can rebut such information WITHOUT personal attacks. This is still a discussion; maybe that person just doesn't understand what they are reading. If the person continues to repeat such information in a spam-like manner they will, of course, be removed.

Saga Lout · Jan 21, 2017

sarwar_r87 :

If there was one, most Intel -v- AMD threads would have died off years ago. 😀

juanrga · Jan 21, 2017

Both are compatible. If Zen is 40% above Excavator (AMD's official statement at Hot Chips), then Zen IPC is below Sandy Bridge:

74.60 * 1.4 = 104.44.

But Lisa Su said at New Horizon that the original target was surpassed and that it is now "greater than 40%". I take that as something between 40% and 50%. If the IPC is 50% above Excavator, then Zen is about 4% above Sandy Bridge:

74.60 * 1.5 = 111.9.

The middle point, 45% over Excavator, gives exactly Sandy Bridge IPC.
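For anyone who wants to check the arithmetic, here is a minimal Python sketch of the scaling above. It only assumes the 74.60 Excavator score quoted in this post; the function and variable names are made up for illustration.

```python
# Excavator single-thread score quoted in the post (arbitrary benchmark units).
EXCAVATOR_SCORE = 74.60


def scaled_score(baseline: float, uplift_pct: float) -> float:
    """Return the baseline score raised by uplift_pct percent."""
    return baseline * (1 + uplift_pct / 100)


# Project the score for the 40%, 45% and 50% uplift cases discussed above.
projections = {
    pct: round(scaled_score(EXCAVATOR_SCORE, pct), 2) for pct in (40, 45, 50)
}

for pct, score in projections.items():
    print(f"+{pct}% over Excavator -> {score}")
```

The 45% case is the one the post equates with Sandy Bridge, so the other two can be read as a bracket around that reference point.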

That Zen IPC will be closer to Sandy Bridge than to Haswell is my opinion, unless someone shows a CBST benchmark that demonstrates otherwise.

It is an estimate based on three assumptions: (i) Zen has about the same throughput as Broadwell, (ii) Zen has about the same SMT yield as Broadwell, and (iii) the IPC gap between Broadwell and Excavator is about 55%.

The main goal of clustering those structures is not to save power, because they aren't idle long enough to power them off. The main goal of the clustering technique is to reduce the critical path of the circuitry to allow higher clocks.

The mention of AVX2 is misguided as well. On vector-like work the throughput is almost linear with the width of the execution pipe. That is why we can combine two or more units to execute a wider vector in the same cycle. This is not what I meant, and it is not what the author of that post meant. We are referring to the ability to extract more ILP from a sequence of instructions and execute them OoO. In this case the throughput is not linear in the size of the window of instructions being scheduled. That is the reason why he correctly notices that a unified scheduler (like in Intel designs) has a penalty in both perf/watt and perf/mm; indeed, the penalty scales at least like O(N²). Distributed schedulers (like in Zen) have better perf/watt and perf/mm, but at the cost of generating a less optimal schedule for a single sequence of instructions, which in the end means a lower IPC. That is the reason why he thinks that Zen will have less IPC than Haswell/Broadwell and why he thinks that AMD will have higher gains with SMT due to the more distributed nature of the muarch.

I agree with him. Of course, only a full review and detailed analysis of the Zen muarch will prove/disprove this.
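To make the O(N²) point concrete, here is a toy Python cost model. It is only a sketch under a stated assumption: the cost of a scheduler is taken to grow with the square of the number of entries it has to compare, and a distributed design splits the same total window into equal queues. The entry and queue counts are hypothetical, not the actual figures of any Intel or AMD core, and the model says nothing about how much IPC is lost by splitting.

```python
def unified_cost(entries: int) -> int:
    """Toy cost of one scheduler window, assumed to grow as entries squared."""
    return entries ** 2


def distributed_cost(entries: int, queues: int) -> int:
    """Toy cost of the same total window split evenly across several queues."""
    per_queue = entries // queues  # assumes entries divides evenly
    return queues * per_queue ** 2


# Hypothetical sizes, chosen only to show the trend.
TOTAL_ENTRIES = 60
QUEUES = 6

print("unified    :", unified_cost(TOTAL_ENTRIES))
print("distributed:", distributed_cost(TOTAL_ENTRIES, QUEUES))
```

Under this assumption, splitting N entries into k queues cuts the cost by a factor of k, which is the perf/watt and perf/mm advantage described above; the price is that each queue only sees its own slice of the instruction window.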

juanrga · Jan 21, 2017

sarwar_r87 :

If I recall correctly AMD only said that different SKUs will be available on launch. No one confirmed that both 8C and 4C will be available at launch. Therefore both claims are compatible.

No one said that a BIOS bug is the reason for the delay of the 4C part. What is being said is that a BIOS bug is the reason for this last delay of the launch of Zen, because the 8C Zen CPUs are ready, but mobos aren't.

I don't know why the 4C CPUs would be launched later, but three different sources are saying the same now. BitsAndChips and CPCHardware also reported the rumor of a delay for the 4C CPUs.

juanrga · Jan 21, 2017

sarinaide :

Untrue, Salgado18 asked me about the possibility of a 4C/8T 140W part:

salgado18 :

salgado18 · Jan 21, 2017

juanrga :

Was about to say that. Glad for the answer, and sad it's not viable.

sarinaide talked about a 6c/12t middle-range product, and we already discussed how there will only be 4 and 8-core parts. But it raised a question: I didn't see any discussion about 8c/8t and 4c/4t, without SMT. Is that possible or viable? Do you guys have any rumour/info on that?

sarinaide · Jan 21, 2017

It would be interesting to see how it performs relative to K10.

Phenom II X4 965: 89 @ 3.4 GHz / 79 @ 3 GHz

40% = 110 / 50% = 119

Since a Phenom II X4 has L3 it appears the hit is not substantial but enough to impact slightly

juanrga · Jan 21, 2017

salgado18 :

Yes, models without SMT are viable. Personally I was expecting the following segmentation:

SR3: 4C/8T
SR5: 8C/8T
SR7: 8C/16T

A 4C/4T engineering sample is confirmed by CPCHardware. They comment that it may make it into commercial chips.

https://twitter.com/CPCHardware/status/818932115270209537

sarinaide · Jan 21, 2017

I was running through some benches of no L3 vs L3. Given that Zen is radically different from Bulldozer, it is not easy to equate shared resources and an L3 tied to the NB with exclusive caches and an FSB. The cumulative difference between L3 and no L3 was 10-12% in single-threaded games like Civilization and GTA.

Assume 75: factoring in at least a 10% penalty puts Excavator at around K10 performance, ~82.5, and again the outcome changes.

40% - 115.5
50% - 123.75
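A quick Python sketch of the same adjusted-baseline arithmetic, for anyone who wants to try other penalties. The 75 baseline and the 10% L3 adjustment are the figures from this post; the function name and its parameters are hypothetical.

```python
def project(baseline: float, adjust_pct: float, uplifts=(40, 50)):
    """Scale a baseline by adjust_pct, then apply each IPC uplift to the result."""
    adjusted = baseline * (1 + adjust_pct / 100)
    projected = {u: adjusted * (1 + u / 100) for u in uplifts}
    return adjusted, projected


# 75 as the Excavator score, raised 10% to approximate a part with L3.
adjusted, projected = project(75, 10)

print(f"adjusted baseline: {adjusted:.2f}")
for uplift, score in projected.items():
    print(f"+{uplift}% -> {score:.2f}")
```

Changing the second argument to 12 covers the upper end of the 10-12% range mentioned above.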

IMC and cache overhauls were the major evolutionary performance leap from Nehalem to Sandy Bridge. I think AMD will make a similar leap, just with more resources than Sandy Bridge had at that stage.

sarwar_r87 · Jan 21, 2017

juanrga :

obviously. my point still stands.

juanrga :

That you can reach a higher clock is true, but only for multiplication and square roots. More importantly, the CPU clock will still be limited by the AVX unit, so a gain from this is not very relevant. That is why Intel chose to implement a negative AVX2 clock offset in the BIOS to allow KL to hit 5 GHz.

Also, as AMD mentioned, their goal is power.

Sorry, but can any claim that you made in the last paragraph regarding lower IPC be backed by maths? Unfortunately there is nothing that can confirm your assertion. Why? Because it has no logic and there is no scientific paper illustrating your claims, not on ScienceDirect anyway.

... That is the reason why he thinks that Zen will have less IPC than Haswell/Broadwell and why he thinks that AMD will have higher gains with SMT due to the more distributed nature of the muarch.

Even if what he said was right, which I would dispute with him directly if he were here, you are doing a cross-uarch/vendor comparison, which cannot be done. What I mean is, if there were two versions of Zen, one with a static and another with a dynamic scheduler, you could prove or disprove that theory, but not in this case, I am afraid. Too many other, more important things are not the same.

I.e. you cannot make that claim, unfortunately.

sarinaide · Jan 21, 2017

The shift in uarch design alone can bring about performance gains that are not easily quantified. Shared resources in a module design, together with an L3 that was slower than the memory interface, caused most of Bulldozer's woes and are why K10 is still a faster uarch. Zen is what AMD needed to follow K10: fewer shared resources and a focus on performance and efficiency. The biggest changes have all been related to cache and memory, which were slower on Bulldozer than on Phenom II, which is why Thuban/Deneb was very close to the performance of Nehalem despite using an old IMC and aging chipset.

Improvements to a uarch are not linear. AMD targeted a 40%+ gain as a minimum, but changes can yield beyond that. The 40% seems to be used in the "at least" sense, but in reality the complete redesign could be comparable to Mercedes in the new turbo era vs Mercedes in the V8 era of Formula One: they nailed the money shot with the turbos.

I still think performance across the board is good, and the general sentiment that follows is that finally there is competition again.

os2wiz · Jan 21, 2017

sarinaide :

I am in sync with your assessment of things. I have a question which is only peripherally related. I just bought an EK X360 water cooling kit from Newegg. I got it at a bargain price of $299.99 instead of the normal $389.99. I know EK has promised AM4 compatibility for Ryzen, but I have no idea if this kit uses brackets or has another means of achieving compatibility. I have been unable to find out yet from EK whether they will ship me an AM4 bracket separately if required, or if the kit is already compatible out of the box. I know from their website the Predator AM4 bracket is available for $8.99, but this is not the Predator.

cdrkf · Jan 21, 2017

juanrga :

I actually think the reason for no 4C CPU at launch is simple: it's a business / marketing decision. I can't remember where I read it now, but I'm certain last year there was a quote from either Raja or Lisa Su regarding the fact that AMD were planning to maintain a series of staggered launches throughout the year, I'm guessing in a bid to help bolster the stock price and keep AMD in the news more. It's an approach that appears to be working, as I think it's at least one reason why AMD's stock is looking so strong. Remember that stock market analysts and many investors have little in-depth knowledge of the products. AMD being mentioned on a regular basis is likely to give a better perception: "oh look, ANOTHER new AMD product, wow they are innovative"... If there is one thing I've learned keeping an eye on stock prices, it is that the price of a stock is 99% about *perception* of a company. I think this is AMD's way of playing the game a bit more on stocks, and the stock prices are looking better for it (obviously the fact they had a good launch with Polaris and the fact Zen looks promising aren't hurting either, but I do think in the past they have missed a trick in terms of how they presented / launched their products).

Rogue Leader · Jan 21, 2017

Gentlemen,

Due to violating the warnings we JUST POSTED os2wiz is no longer with us (If you're looking for it, his post was deleted FYI). We were not kidding when we posted our latest warnings and our most recent bans in December.

Do not insult each other, do not call each other out, do not accuse people of being paid off by Intel, do not tell people they need to learn math, do not call people simple minded.

I don't know how much clearer I can be here. If you have questions please PM me, but please do not make yourself next on the list.

sarwar_r87 · Jan 21, 2017

juanrga :

Not sure I follow your post.
If I recall, they did confirm that it will not be a paper launch. I thought different SKUs were categorized by core count.

Of course some unforeseen things might have cropped up, forcing the delay. It will not be the first time, nor will it be the last. Or, as cdrkf says, it could be their aim to dominate the news headlines. Or maybe they realized that the 4C modules can easily be unlocked to 8C modules by a BIOS hack. Only time will tell.

8350rocks · Jan 21, 2017

os2wiz :

This is borrowed tech from the discovery tablet a few years back. The discovery tablet had the ability to scale the processor speed based on the temperature of the skin of the tablet to maintain comfort for the user. This tech has been adapted to HEDT processors and allows you to run the processor at any speed up to a set maximum as long as the cooling maintains the specified temperatures.

In other words, with high end cooling, you could potentially adjust your maximum turbo clock to 4.4 GHz and the CPU would run as fast as it could and still maintain temps, even reducing to lower clocks within the turbo bracket if necessary to maintain temperatures. The tech itself is actually quite interesting...I had the opportunity to get a hands on with the discovery tablet...it was quite an impressive design.

8350rocks · Jan 21, 2017

juanrga :


http://www.cpu-monkey.com/en/compare_cpu-intel_core_i7_2600k-6-vs-amd_fx_8350-7

How do you conclude that Excavator is 40% behind Sandy Bridge when Piledriver was only 25% behind Sandy Bridge in single thread?

If Excavator was indeed 20% faster than the Piledriver cores, and data points to this being accurate, then Excavator is nearly at Sandy Bridge performance already.

If you take a worst case, and assume that Sandy Bridge is some ridiculous number, like 30-35% faster...then we are looking at Excavator cutting that down to a 10-15% difference.

Since you are quick to point out that AMD said the gain was measured from Excavator, a 40% improvement would literally put them at minimum 25-30% faster than Sandy Bridge, which puts them somewhere in the realm of Haswell. If it really is 55%, then we might honestly be looking at Broadwell-level performance in single core, or possibly even SL/KL.

I think it is likely around Haswell, personally...however...I have seen no benchmarks of any single-threaded application showing PD to be anywhere close to 50% behind SB, much less EX. If you have some information that shows EX 50% behind SB, please present it for the group to review.

EDIT: This is the only test I could find that showed clock per clock comparison.

http://www.kitguru.net/wp-content/uploads/2012/10/cinebench10.png

This shows a 2600K and the 8350, both at 4.6 GHz.

The 2600K scores 9.05 and the 8350 scores 7.56.

That is a 17% gap...which is actually even smaller than the other link claims.
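Since "X% behind" and "X% faster" are not the same number, a small Python sketch may help when comparing figures in this thread. It uses only the two Cinebench scores quoted above; the helper names are made up for illustration.

```python
def pct_behind(slower: float, faster: float) -> float:
    """How far the slower score trails the faster one, as a percent of the faster."""
    return (1 - slower / faster) * 100


def pct_ahead(faster: float, slower: float) -> float:
    """How far the faster score leads the slower one, as a percent of the slower."""
    return (faster / slower - 1) * 100


# Scores quoted above, both chips at the same clock.
SCORE_2600K = 9.05
SCORE_8350 = 7.56

behind = pct_behind(SCORE_8350, SCORE_2600K)
ahead = pct_ahead(SCORE_2600K, SCORE_8350)

print(f"8350 is {behind:.1f}% behind the 2600K")
print(f"2600K is {ahead:.1f}% ahead of the 8350")
```

The two results differ by a few points, which matters when one post quotes a gap measured from the AMD part and another measures it from the Intel part.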

sarwar_r87 · Jan 21, 2017

8350rocks :


I think the problem here is that he is extrapolating from the Athlon 845, in which case clock for clock AMD is roughly 40% behind SB.

http://www.cpu-monkey.com/en/compare_cpu-intel_core_i7_2600k-6-vs-amd_athlon_x4_845-624

I have given up debating with him why AMD would choose a low-end APU as the baseline for its latest generation, and the fact that this compares a core with 4 int and 2 fp units to a processor with 4 of each.

But in his defence, AMD did leave it vague.

In my view, one has to compute the percentage improvement from a Piledriver APU to an XV APU, which is 10% in CM ST, and factor it into the Piledriver number, like you did.

Or compute the improvement in latencies and throughput between SR and BR as published in the micro benchmark by CPC, and use that to extrapolate from the IPC of the A12 9800. This lands Zen 9% below KL IPC, or just about on par with Haswell.
