Discussion AMD Ryzen MegaThread! FAQ and Resources



Actually, in the old thread I pointed out quite often that IPC as it is used here is inaccurate at best, and lazily ignorant at worst.

Yuka raises a valid point.

IPC in Ryzen is not more than 6% behind Kaby Lake per AMD themselves. AMD know far better than we do what their processor is capable of...I mean...they did design the damned thing after all, right?

Ultimately, performance of the platform as a whole is the correct measure. Platform performance takes into consideration lots of different things, and assuming the architecture is deficient in one area when it may not truthfully be deficient there is no different than a mechanic telling you that you need to replace the condenser coil in the A/C of your car when you really need to recharge the R-134a refrigerant.

The issue here is that many do not understand or differentiate the components of performance, even some of the self-proclaimed experts. If they did, you would see far more discussion about the impact of memory on CCX communication, the implications of the IMC on Ryzen forcing 1T timings on memory, and the fact that Ryzen performs amazingly well in applications that do not require much thread jumping (even exceeding projections in some cases), but significantly worse when the scheduler tends to push threads to a different CCX for various reasons.

All of those factors come into play more significantly on Ryzen than actual IPC. The benchmarks show when a thread is static on a core that it can run with the best that Intel has to offer barring clockspeed deficits.

So, the long answer is noted above; the short answer is that IPC has been used egregiously out of place in most of the discussions about Ryzen. IPC has become a buzzword, and it is inaccurate in 95% of the cases where it is used, simply because lots of armchair engineers do not know a better way to quantify the gaps, or to reference the phenomena they see intelligently, beyond that buzzword that engineers loathe to see in discussions like this.

Yuka is correct, performance is the correct term, and IPC is tossed around far too casually here and many other places. It annoys me as well, if I am totally honest, and I try to ignore it, but the point stands. He is 100% accurate in this case.
 
No offence, but the last person or company I'll listen to when it comes to how great their product is, is the company itself. AMD, Nvidia and Intel really do talk more marketing than facts, and they will always try to show their product in the best light, for example AMD's Ryzen gaming demos.

Yuka is right about using the word performance over IPC. I wish many would do the same, but I do believe that is what people mean when discussing IPC.

If they are wrong (and they are) about the definition, it still doesn't change the fact that performance per cycle in games is lower with Ryzen than with Haswell, let alone Kaby Lake, and it's easily measurable in some titles. It was already discussed that this is in part due to the latency of communication between the CCXs.

We also discussed why you don't write a program to stick to specific cores instead of letting the OS scheduler handle it: doing so can cause a lot of issues.

If I had to name the logical fallacy I believe is being used, it's almost a red herring: it's not the original claim but a whole different subject altogether, and that is terminology.

Most people relate IPC to performance per cycle: reviewers, and heck, probably even AMD themselves in the marketing slides, so it relates back to the tech reviewer.
 
IIRC, the AMD guy said Ryzen is 7% behind Kaby Lake. Well, that's around Haswell's level, which is 6% behind Skylake/Kaby Lake on average.

That being said, Ryzen looks worse in professional work than Haswell on average. Sometimes, it even bests Broadwell, but eh. Ryzen's IPC might be on par with Haswell, but its gaming performance is more in line with Ivy Bridge at best, which isn't a bad thing at all.
 
I think what they're claiming is 6.8% behind Kaby Lake. But hey, we're really splitting hairs here. I think with the design of Ryzen the actual IPC is just harder to measure with complete accuracy, because of the CCXs, the "Non-Uniform Memory Access" behaviour, and probably the Infinity Fabric, which is running at all different speeds as well. We're getting fluctuating results, sometimes even with the same RAM and fabric speed. It certainly doesn't seem to be an exact science, so IPC is probably not the best term for sure. Performance does sound more fitting.

Does it make a difference when measuring the 4 core, 4 thread version, as there would be no extra threads and only one CCX? Just thinking out loud here. I know it's supposed to be single core testing, but even if Windows itself causes a cache copy it may affect the test results, causing random different outcomes.
We don't really know what's going on under the hood, as it were, just yet, but we're getting more information all the time I guess.
 


The wall is for x86. AMD is hitting the same wall as Intel. Zen+ (Zen2) will bring minor improvements atop Zen.
 


Multithreading doesn't imply it has to use 16 threads. Something can be multithreaded and use only four threads, for instance. Performance is a combination of IPC, clocks, and number of cores.
 


It does kinda look like they were holding back a bit on their roadmap anyway, as they seem to have effortlessly moved the whole thing forward all of a sudden...
 


It is a bit weird when Arstechnica's Haswell chip gets slightly higher scores than the Broadwell chip in several of the benches.
 


I'm not sure there's anything in the works for 'post x86'- apart from the existing x86-64 extensions (which AMD developed and licence back to Intel).

Many people have been predicting architectures like ARM v8 will replace x86 / x86-64 as they are 'inherently more streamlined', however the reality is no one has yet produced a processor that scales up to the performance levels (especially in single thread) that you can achieve with a modern AMD or Intel x86-64 CPU core. ARM is very efficient with small cores; its advantage doesn't appear to materialise when you scale things up yet. That said, AMD do have an ARM licence and do have a high performance ARM v8 core in the wings known as 'K12', which is supposedly very similar to Zen but based around ARM instead. I think it's notable though that they *haven't released it* and instead stuck with Zen for both consumer and servers. If it offered significant performance / efficiency advantages they'd have pushed it first, I'm sure...
 
In theory, Intel could decide to just drop legacy x86 features, such as all the hardware necessary for 16-bit operations, 8086 protected mode, and the like that isn't used that much anymore, or even drop the x86 (32-bit) hardware and make their chips 64-bit only. That would free up some die space and streamline the architecture somewhat, though at the cost of software compatibility.

Other architectures are better designed than x86, but the problem is basically Windows and its software library. The last serious attempt to move on was Itanium, and AMD basically killed that by extending the x86 instruction set. I'll say it again: We'd be in a far better place if AMD never put out x86-64.
 


I've thought about that many times, and still wonder why can't software emulation at the driver level handle legacy instructions.
 
Itanium was inherently weak because they never got the compiler working too well despite working on it until just recently, so I doubt we'd be in a better place. VLIW-based archs are tough to use with complex work, that's why AMD moved away from it with GCN for their GPUs to improve compute performance. While emulation of legacy code after removing hardware compatibility for that legacy code might help shift the wall in x86 performance (I really doubt it matters much, that hardware is probably very small), Itanium was not the way to go.
 


IPC is part of performance. IPC is not only relevant for single thread. Multithread performance depends on IPC as well. Performance depends on IPC, numbers of cores, and clocks. It depends on SMT yields also if you don't count SMT as part of IPC.
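
As a rough illustration of that breakdown, here is a minimal Python sketch. It assumes perfect scaling across cores, which real workloads rarely achieve, and both the function name and the chip numbers are made up for the example rather than measured from any real CPU.

```python
def relative_throughput(ipc, clock_ghz, cores, smt_yield=1.0):
    """Idealized throughput in arbitrary units.

    Assumes work scales perfectly with core count, which is an
    optimistic simplification. smt_yield is the extra throughput
    from SMT (1.0 means none, 1.25 would mean +25%).
    """
    return ipc * clock_ghz * cores * smt_yield


# Hypothetical chips, not real measurements.
quad_st = relative_throughput(ipc=1.00, clock_ghz=4.0, cores=1)
octo_st = relative_throughput(ipc=0.90, clock_ghz=3.6, cores=1)
quad_mt = relative_throughput(ipc=1.00, clock_ghz=4.0, cores=4)
octo_mt = relative_throughput(ipc=0.90, clock_ghz=3.6, cores=8)

print(f"single thread: quad {quad_st:.2f} vs octo {octo_st:.2f}")
print(f"all cores:     quad {quad_mt:.2f} vs octo {octo_mt:.2f}")
```

In this toy model the lower-IPC, lower-clocked chip loses on a single thread but wins once all cores are loaded, because IPC enters both results in the same way.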

CineBench is slightly favoring RyZen. That is the reason why the RyZen-Broadwell gap is smaller on CB than in the average. On average, the gap is bigger. I have provided 12 compute benches and 8 game benches of RyZen vs Broadwell vs Piledriver, all of them at 3GHz. Clock-for-clock RyZen is about 10% behind Broadwell on compute (259.5 / 235.8) and about 20% behind Broadwell on games (209.0 / 173.3).
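
For anyone who wants to check the arithmetic, here is a small Python sketch using only the aggregate scores quoted above. The function name is just for illustration. Note that "Broadwell is X% ahead" and "RyZen is Y% behind" come out as slightly different percentages, because they use different baselines; the ratios in parentheses above are the "ahead" form.

```python
def relative_gap(leader, trailer):
    """Return (how far the leader is ahead, how far the trailer is behind), in %."""
    ahead = (leader / trailer - 1.0) * 100.0
    behind = (1.0 - trailer / leader) * 100.0
    return ahead, behind


# Aggregate scores at 3GHz as quoted in the post: Broadwell first, RyZen second.
compute_ahead, compute_behind = relative_gap(259.5, 235.8)
games_ahead, games_behind = relative_gap(209.0, 173.3)

print(f"compute: Broadwell +{compute_ahead:.1f}%, RyZen -{compute_behind:.1f}%")
print(f"games:   Broadwell +{games_ahead:.1f}%, RyZen -{games_behind:.1f}%")
```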

On CB15 RyZen is about 8% behind Broadwell clock-for-clock, confirming this bench is favoring RyZen a bit. The reason is that RyZen has 512KB of L2. In fact you will see that SKL-X gets a huge improvement in CB15 compared to Broadwell, because Broadwell has 256KB of L2 whereas Skylake-X has 1MB of L2.

CPU-Z is a broken bench. It was confirmed it has a bug and it is not reflecting real-world performance.
 
Itanium was a failure for a lot of reasons and AMD isn't really one. You can blame Intel's choice of a closed VLIW uArch, instead of cleaning up X86, for AMD getting in bed with MS to push X86-64. But no, they thought using HP as the block bully in the OEM server market and their dominant position would put them in a good place. I'm not condemning the engies that created Itanium nor its design, but most of the problems were outside of the technical boundaries.

On the other hand, do we really need yet *another* massive ISA? You already have ARMv8 with full Linux support since like day 1. Why hasn't that caught on if the performance is so great? Why has no CPU foundry been selling full PCs with ARMv8 in them and a custom Linux version on them? You know why? Because it's just not enough to displace X86 IN PERFORMANCE; doesn't even matter which camp. Is there even a CPU with ARMv8 that actually goes above 15W anyway? I mean, if no one is really making one, there has to be a reason, right? Windows even tried with RT. So there's *been* an effort already. Plus, I do remember reading a very interesting post somewhere that explained *why* X86 doesn't really need fat trimming. Most of the current old legacy instructions are already being decoded into the new ones, although on this last part I might be remembering wrong.

Going back to Ryzen, I don't really care where this "IPC" nonsense goes from here on out. The benchies are clear enough: go buy Ryzen stuff in the price bracket you need.

Cheers!
 


Efficiency is noticeably inferior to Broadwell-E, about 15% behind Broadwell (octo-core vs octo-core).

[Image: efficiency graph]


I don't find anything impressive here. Piledriver was a speed-demon muarch. Speed demons always have worse efficiency. That is the reason why mobile chips aren't speed demons. And PD used a 32nm SOI process optimized for performance, not efficiency.

The huge efficiency gain that Zen has compared to Piledriver is a consequence of changing the microarchitecture from speed-demon to brainiac, migrating from 32nm to 14nm, using a process node optimized for efficiency (LPP = Low Power Plus) instead of one optimized to get up to 5GHz, and using an SoC approach: more integration = lower power.
 


Nice, thanks for sharing that. I like his test, but I did note that he is using DDR4 3000MHz memory, and I believe that makes the issue kind of fade away anyway.
https://www.pcper.com/reviews/Processors/Ryzen-5-Review-1600X-and-1500X-Take-Core-i5/CCX-Latency-Testing-Pinging-between-t

Here is Ryzen with just 2400MHz memory:
https://www.techpowerup.com/231268/amds-ryzen-cache-analyzed-improvements-improveable-ccx-compromises

If you are going to buy Ryzen, please get the fastest memory possible. If I buy one I'm getting 3200MHz memory and hoping for stability.
 
Hey guys, I'm buzzing off this DirectX 12 "Draw Call Processor" AMD and Microsoft have developed for the new Xbox Scorpio. It is supposed to free up several CPU and GPU cycles every time it's used. This sounds amazing. I wonder if we will be getting them built into our graphics cards in the future. I have posted as many details as I could find on it in the AMD Future Chips thread.
 


Sorry, but I don't understand this graph. It has no measurement unit, is it larger is better? Work per Watt?

Also, why do the 115* socket CPUs have so much more single-thread efficiency than the 2011 CPUs? It should be comparable, since they are the same arch, no?

I get your conclusion, but to me it is based on a very strange graph.
 


Juan, there are tests that show higher performance and lower power consumption than Broadwell, so your assertion that Zen is '15% behind' is far-fetched. You keep referencing that same French review. I'll bet there are a few edge cases where Ryzen performs *well below* where it normally does. As with all statistics, serious outliers (both good and bad) should be excluded. From what I've seen Zen is pretty close to Broadwell.

Also, how is designing a new 'brainiac' architecture from scratch, on a totally new process node (with a different type of transistor), resulting in a *huge* performance and efficiency gain in one leap, not impressive to you? No, it doesn't outright better Intel, but we are talking about a company with less than 1/10th the R&D budget and minuscule market share in comparison, and somehow they've managed to cobble together a pretty competent, very scalable CPU design. That is what is impressive. I really don't get why you are so down on what is repeatedly shown to be a competent design, and one that AMD have priced very keenly at that. To put it another way, if I were to build a new rig today, my money would go on an R5 1600 and a B350 motherboard, because Intel has nothing to offer that can compete with that setup in that price range. If you had asked me this question a few months back the answer would have been an Intel system because there was no alternative. As things stand *right now* I think the opposite is true. I cannot fathom why *anyone* would take a Kaby Lake i5 over something like the R5 1600: the platform is there, the performance is better, the price is cheaper on the AMD side. No doubt Intel will counter this with some much better options in the near future; still, right now, AMD have the mid range sewn up.
 
I can agree that the R5 series is really good compared to the i5 series.

In that case I can think of very few reasons to even recommend an i5 over a 1600. In many tests an R5 at 3.9GHz does actually match or beat a 4.7GHz Kaby Lake i5 in modern gaming, and of course it's better for everything else.

Most people only build gaming rigs with $250 CPUs or less to begin with. I'm not in that market, but most are, and it basically speaks to AMD fans, who currently only have a mid-range GPU that they can buy.

 


Basically...there literally is no reason to recommend an i5 over R5 anything. I could see an edge case argument for a 7700K under a very specific set of conditions, but even then...if you do much productivity at all, the R7 1700 is just outright better overall.
 


The company that is trying to sell the product (be it AMD, IBM, Intel, Apple, Nvidia, Cavium, Samsung, or any other) is the last one I would trust. That is why we have sites such as Tom's Hardware checking the claims made by the companies.

Reviews have already found that many claims made by AMD aren't true. Here are some excerpts regarding total throughput, IPC, gaming, and TDPs, respectively:

PCWORLD:
My own tests don’t quite match AMD’s results. First, my Core i7-6900K scores are slightly faster than AMD’s. AMD’s own tests, in fact, showed the midrange Ryzen 7 1700X matching Intel’s mighty 8-core.

PCPER:
I would err towards Cinebench being the more accurate representation of the global IPC difference, but Audacity is a true application and workload that many people use (even video encoding has audio rendering). Still, the 8% gap between the 6900K and the Ryzen 7 1800X at 3.5 GHz tells me that AMD’s claims of equal IPC appear to have been overstated.

GamersNexus:
At this point, you might be left feeling disillusioned when considering AMD’s tech demos. Keep in mind that most of the charts leaked and created by AMD revolved around Cinebench, which is not a gaming workload. When there were gaming workloads, AMD inflated their numbers by doing a few things:

In the Sniper Elite demo, AMD frequently looked at the skybox when reloading, and often kept more of the skybox in the frustum than on the side-by-side Intel processor. A skybox has no geometry, which is what loads a CPU with draw calls, and so it’ll inflate the framerate by nature of testing with chaotically conducted methodology. As for the Battlefield 1 benchmarks, AMD also conducted using chaotic methods wherein the AMD CPU would zoom / look at different intervals than the Intel CPU, making it effectively impossible to compare the two head-to-head.

And, most importantly, all of these demos were run at 4K resolution. That creates a GPU bottleneck, meaning we are no longer observing true CPU performance. The analog would be to benchmark all GPUs at 720p, then declare they are equal (by way of tester-created CPU bottlenecks). There’s an argument to be made that low-end performance doesn’t matter if you’re stuck on the GPU, but that’s a bad argument: You don’t buy a worse-performing product for more money, especially when GPU upgrades will eventually out those limitations as bottlenecks external to the CPU vanish.

Hardware.fr (translation mine):
In charge, the consumption of the 1800X is fairly light, halfway between a 7700K and a 6900K. In full charge under x264, consumption is much higher, AMD this time is halfway between the 5960X and the 6900K, which is still very good but higher than what suggested its TDP of 95w, which is clearly exceeded.

Tom's Hardware didn't check IPC, but other reviews did. Anandtech compared RyZen IPC with Kaby Lake:

As you would expect, AMD still lags in IPC to Intel, so a 4.0 GHz AMD chip can somewhat compete in single threaded tests when the Intel CPU is around 3.5-3.6 GHz, and the single thread web tests/Cinebench results show that.

4 / 3.5 = 1.143
4 / 3.6 = 1.111

Kaby Lake has about 13% higher IPC than RyZen, which is double the figure claimed by AMD. Moreover, Anandtech got this 13% using throughput-optimized benches such as CB, which favor RyZen. A more general comparison increases the gap, as shown by "The Stilt" in his in-house testing:
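
The reasoning behind those two ratios can be sketched in a few lines of Python. It assumes single-thread performance is simply IPC times clock, so if two chips tie at different clocks, their IPC ratio is the inverse of their clock ratio. The function name is only for illustration; the clocks are the ones from the Anandtech quote.

```python
def ipc_advantage_percent(lower_ipc_clock_ghz, higher_ipc_clock_ghz):
    """IPC advantage (in %) of the chip that ties while running at the lower clock.

    Assumes performance = IPC * clock, so equal performance implies
    IPC_high / IPC_low = clock_low_ipc_chip / clock_high_ipc_chip.
    """
    return (lower_ipc_clock_ghz / higher_ipc_clock_ghz - 1.0) * 100.0


# A 4.0 GHz RyZen roughly matching a 3.5-3.6 GHz Kaby Lake, per the quote.
upper = ipc_advantage_percent(4.0, 3.5)
lower = ipc_advantage_percent(4.0, 3.6)
midpoint = (upper + lower) / 2.0

print(f"Kaby Lake IPC advantage: {lower:.1f}% to {upper:.1f}% (midpoint {midpoint:.1f}%)")
```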

[Image: The Stilt's test results]


He found that the IPC of Kaby Lake is about 24% higher than RyZen's (181.67 / 146.91 = 1.237).
 


No. One of the problems of x86 is its inherent serial nature, which is the true reason why we hit an IPC wall. This serial limitation is not solved by "cleaning up x86" and that is why HP and Intel developed a different arch, which was precisely named EPIC, from Explicitly Parallel...

Anyone interested in experimental architectures can check the development of the Mill CPU. This is an evolution of the VLIW concept and current prototypes can perform up to 20 or 30 instructions per cycle.
 


It is measuring efficiency; therefore, the higher the bar the higher the efficiency. The units are performance per watt.

Efficiency depends on the muarch and the other factors that I mentioned above, such as the platform and the process node used. The 4790K is Devil's Canyon, or Haswell *refresh*, which got new TIM and improvements in the FIVR that helped it reach higher clocks than normal Haswell. Moreover, the 4790K is a simpler platform than the 5960X/5930K/5820K; in single-thread measurements, the Haswell-E chips are consuming more power from all the unused resources.



I keep referencing the HFR reviews because they are the most complete that I know of. They tested virtually everything, and instead of spending time looking around the internet for which review measured what, I can go to the HFR site and get the information that I need.

I don't know of any other review that has measured the efficiency of the CPU. If you know of one, let me know.

What AMD did is not impressive to me because (i) I expected something like that, therefore no surprise on my part, and (ii) plenty of other companies do more or less the same. IBM, Cavium, Apple, APM, Sun/Oracle,... have developed new muarchs on new process nodes.

Power grows linearly with IPC but nonlinearly with frequency. A brainiac design is always going to be much more efficient than a speed-demon. Take a PD core, double the resources, accept the lower f_max that comes from going wider, and automatically your efficiency increases by a huge amount. No mystery here, just basic physical laws that every engineer uses. The same laws explain why Intel engineers abandoned speed-demons a long time ago, and why Apple engineers go very wide and low-clock with their mobile CPU designs.
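
A toy model of that argument, as a hedged Python sketch. It assumes the textbook dynamic-power relation P ~ C * V^2 * f with voltage rising roughly in proportion to frequency (so power goes as f cubed), and it takes the post's claim that power and IPC both scale linearly with core width at face value. Real cores are nowhere near that clean, and the scale factors below are hypothetical, not Piledriver or Zen data.

```python
def relative_efficiency(width_scale, freq_scale):
    """Performance per watt relative to a baseline core, under toy assumptions.

    Assumptions (idealized, not measured):
      * performance ~ width * frequency  (IPC scales linearly with width)
      * power       ~ width * frequency**3  (P ~ C * V^2 * f, with V ~ f)
    """
    performance = width_scale * freq_scale
    power = width_scale * freq_scale ** 3
    return performance / power


# Hypothetical redesign: twice as wide, clocked at 80% of the baseline.
wide_and_slow = relative_efficiency(width_scale=2.0, freq_scale=0.8)
# Hypothetical speed-demon tweak: same width, clocked 20% higher.
narrow_and_fast = relative_efficiency(width_scale=1.0, freq_scale=1.2)

print(f"wide/slow efficiency:   {wide_and_slow:.2f}x baseline")
print(f"narrow/fast efficiency: {narrow_and_fast:.2f}x baseline")
```

Under these assumptions the width cancels out and efficiency goes as 1 / f^2, which is the whole brainiac vs speed-demon point in one line.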

Port the design from 32SOI to 14LPP and you get another huge gain in efficiency. It would be a nice theoretical exercise to port Piledriver to 14LPP only to see how much of the RyZen efficiency is due to the process node. My estimate is that a hypothetical 14nm PD chip could surely break the 15 mark in the above efficiency graph.

Finally, move from a raw CPU design to an SoC, and you get a few extra percent of efficiency from integrating mobo stuff on the SoC.

Yes, AMD has 1/10th of the budget, but they also have 1/10th of the product portfolio, not to mention that Intel spends most of its R&D on its foundries, whereas AMD left that to Globalfoundries/IBM/Samsung and TSMC.

I am far from impressed. All the hype and fanfare, including rehiring Jim Keller, for this? Small companies with a much smaller budget, much less experience in the design of CPUs, and with 1/10th the hype/marketing have made similar or even better products. Why are datacenter/HPC customers rejecting Naples? Why is AMD demoing Naples against deliberately crippled competitors?

I am not "down". I am being realistic. The IPC of RyZen is not 6% behind KBL, no matter how much AMD pretends it is. Gaming performance is not better than i5/i7, no matter how many 4K demos or promises about optimizations and future BIOS updates AMD makes.