AMD Ryzen Threadripper & X399 MegaThread! FAQ & Resources



That will be a *little* better performance-wise, but you still have a ton of other issues to consider. For example:

1: RAM/HDD starts to become a bottleneck, as you are much more likely to end up with a case where the data a particular thread needs isn't already loaded into main memory

2: Core communication bottlenecks start to expose themselves

3: The OS scheduler itself starts to become a bottleneck, especially if a lot of background tasks interrupt a running thread and force threads to start jumping between cores (in which case both #1 and #2 above will manifest themselves).

As a rule: The more cores you use, the worse your scaling becomes.
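
A rough way to quantify that rule is Amdahl's law. Here's a minimal sketch in Python; the 5% serial fraction is an illustrative assumption, not a measured number:

```python
# Amdahl's law: ideal speedup when a fixed fraction of the work is serial.
# The 5% serial fraction is illustrative only.
def amdahl_speedup(cores: int, serial_fraction: float) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for n in (2, 4, 8, 16, 32):
    print(f"{n:2d} cores -> {amdahl_speedup(n, 0.05):.1f}x speedup")
# 2 -> 1.9x, 4 -> 3.5x, 8 -> 5.9x, 16 -> 9.1x, 32 -> 12.5x:
# each doubling of cores buys less than the one before it.
```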
 


On the other hand, we have better efficiency, better OC headroom, better AVX support, and none of the annoying Creator/Gaming mode selection in the BIOS...
 


Handbrake uses more than 12 cores; otherwise the 1950X couldn't be faster than the 1920X despite its slightly lower clocks.

[Image: HandBrake benchmark results]


Adobe Premiere Pro CC also scales beyond 12 cores.

[Image: Adobe Premiere Pro CC benchmark results]
 


You mean, how a 10-core product has worse power consumption than a 16-core one under threaded workloads? Or how Intel supports AVX512 (which no one is really using) and shoots power consumption through the roof? Or having a proper "NUMA" set of options for real power users in a supposedly HEDT CPU?

Outrageous! Incredible how AMD can make those calls and get away with it!

Cheers!
 


Not even close...

On such workloads, the 10C SKL has better power consumption than the 16C TR: 150W vs 171W respectively, even though AMD is using tricks to mask the huge power consumption. One of the tricks consists of clocking cores under the base frequency when full loads push power consumption out of control. From the HFR review: "Moving under the base frequency is however something annoying, even if it is not the first time that we see this behavior at AMD." The other trick: reviewers noted that some watts go missing between the wall and the socket. They found a discrepancy, and their current hypothesis is that the CPUs are drawing the missing watts outside the ATX12V channel: "Which makes us wonder if these processors would not draw a portion of their power from the 24-pin ATX connector."

AVX512 has been in use for many years in the HPC arena and is now a kind of standard (together with Nvidia CUDA). It has been in use for months in the server arena (Google's servers, for instance, have been using AVX512 for months now), and it is now coming to desktop:

http://www.sisoftware.eu/2017/06/23/intel-core-i9-skl-x-review-and-benchmarks-cpu-avx512-is-here/
 


Right, the 140W TDP part using ~150W and the 180W TDP part using ~170W. Yes, of course. AMD is cheating by trying to stay inside their TDP, while Intel obviously knows you want the performance and beats AMD in single-threaded workloads, which are the bread and butter of the server world.

And yes, the AVX512 instructions that aren't used in 90% of the workloads (and price points) EPYC is aimed at are *just* starting to roll out onto the rest of the server workload spectrum. When MySQL (DB engines), WebLogic (app servers), Apache (web servers), PHP (script engines) or VirtualBox (VMs) actually get compiled to take advantage of it, let me know.

Cheers!
 


x264 (Handbrake) supports up to 128 threads, but I'm not sure you'd want to use that many anyway. My crude and unscientific testing indicates that very high thread counts (48 and above, which is where the 1950X sits) materially impact the output, so it's not entirely fair to compare the results on speed alone.
 


Wrong name on quote 😛

But remember, just because you can spawn 128 individual worker threads doesn't mean you'll scale anywhere close to that.
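
A quick way to see this for yourself (a minimal sketch using a synthetic CPU-bound task as a stand-in for real encoder work; timings will vary by machine):

```python
import multiprocessing as mp
import os
import time

def burn(_):
    # CPU-bound busy work standing in for one worker's slice of an encode.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

if __name__ == "__main__":
    jobs = 128  # 128 work items, like "128 worker threads" worth of work
    for workers in (1, 2, 4, 8, 16, 32, 64, 128):
        start = time.perf_counter()
        with mp.Pool(processes=workers) as pool:
            pool.map(burn, range(jobs))
        elapsed = time.perf_counter() - start
        print(f"{workers:3d} workers: {elapsed:.2f}s "
              f"(machine has {os.cpu_count()} logical cores)")
```

Past the physical core count, wall time flattens out (or gets worse from scheduling overhead) no matter how many workers you spawn.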
 


With Adobe it really depends on the settings you use; at times it can still be limited to a lower number of cores. For example, the picture you showed has the 1950X ahead in Adobe, but I can show cases where it loses.


 


The ~170W was measured on the ATX12V channel, and the reviewers claim there is a discrepancy in watts between the wall and the socket. That is why they introduce the hypothesis that the socket must be getting extra power from elsewhere, as quoted above: "Which makes us wonder if these processors would not draw a portion of their power from the 24-pin ATX connector." This means the ~170W they measured with their method does not represent the real power used by the chip. And it agrees with what CanardPC said in the past when they mentioned that the 180W TR in reality draws >200W.

AVX512 will not apply to everything, but the claim that "no one is really using" AVX512 was wrong.
 


I'm pretty sure AMD wired some black magic inside TR to increase the power consumption; a black hole, even. Maybe it's the work of the CPU fairy. I wonder why none of the other reviews I've read so far agree with you, and why you cite only the one that mentions it. Although, to be fair, AMD does have a track record of going "off spec" on some designs (the RX 480 comes to mind, and the old Athlon Thunderbirds).

And in the context we're talking about (the TR server segment publicized by AMD in official presentations and slides), AVX512 is nothing special nor relevant. I'll read a bit more on what using it actually entails in terms of calculations, but my intuition is that it won't really do anything for the average load 90% of the server farms out there actually run. I'll post back later.

Cheers!
 
https://www.techspot.com/review/1465-amd-ryzen-threadripper-1950x-1920x/page7.html

No point in continuing or caring more.

https://www.pcper.com/image/view/84887?return=node%2F68269

http://hexus.net/tech/reviews/cpu/108628-amd-ryzen-threadripper-1950x-1920x/?page=12

From all of those, one can tell that max-load power consumption is not an issue; if anything, it's using far too much power at idle, and that probably matters more.

Not to mention temps are lower on the platform anyway, so this comes down to the TDP rating alone, which has very little to do with max power consumption in the first place; it describes total heat output.


 
Obviously the Intel damage control crew is on patrol. LOL



Well, good for those sites; I'd appreciate it if you could provide a link or three. Can I expect that sometime in the near future, then? The information I've seen would seem to indicate that the 7900X fares poorly in the performance/watt department, as well as the performance/dollar department, when compared to the competition.

BTW, what you're hemming and hawing about is the transition Intel has had to make from LCC to HCC wafers for the 12+ core chips they never intended to create...until they realized AMD was about to kick them in the ballz. This is why those chips are still vaporware and also why Intel keeps rushing things. Obviously none of this affects AMD, as they're using the same CCX dies across all of their product lines. So yeah, cost *IS* linear...for one company, as I believe I've already demonstrated. That Intel cannot win a HEDT price war with AMD would seem to be self-evident at this point.

Cost per core is irrelevant because...reasons. May I have your permission to use that when I go to buy a HEDT processor? "juanrga told me cost per core was irrelevant, so can haz 18 core processor for dual core price? puhlease?" LMAO

Thanks for the laugh dude.
 


The PCPer review is rather complete, measuring not only the impact of RAM speed on performance but also inter-die latencies. As expected, the MCM approach hurts latencies:

[Image: 1950X core-to-core ping-time latencies at DDR4-2400]


 


Fixed :)

You won't scale that well at all, but there may still be some speed benefits. Even my i7 920 encodes in a little less time with 128 threads than with the default of 12, despite the scheduling nightmare that creates. However, the bitrate is noticeably lower, so I wouldn't do it ordinarily.
 

Well, price per core is irrelevant; what matters in this market is multithreaded performance, as no one buys a 7900X or 1950X to run games or other tasks that only use 4 cores or fewer.

I mean, a 64-core A53 CPU at 2GHz would still suck compared to a 7900X/1950X, meaning that, yes, having the most cores doesn't matter by itself.

 
Clearly it's not irrelevant, as the mainstream HEDT processors are both priced per core...unless you guys think it's just a coincidence that higher core-count processors cost more as you go up a company's product stack. Name-dropping an ARM processor as a strawman while ignoring context doesn't change a thing.

 
About threads and scaling: notice how the 7900X scales linearly up to 8 threads, not 10. Mesh effect, maybe? On the other hand, the 1950X scales linearly up to 16 threads.
[Image: Cinebench R15 1T-32T scaling, Ryzen Threadripper 1950X vs Core i9-7900X]


Frankly, no one should buy a $1,000, 16-core CPU just to play conventional games or run lightly threaded applications. It's the wrong tool for the job.
 


It is irrelevant because not all cores are identical (different performance) and because the relation between performance and cost is not linear.
 
At this point I think you're just being intentionally obtuse. More cores = a higher price in both companies' HEDT product stacks. That's a simple fact, and repeatedly saying the word "irrelevant" doesn't make it any less a fact. That performance doesn't scale perfectly with increased core counts doesn't change a thing either, particularly as scaling is application-specific. The only thing "non-linear" about the pricing per core is on the Intel side...where they price gouge even moar than usual (per core) for their 16- and 18-core parts. It's pretty simple maths, but we can go over it if you're struggling.

The bottom line is that spending more as you head up the Threadripper stack gives you more bang for the buck, whereas with Skylake-X, while you do get more cores, you also get to pay a higher "Intel" tax. That Intel has lower margins on their higher-end parts doesn't mean jack to me; I'm only interested in getting the most for my buck.

An Intel 7900X, 7920X, or 7940X core carries an average premium of $37.51/core (+60%) over an AMD 1950X core.
The 7960X carries a premium of $43.76/core (+70%).
The 7980XE carries a premium of $48.66/core (+78%).

While Intel cores do offer higher IPC, the advantage is nowhere near large enough to justify these premiums...imo. Part of Intel's advantage over Zen on the mainstream platform has come from higher clocks; on the HEDT platform, however, that advantage has diminished significantly, as there's simply not as great a disparity between AMD 19x0 clock rates and Intel 79x0 clock rates.
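
For what it's worth, here's the arithmetic behind those premiums (a quick sketch, assuming launch MSRPs; street prices may differ):

```python
# Price-per-core premiums vs. the 1950X, using launch MSRPs (assumed).
msrp_cores = {
    "1950X":  (999, 16),
    "7900X":  (999, 10),
    "7920X":  (1199, 12),
    "7940X":  (1399, 14),
    "7960X":  (1699, 16),
    "7980XE": (1999, 18),
}
amd = msrp_cores["1950X"][0] / msrp_cores["1950X"][1]  # ~$62.44/core
for name, (price, cores) in msrp_cores.items():
    per_core = price / cores
    print(f"{name:8s} ${per_core:6.2f}/core "
          f"(+{per_core / amd - 1:5.1%} vs 1950X)")
# 7900X/7920X/7940X land at ~+60%, 7960X at ~+70%, 7980XE at ~+78%.
```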
 


No one is disputing that trivial fact. What is being stated is that not all cores are the same. It doesn't cost the same to design and fabricate a Zen core as an SKL core. Getting 10% higher IPC on top of an existing design doesn't cost 10% more. Getting an extra 500MHz on top of a 4GHz core doesn't cost 12% more. A core at 4.5GHz with 10% extra IPC will have ~24% higher performance, but it could be 70% more costly to fabricate, because the functional relationships between cost and performance aren't linear. For instance, the relationship between the complexity T of a core and its IPC is given by

T = b (IPC)^2

where b is a parameter. You can see that doubling the IPC quadruples the complexity of the core, and with it the cost. Therefore the faster core will look worse from an (IPC/price) ratio perspective.
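
A trivial numeric check of that relation (b is an arbitrary scale factor here):

```python
# T = b * IPC^2: complexity grows with the square of IPC.
def complexity(ipc: float, b: float = 1.0) -> float:
    return b * ipc ** 2

print(complexity(2.0) / complexity(1.0))  # 4.0: 2x IPC -> 4x complexity
print(complexity(1.1) / complexity(1.0))  # 1.21: 10% IPC -> 21% complexity
```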

Also, when many people talk about IPC they exclusively mean "IPC in serial x86 workloads" and avoid IPC in AVX workloads. Consider the 512-bit units in SKL cores. Those are 4x bigger than the units on RyZen (you need 4x more transistors), and those 4x bigger units require 4x wider datapaths and caches with 4x higher bandwidth, which again means 4x more transistors. All those extra transistors increase the costs of design, validation, and fabrication of the core, but you don't see any of that extra performance in action if all you run are legacy workloads like CineBench and C-Ray that don't use 512-bit AVX instructions. Again, the faster core looks worse from a (performance/price) ratio perspective.

Also, AMD is using the same Zeppelin die for all the chips, from the lowest RyZen model to the top ThreadRipper model. This is not true for the X-series, where KBL-X uses one die, the SKL-X models up to 10C use another die, and the SKL-X models up to 18C use yet another die. Designing and fabricating an 18C die is different from designing and fabricating a 10C die. For instance, fabrication yields aren't linear: the bigger die has higher costs because it has more transistors, but on top of that there are extra costs because larger dies have worse yields. In the end, an 18C die may only provide 80% more performance than a 10C die, yet cost twice as much. Therefore the faster CPU will look worse from a (price/core) ratio perspective.

That is why using performance/cost or core/cost metrics to pretend that Intel is "gouging" whereas AMD is some kind of charity is invalid.
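
To make the yield point concrete, here's a minimal sketch using the classic Poisson defect-yield model; the die areas and defect density are illustrative assumptions, not Intel's actual numbers:

```python
import math

# Poisson yield model: yield = exp(-defect_density * die_area).
def poisson_yield(area_mm2: float, defects_per_mm2: float = 0.002) -> float:
    return math.exp(-area_mm2 * defects_per_mm2)

small, large = 322.0, 484.0  # illustrative 10C-class vs 18C-class die areas
y_small, y_large = poisson_yield(small), poisson_yield(large)
print(f"small die yield: {y_small:.1%}, large die yield: {y_large:.1%}")

# Cost per *good* die scales with area / yield, so the bigger die costs
# disproportionately more than its extra area alone would suggest.
cost_ratio = (large / y_large) / (small / y_small)
print(f"relative cost per good die: {cost_ratio:.2f}x")  # ~2x here
```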
 


It does matter: comparing who gives the most cores is useless unless those cores are 100% the same. Ryzen is still a good 15-20% behind Skylake/Kaby Lake in IPC, and a lot of programs don't always use 16 cores / 32 threads, even programs like Adobe's.

Not saying Ryzen isn't a better deal; it for sure is, and I think it basically makes X299 irrelevant except in a small number of cases.

 


The OP says

Memory Support:

All Threadripper CPUs will run a quad-channel memory configuration with a max of 1TB of supported RAM (you can thank EPYC for that). Official memory frequency maxes out at 2667MHz, though this is just the official spec; if Ryzen 7 is any indicator, Threadripper should be able to hit 3200MHz and above quite easily.

ThreadRipper is not RyZen. Overclocking RAM does very little.

[Image: 1950X benchmark results at DDR4-2400 vs DDR4-3200, normalized]


Ignoring the synthetic memory bandwidth measurement, because obviously that is going to be very sensitive to RAM speed, and averaging the remaining 22 benchmarks, the normalized scores are:

2400 RAM: 1.01
3200 RAM: 1.04

Therefore the faster RAM provided only ~3% higher performance on average, which is nothing. The reason? ThreadRipper has a power-throttling mechanism, and overclocking the RAM automatically reduces the turbo frequencies on the CPU to stay within the maximum power supported by the socket.
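
In code form, the back-of-the-envelope math above (using only the two averages quoted from the chart):

```python
# Average normalized scores across the 22 non-synthetic benchmarks,
# as quoted above (relative to the chart's baseline).
avg_2400, avg_3200 = 1.01, 1.04
gain = avg_3200 / avg_2400 - 1
print(f"DDR4-3200 over DDR4-2400: ~{gain:.1%} average uplift")  # ~3.0%
```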
 