AMD Ryzen Threadripper & X399 MegaThread! FAQ & Resources



I'm not talking about core loading, but about thread latency.

Yes, the "many threads at 100%" is a problem, as it has always been a problem, but outside of benchmarks or really poorly coded programs should never happen.

The problem I'm very concerned about is the following: take a classic i5 (four physical cores). Right now, i5s are more than capable of maxing out pretty much every game under the sun. But that's mainly because there are only two to three really time-critical threads the CPU needs to run at any one point in time. Other threads, even within games, aren't that time-sensitive and won't really affect performance much.

Now take your previously single-threaded GPU driver and make it fully multithreaded (let's assume six small threads here, each doing about the same amount of work). Remember: even if the total workload is the same as the classical GPU driver's, you can still only run four threads at any one point in time on an i5-class CPU. So you've now created a situation where not only are parts of the GPU driver going to stall out due to lack of CPU cores, there's also a very real chance the GPU driver will actually interrupt the program you are running, because the driver layer has a higher scheduling priority than the application you are trying to run. While this effect won't be seen much in terms of FPS (remember, 16ms is an eternity for computers), it WILL show up in latency.
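To make that concrete, here's a minimal sketch (hypothetical code, not from any actual driver) that oversubscribes the CPU with spinning threads and reports the worst scheduling gap any of them sees. On a four-core part, push kWorkers above four and the worst gap jumps from microseconds to whole scheduler timeslices, which is exactly the latency I'm talking about:

// Oversubscription demo: more busy threads than cores forces preemption.
// Each worker timestamps every loop pass; a large gap between consecutive
// passes means the thread was scheduled out. Build: g++ -O2 -pthread demo.cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    const unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    const unsigned kWorkers = cores + 2;   // the "six driver threads on an i5" case
    std::atomic<bool> stop{false};
    std::atomic<long long> worst_ns{0};
    std::vector<std::thread> pool;

    for (unsigned i = 0; i < kWorkers; ++i) {
        pool.emplace_back([&] {
            auto last = std::chrono::steady_clock::now();
            while (!stop.load(std::memory_order_relaxed)) {
                auto now = std::chrono::steady_clock::now();
                long long gap = std::chrono::duration_cast<std::chrono::nanoseconds>(now - last).count();
                long long prev = worst_ns.load(std::memory_order_relaxed);
                while (gap > prev && !worst_ns.compare_exchange_weak(prev, gap)) {}
                last = now;
            }
        });
    }
    std::this_thread::sleep_for(std::chrono::seconds(2));
    stop = true;
    for (auto& t : pool) t.join();
    std::cout << "worst gap seen: " << worst_ns.load() / 1.0e6 << " ms\n";
}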

So, how do you fix this? Obviously, purchase a CPU with more processor cores, at a cost of several hundred dollars more, just to achieve the same performance as the classical single-threaded GPU driver [because multithreading for the sake of multithreading adds zero performance].

So yeah, I have some concerns.
 


OK, I don't fully agree that it's a worry just yet, but I at least understand where you're coming from. I don't think software will be able to keep up with parallelism as the CPUs get wider and wider, but I do agree the bottleneck will be around thread-to-thread latency.

Cheers!
 

jdwii

Splendid
Gaming-wise, in a decent number of titles even an 8700K at 5GHz will still bottleneck a 1080 Ti, Cities: Skylines for example. If such a title were made to use more cores, I doubt an 8700K wouldn't be able to keep a 1080 Ti at higher GPU usage.

With Intel not even improving IPC, we have to go wide even if latency becomes an issue. Software devs have only had a few years with these newer APIs; it's gonna take a good 3-5 years at least.

Also, I think CCX latency is the reason DirectX 12 titles perform badly on Ryzen.
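For anyone who wants to measure that CCX hop themselves, here's a minimal ping-pong sketch (Linux/C++; hypothetical, and the CPU numbers are assumptions that depend on your topology). Pin the two threads to cores on the same CCX, then to cores on different CCXs, and compare the round-trip times:

// Core-to-core latency: two threads bounce a flag back and forth.
// Build: g++ -O2 -pthread pingpong.cpp && ./a.out 0 4
#include <atomic>
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <pthread.h>
#include <thread>

// Pin the calling thread to one logical CPU (Linux-specific).
static void pin_self(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main(int argc, char** argv) {
    const int cpu_a = argc > 1 ? std::atoi(argv[1]) : 0;
    const int cpu_b = argc > 2 ? std::atoi(argv[2]) : 4;  // assumed to sit on the other CCX
    constexpr int kIters = 1000000;
    std::atomic<int> flag{0};

    auto t0 = std::chrono::steady_clock::now();
    std::thread a([&] {
        pin_self(cpu_a);
        for (int i = 0; i < kIters; ++i) {
            while (flag.load(std::memory_order_acquire) != 0) {}
            flag.store(1, std::memory_order_release);
        }
    });
    std::thread b([&] {
        pin_self(cpu_b);
        for (int i = 0; i < kIters; ++i) {
            while (flag.load(std::memory_order_acquire) != 1) {}
            flag.store(0, std::memory_order_release);
        }
    });
    a.join();
    b.join();
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  std::chrono::steady_clock::now() - t0).count();
    std::cout << "avg round trip: " << ns / kIters << " ns\n";
}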
 

8350rocks

Distinguished


The DX12 deficiency is really only a symptom of two things:

1.) DX11 is not yet fully abandoned (NV will try to prevent full-bore migration to DX12 for as long as possible on this front...)

2.) Poor optimization for DX12.

Titles that are extremely well optimized tend to run on Vulkan, and that is primarily because those developers are taking the time to actually do manual tuning. Lots of other devs are simply putting together a coherent render path on the back end that doesn't bug out all the time, and doing little beyond that.
 

8350rocks

Distinguished


10% is not a huge price reduction, and it is mostly enabled by the smashing success of EPYC. Their margins went up again this past quarter.
 


I would like to point out that NVIDIA outperforms AMD in DX12. As I've noted MANY times now, the reason NVIDIA doesn't gain performance like AMD does is that, unlike AMD, they already have an efficient DX11 driver path.
 

juanrga

Distinguished


EPYC is anything but a "smashing success", but this is unrelated to ThreadRipper.

This 12% price discount on the 1950X is a consequence of the chip not selling well. The 1950X had some momentum when Intel's top chip was a 10-core and the 16-core could win in heavily multithreaded applications. Once Intel released the SKL-X models above 10 cores, ThreadRipper lost competitiveness, and AMD had to adjust the price.
 

8350rocks

Distinguished


Considering the two cross segments at the workstation level, it is highly related.

Microsoft bought them, and so did Baidu:

https://www.geekwire.com/2017/microsoft-azure-baidu-embrace-amds-new-epyc-data-center-processor/

Amazon and Alibaba are also in:

https://www.fool.com/investing/2017/09/28/why-amds-latest-win-should-give-nvidia-sleepless-n.aspx

That article discusses Amazon's FirePro purchases as well, but Amazon is adding racks with EPYC processors, too.

I suppose those are not "smashing" successes?

 

juanrga

Distinguished
From the first link:

Microsoft Azure and Baidu promised to deploy AMD’s new Epyc data center chip for their cloud customers to select as an option,

Neither company is acquiring EPYC for itself; they are only offering it as an option to customers. Also, the devil is in the details: the number of EPYC chips acquired is very small, almost imperceptible. We also know that most of the server chips Microsoft is acquiring aren't even x86...

There is a reason why Lisa Su didn't give any detailed numbers about EPYC 'sales' in the Q3 earnings call, or why she didn't answer key questions such as "On the EPYC server side, can you help us understand to what extent you're shipping to customers who are going through testing right now versus shipping into customers who are actually deploying EPYC in live datacenter applications". Lisa Su couldn't even confirm whether EPYC could get 2% market share by the end of the year.

But this is all unrelated to the fact that worldwide sales of the 1950X aren't what was expected and the chip received a huge price discount a week ago.
 

xravenxdota

Reputable
Tbh I read a lot of reviews on the i9 vs Threadripper. If I could have afforded one, I would take the AMD over the i9 on price-to-performance. The TR is 16k here whereas the i9 is 23k, and I can assure you the i9 is not worth the extra 7k; that's way overpriced. But I personally feel a big CPU like that is overkill except if you run rendering/servers lol.
 

8350rocks

Distinguished


That or a lot of VMs.
 

genz

Distinguished


True, but then ThreadRipper is a product that was actually going after a different demographic. HEDT was not typically a small-business VM product before TR upped the core count. AMD did the equivalent of re-democratizing the industry by silently cutting the standard price of a product (a high-core-count VM server) 10x. This is why the buzzword in press releases atm is hyper-centralisation: putting tons of VMs on the new massively multicore options where that makes sense.



Xeon E7-4890 v2: 15 cores / 30 threads, Ivy Bridge-EX arch, ECC + NUMA, 32 lanes, 2.8GHz base / 3GHz turbo, $3,590. Needs a server board, so add $300.
This was Intel's flagship two years after Zen was proposed, when Zeppelin was expected. This is more or less when the expected performance of a single Zen core would have been known to Intel, due to patent cross-licensing: basically any meaningful advantage is going to be shared, to guarantee nobody is hiding patents that should be part of the agreement. This is not to say that Intel knew the chip, just roughly how fast it should be, based on the tech AMD had gained and how it matched up against Intel's clearly superior tech (at the time).

AMD TR 1800X, 16/32: IPC that keeps up with Intel's 2016 units, 200MHz more clock and unlocked, ECC + NUMA + HSA, 64 lanes, 4GHz turbo. $400 + $100 mobo.

That's 1/6th the price in two years. AMD ARE progress, because if I don't buy TR I know that my programs' increasing demands will never be met by Intel alone and their 'pricing strategies'. We've had quad core as standard since 2006, for essentially no reason!

Anyone who wishes to talk about heat or clock speed needs reminding that there are 16-core smartphones in the wild right now with 'cooling' setups the size of a dime. For $500.

It doesn't matter what you make if it's too expensive to sell. The i9 is a decent chip, though rather cavalier in its 'homemade-ness'. Overheating and splintered product ranges causing oh-so-buggy BIOS images (most people writing EFI for KBL-X/SKL-X boards were writing logic for things that had never been done or tested before... EFIs don't usually have to mode-switch for PCIe lane width or bus logic depending on the CPU seated, etc.) make it more of an underdog here, and that has bought it fans. Reminds me of early AMD making chips for Intel sockets in its adventurousness. It should, for all intents, be a lower-priced product to reflect its obviously lower R&D time.
 

juanrga

Distinguished


This price comparison is invalid for a number of reasons:


    ■ One cannot compare pricing across different time spans and different nodes. 15 cores on the ancient 22nm node are more expensive than 15 cores on the latest 14nm node, because 22nm requires a giant die due to the lower integration the older node provides.

    Also, one cannot compare a modern 14nm AMD chip with an ancient 22nm Intel chip and claim "AMD ARE progress", when modern 14nm Intel chips have improved even more relative to that same 22nm Intel chip.

    ■ One cannot compare an Ivy Bridge Xeon E7, which is a top server chip designed for 4S platforms (with support for mission-critical loads), with a 1S desktop chip like ThreadRipper. Even comparing an E7 Xeon with an EPYC chip is meaningless; the correct comparison would be with Xeon E5 or E3 models, depending on the EPYC chip tested. The correct comparison for ThreadRipper is Intel's HEDT line: Skylake-X.
    ■ The IPC of Zen is sub-Sandy Bridge. It is not at 2016 levels.
    ■ AMD is using a cheap MCM approach to reduce costs. This MCM approach has latency and power consumption disadvantages.
    ■ The 1800X is not ThreadRipper.
    ■ There is no HSA support in any ThreadRipper or Ryzen CPU.
    ■ NUMA is not an advantage. On the contrary, the NUMA approach on ThreadRipper requires delicate load balancing to obtain maximum throughput. That is why ThreadRipper has different working modes for different workloads, which is a pain (see the pinning sketch after this list).
    ■ X399 mobos are cheaper because they are inferior.
    ■ PCIe lane support is inferior on X399.
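On the NUMA point: whichever side of the argument you take, pinned placement is how you get the throughput back. A minimal libnuma sketch (Linux; hypothetical program, link with -lnuma) showing the kind of placement a NUMA-aware workload on ThreadRipper wants:

// Keep this thread and its memory on NUMA node 0 so the working set
// never pays the die-to-die hop. Build: g++ -O2 pin.cpp -lnuma
#include <numa.h>
#include <cstdio>

int main() {
    if (numa_available() < 0) {
        std::puts("NUMA not available on this system");
        return 1;
    }
    numa_run_on_node(0);     // schedule this thread on node 0's cores only
    numa_set_preferred(0);   // prefer node 0 for this thread's allocations
    // ... run the latency-sensitive work here ...
    return 0;
}

The same placement is available without code via numactl, e.g. numactl --cpunodebind=0 --membind=0 ./app.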


There are other objections, like the claim that Intel can know the performance of an AMD core thanks to patent cross-licensing. This is wrong twice over. First, because the cross-licensed patents are about architectures, not about sharing the details of microarchitectures. Second, because one cannot obtain performance from patents alone. This is obvious: the same patented microarchitecture will perform differently implemented on 14HP than on 28SLP.

FINAL NOTE on the difference between E3, E5, and E7 lines:

The E3 line is targeted for single-socket platforms for small business and dense computing needs. The E5 line offers products for the volume entry-level, efficient computing, high-end density servers, as well as workstations, whereas the E7 line delivers servers to the high-end, expandable, and mission-critical segments.

https://pdfs.semanticscholar.org/f337/600bab05ed33c86e3c2ce194757a2484074b.pdf
 
Nitpick: "Second, because one cannot obtain performance from patents alone".

Yes, you can. See why Qualcomm is the undisputed king of LTE and is in pretty much every single smartphone out there with its modems. IBM as well. Oracle, with their DBs, have a lot of interesting patented algorithms that are so good that any other way of doing things is just slower or less efficient. And one final example: Rambus.

Cheers!
 

goldstone77

Distinguished
DDR4 Memory Scaling & DDR4-3600 Testing With AMD Threadripper On Linux
Written by Michael Larabel in Memory on 24 November 2017.

https://www.phoronix.com/scan.php?page=article&item=threadripper-linux-ddr4&num=1
The memory configurations tested for this article basically came down to:

2 x 8GB DDR4-3200MHz
4 x 8GB DDR4-3200MHz
4 x 4GB DDR4-2133MHz
4 x 4GB DDR4-2800MHz
4 x 4GB DDR4-3066MHz
4 x 4GB DDR4-3200MHz
4 x 4GB DDR4-3600MHz

This was done for basically showing the impact of dual vs. quad channel memory on Threadripper and then also the impact of the memory frequency in different Linux/open-source workloads. The memory timings were not tweaked between the different frequency levels.

This software for modeling and simulation of porous media processes doesn't scale too well past 16 threads at the moment, but it will be interesting to see how much further they tune it as we move into 2018.

If you are thinking of assembling an AMD Threadripper this holiday season, these results again reiterate that going for quad channel memory is definitely worth it even if it means getting a slower kit than what you would if going for dual channel. Between say DDR4-3200 and DDR4-3600 speeds there wasn't too much of a difference in many of these real-world Linux benchmarks, compared to the price premium of DDR4-3600, so hopefully this information will be of help when shopping.

The article shows several performance benchmarks with various RAM configurations, showing that quad-channel RAM even at 2133MHz provides greater performance than dual-channel 3200MHz RAM, and that the performance gain of 3600MHz RAM over 3200MHz isn't worth the additional cost.
 

juanrga

Distinguished


For bandwidth-bound workloads, yes, because 4x2133 offers 33% more bandwidth than 2x3200. For latency-bound workloads, the higher-speed dual-channel configuration will provide more performance.
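Back-of-the-envelope, assuming the usual 64-bit (8-byte) DDR4 channels:

4 channels × 2133 MT/s × 8 bytes = 68.3 GB/s
2 channels × 3200 MT/s × 8 bytes = 51.2 GB/s

68.3 / 51.2 ≈ 1.33, which is where the 33% figure comes from.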


The memory controller seems to bottleneck around 3200MHz.
 

goldstone77

Distinguished


I agree there will be instances where dual channel at 3200MHz will overcome quad channel at 2133MHz.

Edit: I wish I had some tests to show the comparison.
 

goldstone77

Distinguished


Cfir Cohen, a security researcher from Google's cloud security team, on Wednesday disclosed a vulnerability in the fTPM of AMD's Platform Security Processor (PSP), which resides on its 64-bit x86 processors and provides administrative functions similar to the Management Engine in Intel chipsets.

This sounds bad. It's not as bad as you think.

an attacker would first have to gain access to the motherboard and then modify SPI-Flash before the issue could be exploited
http://www.tomshardware.com/forum/id-3609004/cpu-security-vulnerabilities-information/page-3.html#20570826
 

liberty610

Distinguished
Forgive me, as I have just skimmed over most of the replies in this thread; I am not as tech savvy as a lot of you on here as far as knowing a lot about clock cycles, CPU architecture, etc. I am, however, a 'power' type user who built a Threadripper machine when the processor launched. Throwing all the nanosecond tech jargon aside, I thought I might bring my personal experience to the topic, where some actual workstation-type work is being done on many levels. All I can do is compare my experience with Threadripper to my last Intel chip. So, being a power user who isn't as architecture savvy (basically a solid consumer who likes a faster PC, like most people), I have used Threadripper since it launched, with more than happy results.

I run a small project studio out of my home. I handle projects of all sizes doing various tasks: from digitizing old VHS and cassette tapes, to recording bands, to full HD (and now some 4K) video editing with Vegas Pro 14, and I even do CD/DVD/Blu-ray design and burning. I game on my machine as well, with some video capturing/streaming. I don't stream often, but I tend to record gameplay and use it in my editing software to practice my editing skills when I am lacking projects.

This is my build:
https://pcpartpicker.com/b/BqHhP6

And yes, I am aware that using the standard power supply cables takes away from how 'pretty' the inside looks, but I was more worried about function over fashion when it was time to put it together. The LED pretties were enough for me, hehe.

Anywho, I went from the Intel 6800K Broadwell-E chip to the Threadripper 1950X. Now, without knowing the super-high-end tech jargon of how a processor and its clock cycles work, and just looking at render times and the speed of the tasks I throw at it, I have seen a dramatic positive change in my PC's power.

I have a Sony NX100 video camera that shoots in full HD (50 Mb/s, 60 fps) that a lot of my footage comes from. I cut almost a full 45 minutes off my rendering times on projects I tested between my last 6800K and the Threadripper. I have also seen a dramatic improvement in Handbrake compression times with various video files. I use the Reaper audio workstation for recording bands, where I use VSTi drum software, VST guitar modeling, and countless other plugins for EQ, compression, effects, etc., and some of my projects have well over 30 audio tracks going simultaneously. I could not be happier with my results using the Threadripper 1950X.

There have also been times where I will limit the number of cores programs can use for multitasking. I have successfully rendered full HD video in the background while running Handbrake on another video, all while playing an online session of GTA 5. So instead of sitting here waiting for video to render/process (which is like watching paint dry), I can have fun and do other tasks while these things are going on.
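For anyone wanting to do that limiting programmatically rather than through Task Manager's "Set affinity", here is a minimal Windows sketch (hypothetical example; the mask value is an assumption for a 16-core/32-thread part):

// Restrict the current process to logical CPUs 0-7 (bit n of the mask = CPU n),
// leaving the rest of the chip free for a game or another render job.
// Task Manager's "Set affinity" does the same thing interactively.
#include <windows.h>
#include <cstdio>

int main() {
    DWORD_PTR mask = 0xFF;  // CPUs 0-7 of the 32 logical processors on a 1950X
    if (!SetProcessAffinityMask(GetCurrentProcess(), mask)) {
        std::printf("SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    std::printf("process limited to CPUs 0-7\n");
    return 0;
}

The built-in start /affinity FF app.exe does the same from a command prompt when launching a program.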

From the perspective of a user who isn't super tech savvy about the architecture side of a CPU, but who wants a more powerful machine for heavy multitasking without paying two to three grand for an Intel Xeon chip (which was my only option with my last motherboard), I am thrilled with my results using Threadripper.
 
Thanks for sharing, liberty610.

It's interesting that you mention streaming and saving the video at the same time. Now that I think about it, that is also quite common to do. Might be something Tom's could add to their streaming article in the future.

Cheers!
 
It should be very interesting to see the performance numbers after the new testing with the Spectre & Meltdown patches is done. Depending on how much harder Intel gets hit in performance, it's very possible that AMD could come out much closer in overall performance.

Unless AMD takes a performance hit, a significant performance hit, this should turn out to be nothing but good for them.
 

juanrga

Distinguished


Meltdown and Spectre fix testing on Linux. These are server benchmarks, but you get the general pattern:

There are workloads where AMD takes a higher performance hit:

[Phoronix benchmark graph]

There are workloads where Intel takes a higher performance hit:

[Phoronix benchmark graph]

And there are workloads where both Intel and AMD are unaffected by the patches:

[Phoronix benchmark graph]
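(Side note, my addition rather than anything from the Phoronix piece: on Linux kernels 4.15 and later you can check which mitigations a box is actually running with grep . /sys/devices/system/cpu/vulnerabilities/*, which makes it easier to tell whether a given benchmark was taken with the patches active.)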
 

That is a very convoluted way of saying "I have no idea what will happen".
 