Question: Just installed 3990X - getting same benchmark times as 3970X - is Windows 10 holding me back?

jayleonis

Honorable
Nov 18, 2018
I just swapped out my 3970X (32 cores / 64 threads) for the 3990X (64 cores / 128 threads). With the number-crunching program I use, I am seeing basically the same run time for the exact same task. In fact, when I set the program to use 128 threads, Task Manager shows 70% CPU utilization. I have read that non-Enterprise editions of Windows can limit the number of threads in use to 64. How do I check this? And is there any way to solve it without reinstalling the OS?


Windows 10 Pro Version 21H1
Motherboard Asus Prime TRX40 Pro S
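
On the "how do I check this" part: Windows splits machines with more than 64 logical processors into processor groups of at most 64 each, and a program that is not written to be group-aware may stay inside one group. Below is a minimal sketch, not a definitive diagnostic, that asks Windows how many groups and logical processors it has active. The function name `logical_processor_summary` is made up for illustration; the two `kernel32` calls are the standard Win32 ones. On a 3990X with SMT on you would normally expect 2 groups and 128 logical processors.

```python
import os
import sys


def logical_processor_summary():
    """Return (processor_groups, logical_processors) as reported by the OS."""
    if sys.platform == "win32":
        import ctypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        # WORD GetActiveProcessorGroupCount(void)
        kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
        # DWORD GetActiveProcessorCount(WORD GroupNumber)
        kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
        kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32

        ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant: count across every group
        groups = kernel32.GetActiveProcessorGroupCount()
        total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)
        return groups, total

    # Other platforms have no processor groups; report everything as one group.
    return 1, os.cpu_count() or 1


groups, total = logical_processor_summary()
print(f"processor groups: {groups}, logical processors: {total}")
```

If this reports all 128 logical processors, the OS sees the whole CPU, and the remaining question is whether the application itself spreads its threads across both groups.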
 
Disable hyperthreading (SMT on AMD) and test. If your application can fully saturate the CPU, it can be more efficient with hyperthreading disabled. When I worked in high performance computing, we routinely disabled hyperthreading.
 
But it would still only be running on 64 cores, right? That is what I was getting from the cheaper 3970X. I’d rather just use the cheaper one if it’s the same. But perhaps upgrading to Windows Enterprise will allow the full 128.

HWMonitor is showing 64 cores being used, just as it showed before.
 
No, you weren't. You may have had 64 THREADS, but you didn't have 64 cores. Each of the 32 physical cores had two threads assigned. You now have a physical core for each of the 64 threads you previously had. Depending on the software, your performance could be up to 1.8X better. It is VERY seldom that you would get a full 2X performance improvement.
Your problem could be memory bandwidth. How many DIMMs do you have? The Threadripper CPUs (both of them) have 4-channel memory controllers. For max performance you should have 4 or 8 DIMMs.
 
Okay, I disabled SMT. Now when I run my program it says CPU usage is 100%. But the program doesn’t seem to be running any faster than before, when it was at 70% and showing 64 cores in use in HWMonitor. I even updated my Windows 10 Pro to the most recent version, which people said is supposed to support 128 threads. Maybe I need to try Enterprise?

I have 8x32GB DDR4-3600.
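
To put a rough number on the memory bandwidth question, here is a back-of-the-envelope sketch. It assumes the RAM really is running at 3600 MT/s and uses the standard 64-bit (8-byte) DDR4 channel width; the function name is made up for illustration. It gives a theoretical peak only, and real sustained bandwidth will be lower. Note that 8 DIMMs on a 4-channel controller still means 4 channels.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, channels, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


# Assumption: DDR4-3600 actually running at 3600 MT/s on all 4 channels.
total = peak_bandwidth_gbs(3600, channels=4)
print(f"peak total: {total:.1f} GB/s")

# The same pool of bandwidth is shared by every core doing work.
for cores in (32, 64):
    print(f"{cores} cores: {total / cores:.2f} GB/s per core at best")
```

Both CPUs share the same 4 channels, so doubling the cores halves the best-case bandwidth available to each one. If the workload is memory-bound, that alone could explain flat results.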
 
Not all software scales infinitely. Writing parallel software is very difficult. The question to ask is: what is your performance compared to your previous processor?
As I said, if the software can keep the physical cores fully utilized, you won't get any performance improvement with SMT enabled, and you could lose a few percent because of the excess threads. It just depends on how the software is written.
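
One common way to reason about "not all software scales infinitely" is Amdahl's law, which I am bringing in here as an illustration; nothing in this thread measures the actual parallel fraction of the program. The sketch below compares 32 and 64 cores for two made-up parallel fractions (0.90 and 0.99), purely as assumptions.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, chosen only to show the shape of the curve.
for p in (0.90, 0.99):
    s32 = amdahl_speedup(p, 32)
    s64 = amdahl_speedup(p, 64)
    gain = (s64 / s32 - 1.0) * 100.0
    print(f"parallel fraction {p:.2f}: 32 cores {s32:.1f}x, "
          f"64 cores {s64:.1f}x, gain from doubling {gain:.0f}%")
```

Even a small serial portion caps the benefit of doubling the core count, which is why twice the cores rarely means twice the speed.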
 
Ah, so you’re saying maybe the software can’t utilize more than 64 cores? I will have to ask the developer. Performance so far is equal: 64 cores with no SMT vs. 32 cores with SMT (64 threads).
 
Is it not still possible, though, that Windows 10 Pro is limiting me? I have read a lot that you need Enterprise to be compatible with 64-128 threads.
 
All the reports I saw of that were back when that Threadripper was released. I haven't read anything about that since after a late February 2020 update. That doesn't mean it's not a problem, of course; people may just have stopped writing about it (I do not have a CPU with that many cores to test for myself).

A Windows 10 Pro for Workstations license, if need be, is usually cheaper than a Windows 10 Enterprise license.
 
Usually those are only required for dual socket motherboards.
 
I installed enterprise and same thing. So must be software?
Most likely. As I said, it is VERY difficult to create software that scales to 64 cores and that actually accomplishes something beyond a benchmark.
It could be memory or storage. Have you verified that your RAM is running at 3600?
To figure out more, you would have to use some kind of profiling software and look at where your time is being spent. It becomes NON-trivial.
The only other question to ask is whether the software has any tuning options that might improve performance: large page allocations, large I/O, direct I/O, thread prioritization. Many things can improve performance, but it can be trial and error to determine the optimum settings.
This is why SMT is usually disabled on systems with a lot of physical cores. More threads don't improve HPC (high performance computing) in many cases.
 
Thank you for all the help. I believe I will go back to the 32-core/64-thread CPU and sell this 3990X. I was anticipating at least a 40-50% increase for the cost, but it doesn’t look like that’s realistic regardless. Thanks again!
 
Can you run multiple instances of the software simultaneously and limit the number of threads/cores? You may be able to get more total work done if you can do that.
 
Good idea. Yes, when I run two instances it goes up to 100% CPU usage. I do notice, however, that when I give each instance 64 threads (the software allows me to set threads), it seems to take double the time for both to complete compared with what I was getting for one. So not really an increase in productivity. Maybe I can find a spot where I lower the threads on the second instance and don’t affect the performance of the first?
 
You only have 64 physical cores. You have previously said that one instance of the software can fully utilize all cores when hyperthreading was disabled. But you also said that you did not see a performance improvement over the 32 cores you previously had. So, what you could try is two instances with 32 cores allocated to each. Leave SMT disabled. You have the potential of getting twice the work done, if there is not a memory bandwidth limitation.
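
If you want each instance to stay on its own half of the chip rather than relying only on the software's thread setting, one option is to pin each instance with an affinity mask. The sketch below only computes the hex masks for splitting 64 logical processors (SMT off) into equal contiguous blocks. The function name is made up, and which logical processor numbers map to which physical dies is not something this sketch knows. As far as I know, `start /affinity <hexmask>` in cmd accepts such a mask, but treat that usage as an assumption to verify on your system.

```python
def affinity_masks(total_cores, instances):
    """Split logical processors 0..total_cores-1 into equal contiguous blocks.

    Returns one uppercase hex bitmask string per instance (bit i = processor i).
    """
    if instances <= 0 or total_cores % instances != 0:
        raise ValueError("total_cores must divide evenly among instances")
    per_instance = total_cores // instances
    block = (1 << per_instance) - 1  # per_instance low bits set
    return [format(block << (i * per_instance), "X") for i in range(instances)]


# 64 cores, two instances of 32 cores each.
for index, mask in enumerate(affinity_masks(64, 2)):
    print(f"instance {index}: mask {mask}")
```

With the two masks above, each instance would be confined to 32 cores, so any slowdown that remains points at something shared, such as memory bandwidth.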
 
So I ran two instances at 32 cores each (SMT disabled) = same finish time as when I ran 2 instances with 64 threads each and SMT enabled.

64 cores with SMT disabled = similar finish time to the 32-core processor with SMT enabled (64 threads).

I must have a limitation with the software or memory?
 
What does "same finish time" mean? Did you get twice the amount of work done? Or did it take twice as much time to accomplish twice as much work?
 
Twice as much time, sorry.
I also don’t know what this means, but I just ran the same computation on 56 cores and on 64 cores for the same duration (1 hour), and the results were within 1% of each other. I don’t think my software is scaling well with more cores/threads.
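
For what it's worth, that 56 vs. 64 core result can be turned into a rough scaling number. The sketch below is an illustration only: the work figures 100 and 101 are stand-ins for "within 1%", not measured values, and the function name is made up.

```python
def marginal_scaling_efficiency(cores_a, cores_b, work_a, work_b):
    """Fraction of the ideal gain actually achieved when going from a to b cores.

    1.0 means the extra cores scaled perfectly; 0.0 means they added nothing.
    """
    core_gain = cores_b / cores_a - 1.0
    work_gain = work_b / work_a - 1.0
    return work_gain / core_gain


# Stand-in numbers: 1% more work done in the same hour with 64 cores than with 56.
efficiency = marginal_scaling_efficiency(56, 64, work_a=100.0, work_b=101.0)
print(f"extra cores delivered about {efficiency:.0%} of their ideal benefit")
```

Going from 56 to 64 cores is about 14% more cores, so a difference of 1% or less means the last 8 cores contributed almost nothing, which fits the idea that the software or memory is the limit and not Windows.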