[SOLVED] Workstation for statistics

May 30, 2020
Hey everyone,

First, thanks to everyone who is reading this post.

I am looking at building a workstation meant to run complex SPSS operations (regression, correlation, etc. with multiple variables), and am unsure how to go about it. How would a conventional desktop setup (e.g. R7 3700X with 64 GB DDR4 RAM) match up against old server chips (such as a dual E5 3678v3 with 64 GB ECC DDR4 RAM)? In simple terms, will a new CPU such as the R7 be better at the task than a dual setup of old server CPUs?

Appreciate all advice.
 

kanewolf

Titan
Moderator
Xeon CPUs and dual-socket motherboards give you something that a typical desktop doesn't: memory bandwidth. Xeons have 4 or more memory channels, while the 3700X is a dual-channel CPU. The 3700X has a large on-chip cache to compensate and has a lot of physical cores.
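To put rough numbers on the channel difference, here is a minimal sketch of the usual theoretical peak-bandwidth arithmetic (channels x transfer rate x 8 bytes per transfer). The memory speeds are assumptions, not figures from this thread: DDR4-3200 for the desktop and DDR4-2133 for an older Xeon socket. Real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    channels             -- number of populated memory channels
    mega_transfers_per_s -- DDR data rate in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes            -- bytes moved per transfer per channel (64-bit bus = 8)
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Assumed speeds, for illustration only.
desktop = peak_bandwidth_gb_s(channels=2, mega_transfers_per_s=3200)
xeon_socket = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=2133)
dual_xeon = 2 * xeon_socket  # each socket has its own memory channels

print(f"Dual-channel desktop : {desktop:.1f} GB/s")
print(f"Quad-channel socket  : {xeon_socket:.1f} GB/s")
print(f"Two such sockets     : {dual_xeon:.1f} GB/s")
```

Whether that extra bandwidth matters depends on whether the workload is actually memory-bound, which is what the questions about the current workload are meant to find out.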
You need to look at your current workload. Does your current CPU have all cores maxed out or 1 or 2? If all cores are maxed, then you can probably benefit from more physical cores. If only 1 or 2 cores, then an Intel CPU with higher single core performance might benefit.
If you are going to look at a DIY build, I would recommend a Threadripper CPU rather than a desktop 3700X.
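The "all cores maxed or only 1 or 2" check can be sketched as a small helper. This is only an illustration: the function name, the 90% threshold, and the sample readings are hypothetical. The per-core percentages could be read off Task Manager, or collected with a tool such as psutil (`psutil.cpu_percent(interval=1, percpu=True)`) while a typical job is running.

```python
def classify_load(per_core_percent, busy_threshold=90.0):
    """Classify a per-core utilization snapshot.

    Returns one of: "idle", "all-cores", "few-cores", "mixed".
    The threshold is an arbitrary assumption for what counts as "maxed out".
    """
    if not per_core_percent:
        raise ValueError("need at least one core reading")
    busy = sum(1 for p in per_core_percent if p >= busy_threshold)
    if busy == 0:
        return "idle"
    if busy == len(per_core_percent):
        return "all-cores"   # more physical cores will probably help
    if busy <= 2:
        return "few-cores"   # single-core speed matters more
    return "mixed"


# Hypothetical readings from an 8-core machine during an analysis run.
print(classify_load([98, 97, 99, 96, 98, 99, 97, 98]))  # every core busy
print(classify_load([99, 12, 8, 5, 3, 7, 4, 6]))        # one core busy
```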
 
May 30, 2020

Thanks for the advice. Unfortunately, because of budget, the Threadripper is out of the question. I am running datasets in the 10,000s with on the order of 150+ variables using IBM SPSS. I have to make do with the best I can get given my restrictions.
 

kanewolf

Titan
Moderator
You haven't provided any budget, any guidance on what your workload is or what your baseline system is. Giving you much more will be difficult.
 
May 30, 2020

Cheers. My budget is around USD 1200 (motherboard + CPU + 128 GB RAM). Also, as I said, my workload is 10,000 sets of data with 150+ variables using SPSS. No baseline system, since it's a new setup.
 

kanewolf

Titan
Moderator
You keep using "10,000 sets of data with 150+ variables" but that doesn't mean anything to somebody who doesn't use the software. Also, the types of analysis you do matter.
You must have SOME baseline, or you can't say "my workload is 10,000 sets of data with 150+ variables".
Without some more specific CPU loading or other more generic metrics I can't help with a system recommendation.
I will say that 128GB RAM is not something that is a good match for a dual channel "desktop" CPU and motherboard. Do you KNOW that much RAM will benefit you? Again, this is where giving more "baseline" data can help us (and you) spend your $$$ wisely.
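One way to get a first baseline on the RAM question is a back-of-the-envelope estimate of the raw data size. The sketch below assumes every variable is numeric and stored as an 8-byte double, and it ignores string variables and whatever working memory the analysis procedures need on top, so treat the result as a lower bound rather than a measurement.

```python
def raw_dataset_mb(cases: int, variables: int, bytes_per_value: int = 8) -> float:
    """Raw size of a cases x variables table in MB (10**6 bytes).

    bytes_per_value=8 assumes double-precision numeric storage.
    """
    return cases * variables * bytes_per_value / 1_000_000


# Figures quoted in the thread: 10,000 sets of data, 150 variables.
quoted = raw_dataset_mb(cases=10_000, variables=150)
# The same table with ten times as many cases, for comparison.
ten_times = raw_dataset_mb(cases=100_000, variables=150)

print(f"10,000 x 150  : {quoted:.0f} MB")
print(f"100,000 x 150 : {ten_times:.0f} MB")
```

Under these assumptions the raw data is in the tens to low hundreds of megabytes, which is why measuring actual memory use during a run is worth doing before paying for 128 GB.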
 
May 30, 2020

Thanks, but I don't know how to respond then.
 

kanewolf

Titan
Moderator
I just reread my previous post. My statement should have been clearer: 128GB IS NOT a good match for a dual channel desktop CPU/motherboard.

Sorry that we aren't communicating better.

My recommendation is to use the resources of AWS to run your code on different size hosts. Find out where your code doesn't scale. Use that feedback to tailor your purchase.
 
The max number of cores that an SPSS client can use is 4.
The following IBM SPSS Statistics analytical procedures (indicated by syntax command) are multi-threaded:

CORRELATIONS
CSSELECT
FACTOR
PARTIAL CORR
REGRESSION
SORT

If you can run multiple instances of SPSS, then a processor with a higher core count could be useful.

Otherwise, it is best to look for the highest possible single threaded performance.

Today, those processors are the new Intel 10th-gen K-suffix processors:
the i5-10600K with 12 threads, the i7-10700K with 16, and the i9-10900K with 20.
All can run at around 5.0 GHz.
Since you are familiar with the Xeon 2670: it has 16 threads, a PassMark rating of 9321, and a single-thread rating of 1497.
The total rating applies only when all threads can be 100% utilized.
By comparison, the R7-3700X has 16 threads, a rating of 22755, and a single-thread rating of 2689.
A few PassMark samples for the i5-10600K with 12 threads give 15174, but with a very good single-thread rating of 3026.
The i7-10700K with 16 threads is 19568/3070.
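To make the single-thread comparison easier to read, here is a small sketch that expresses each quoted single-thread rating relative to the Xeon 2670. It uses only the PassMark numbers given in this post. The ratio is a rough indicator, not a prediction of SPSS run times.

```python
# Single-thread PassMark ratings as quoted in this post.
single_thread = {
    "Xeon 2670": 1497,
    "R7-3700X": 2689,
    "i5-10600K": 3026,
    "i7-10700K": 3070,
}

baseline = single_thread["Xeon 2670"]

# Ratio of each CPU's single-thread rating to the Xeon's.
relative = {cpu: rating / baseline for cpu, rating in single_thread.items()}

for cpu, ratio in relative.items():
    print(f"{cpu:<10} {ratio:.2f}x the Xeon 2670 single-thread rating")
```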

I would think that the i5-10600K would be appropriate for a single instance of SPSS.
If you can simultaneously run more than 4 copies, then you can go up from there.

You will need a Z490-based motherboard and DDR4 RAM.
 
I read through the document.
Yes, it is old; the referenced processor was a T9400.
In some of the suggestions, one had a comparison of 2 4/8 processes.
The interesting thing was that the times were about the same, again indicating that single thread performance is most important for this app.
Other documents I read indicated the same thing.
I only included the links I thought were relevant to the question that was asked.

I might add that in the event that the data sets in question exceed the capacity of the ram, then the performance of the page file will come into play.
Regardless, the OP should include a very well-performing SSD. I might suggest the Samsung 970 EVO Plus.