Xeon E5-2670v3(or 80v3) vs i7 6950X for simulation

Yu JW

Jul 2, 2016
Hi, I am currently building a workstation for simulation (MD and DFT).
I will use this workstation to run LAMMPS, GROMACS, VASP, and MATLAB (the last for post-processing).
My first choice was an E5-2600 v4 CPU, but no retail version is available in my country.
So I thought I had no choice but to buy a Xeon E5-2600 v3 (maybe the 2670 v3 or 2680 v3, within my budget), which is an older-generation CPU.
Then, about a month ago, the 6950X was introduced at Computex.
I am not sure which one is better for my purpose.
Can you compare the two (Xeon or 6950X) and help me choose one?


Thanks in advance.

P.S. I'm sorry for my poor English.
 


Xeon CPU -> X99 Board with ECC support

6950X -> X99 Board w/o ECC support (I think it's cheaper)
 


Yu JW,

The i7-6950X (10 cores @ 3.0 / 4.0GHz) has very good CPU performance. On Passmark Performance Test, 18 systems have been tested and the highest CPU mark is 20329. Importantly, that CPU also has a single-thread rating of 2166, one of the higher scores. This is a very good combination for visualization. The high single-threaded score is excellent for 3D CAD modeling, which depends on calculating the placement of polygons. The number of cores is also very good for calculations or for processor-based rendering, which produces better image quality.

However, in my opinion, particle physics and molecular dynamics problems require massive parallelization and will benefit from GPU co-processors. In thinking about MATLAB, particle physics, and NAMD (molecular biology) simulations, I use the following document:

"Classical Molecular Dynamics Codes and Coupling of Length Scales"

Peter T. Cummings, Normand Modine, Randy Cygan

That document is older (2010), but I think it is very clear on the best kind of system, as it shows the typical scale of the datasets (sample sizes), the nature of the algorithm customization, and the level of precision required.

The parts choices should follow its suggestions on using parallelization in combination with GPU co-processing. This is how the fastest computers in the world are built: China has the fastest, which uses a special RISC processor, but many of the fastest are based on Intel Knights Landing Xeon Phi and also NVIDIA Tesla GPU co-processors. In summary, this suggests to me that it is better to use two Xeon E5-2600 processors plus GPU co-processors.

Three weeks ago, I was at a particle accelerator facility where I had done a small project. Their simulation system includes eleven dual Xeon E5-2600 v3 chassis running in parallel. The two CPUs provide a total of 80 PCIe lanes, so each system can host the GPU co-processors: four NVIDIA Tesla K20X cards per system. All RAM is ECC registered (error-correcting). That is necessary given the massively parallel threads that must be synchronized and parity checked. Even with the facility's system, some experimental simulations still require 100 hours of running time. The i7-6950X cannot use ECC RAM, and that is very important. The i7-6950X also limits the PCIe lanes to 40.
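
The lane argument above can be checked with simple arithmetic. This is a minimal sketch, assuming each Tesla card occupies a full x16 slot (an assumption, the post does not state the slot width) and ignoring lanes used by storage or network cards. The function name is made up for illustration.

```python
def gpu_lanes_fit(cpu_lanes: int, gpus: int, lanes_per_gpu: int = 16) -> bool:
    """Return True if the available CPU lanes cover all GPUs at the given width.

    lanes_per_gpu=16 is an assumption (full-width slots), not a figure
    taken from the post.
    """
    return gpus * lanes_per_gpu <= cpu_lanes


# Lane counts quoted in the post: 80 for the dual Xeon, 40 for the i7-6950X.
for name, lanes in [("dual Xeon E5-2600 v3", 80), ("i7-6950X", 40)]:
    print(name, "fits four x16 GPUs:", gpu_lanes_fit(lanes, gpus=4))
```

Under that assumption, four cards need 64 lanes, which fits in 80 but not in 40. Two cards at x16 would still fit on the i7-6950X.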

My first idea for a good, general-purpose system:

Two Xeon E5-2650 v3 or Xeon E5-2670 v3 or E5-2680 v3

Passmark CPU marks (single CPU / dual CPU / single-threaded):

E5-2650 v3: 15241 / 20671 / 1715

E5-2670 v3: 16549 / 22953 / 1690

E5-2680 v3: 18890 / 25811 / 1690
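
To see how much the second CPU adds in these Passmark figures, the ratios can be worked out directly. This is a rough sketch using only the scores quoted above plus the 20329 mark mentioned earlier for the i7-6950X. The dictionary and function names are just for illustration.

```python
# Passmark CPU marks quoted in this post: (single CPU, dual CPU).
scores = {
    "E5-2650 v3": (15241, 20671),
    "E5-2670 v3": (16549, 22953),
    "E5-2680 v3": (18890, 25811),
}
I7_6950X_MARK = 20329  # highest i7-6950X CPU mark quoted earlier in the post


def dual_scaling(single: int, dual: int) -> float:
    """Ratio of the dual-CPU mark to the single-CPU mark."""
    return dual / single


for name, (single, dual) in scores.items():
    print(
        f"{name}: dual/single = {dual_scaling(single, dual):.2f}, "
        f"dual vs i7-6950X = {dual / I7_6950X_MARK:.2f}"
    )
```

With these numbers the second CPU adds roughly 35 to 40 percent on this benchmark rather than doubling it, and a dual E5-2650 v3 lands close to a single i7-6950X.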

Supermicro X10DAi motherboard for E5-2600 v3
https://www.supermicro.com/products/motherboard/Xeon/C600/X10DAi.cfm

128GB DDR4-2133 ECC registered
Quadro K4200
2X Tesla K10 or K20
M.2 NVMe 512GB Disk 1
LSI MR9300 series RAID controller
3X Seagate ES.3 4TB (RAID 5)
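
For the storage line in the list above, a quick capacity check. This is a minimal sketch, assuming the standard RAID 5 layout in which one drive's worth of space goes to parity. The function name is illustrative.

```python
def raid5_usable_tb(drives: int, size_tb: float) -> float:
    """Usable capacity of a RAID 5 array: one drive's worth is used for parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * size_tb


# Three 4 TB drives, as in the parts list.
print(raid5_usable_tb(3, 4.0))
```

So the three 4TB drives give 8TB of usable space and tolerate the loss of one drive.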

In my opinion, it would be better to have CPUs with fewer cores at a higher clock speed and add at least one Tesla coprocessor. By the way, Tesla coprocessors are often sold used, after moderate use, at a much lower cost. They are of very high quality, and a used one may be completely reliable.


Tesla K20X GPU Accelerator

Chip: GK110
Package size: GPU, 45 mm × 45 mm, 2397-pin
Processor clock: 732 MHz
Memory clock: 2.6 GHz
Memory size: 6 GB
Memory I/O: 384-bit GDDR5
Memory configuration: 24 pieces of 64M × 16 GDDR5 SDRAM
Display connectors: none
Power connectors: one 8-pin PCI Express power connector, one 6-pin PCI Express power connector
Board power: 235 W
Idle power: 25 W
Thermal cooling solution: passive heat sink

Mean time between failures (MTBF)
Uncontrolled environment: 128440 hours at 35 °C
Controlled environment: 208861 hours at 35 °C
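
To put the MTBF figures above into more familiar units, here is a small sketch. It assumes a constant failure rate (the exponential model), which is a common simplification and not something the spec sheet states. The function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def mtbf_years(mtbf_hours: float) -> float:
    """Convert an MTBF in hours to years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annual_failure_probability(mtbf_hours: float) -> float:
    """Chance of at least one failure in a year of continuous running,
    under the assumed exponential (constant failure rate) model."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# MTBF values from the spec list above.
for label, hours in [("uncontrolled", 128440), ("controlled", 208861)]:
    print(
        f"{label}: {mtbf_years(hours):.1f} years MTBF, "
        f"{annual_failure_probability(hours):.1%} failure chance per year"
    )
```

MTBF is a fleet average, not a lifetime for one card, so the per-year figure is the more useful one when judging a used Tesla.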


Cheers,

BambiBoom

1. HP z420 (2015) > Xeon E5-1660 v2 (6-core @ 3.7 / 4.0GHz) > 32GB DDR3 1866 ECC RAM > Quadro K4200 (4GB) > Samsung SM951 M.2 256GB AHCI / Intel 730 480GB (9SSDSC2BP480G4R5) / Western Digital Black WD1003FZEX 1TB> M-Audio 192 sound card > 600W PSU> > Windows 7 Professional 64-bit > Logitech z2300 speakers > 2X Dell Ultrasharp U2715H (2560 X 1440)>
[ Passmark Rating = 5581 > CPU= 14046 / 2D= 838 / 3D= 4694 / Mem= 2777 / Disk= 11559] [6.12.16]

2. Dell Precision T5500 (2011) (Revised) > 2X Xeon X5680 (6-core @ 3.33 / 3.6GHz), 48GB DDR3 1333 ECC Reg. > Quadro K2200 (4GB ) > PERC H310 / Samsung 840 250GB / WD RE4 Enterprise 1TB > M-Audio 192 sound card > Logitech z313 > 875W PSU > Windows 7 Professional 64> HP 2711x (27", 1920 X 1080)
[ Passmark system rating = 3844 > CPU = 15047 / 2D= 662 / 3D= 3550 / Mem= 1785 / Disk= 2649] (12.30.15)






 


Thank you for your kind advice!
The benchmark information is very helpful!

Your suggestion would be appealing if my lab did not have a cluster.
If a simulation system is very large, I will use the cluster.
What I need now is a secondary machine for testing small systems, a test bed.
As for GPGPU computation, I don't doubt its performance.
But for some complicated (non-technical) reasons, I cannot adopt it.

Considering the performance in the benchmarks you provided, I will buy the 6950X as my workstation CPU.
I once googled ECC RAM and found there is controversy over whether it really contributes to data stability. I have also heard that the gap between non-ECC and ECC RAM is getting narrower. So I think building my workstation with non-ECC RAM is not a bad option, especially since it is a secondary machine and the cluster handles the heavy systems.
I think the calculation time for smaller-scale (i.e. workstation-scale) simulations will not be long enough for an error to be critical for the whole system, though I'm not sure about the calculation time or the effect of an error.

Anyway, what you described is what I would want if I get another chance to set up a cluster or workstation.
Thanks!