Server Farm Build Help (Quad Core Hyper Threaded)

to4st3r

Mar 21, 2016
Long-time (15 years!) lurker, first-time poster. I'm looking to build an in-house processing farm of 10 machines from scratch. The task at hand is mostly headless processing. I'm looking for recommendations on what one of those machines would look like. I'm budget-conscious, so it doesn't need to be bleeding edge.


    • Budget range is less than $1000 per machine, but the cheaper the better (<$500)
    • Processor: Quad core, hyperthreaded
    • Operating system: Ubuntu (or CentOS)
    • RAM: At least 8 GB. Range up to 16 GB
    • Case form factor: As compact and quiet as reasonably possible. Bonus if it can be rack mounted (doesn't matter how many slots it takes up)
    • Storage: Simple and basic / no RAID needs.
    • Graphics needs are minimal. No 3D, it's going to be headless + terminal access.
    • Overclocking: no; Gaming/3D: no; SLI or Crossfire: no; Will this be used for virtualization: no
    • Location: Bellevue, WA (Seattle area)

Thanks so much in advance for your help!
 
Solution


to4st3r,

Building 10 systems is a tremendous amount of work. Perhaps you might consider buying 10X of something like this:

HP Z420 Workstation Xeon Quad Core E5-1620 3.6GHz 8GB RAM 500GB NVIDIA Win 7 Pro

> That is a listing in which "more than 10" identical systems are available at $499 or offer. I don't know anything about the seller, but perhaps purchasing 10 at once could be at $450 each?

That particular one has a Xeon E5-1620 (4-core @ 3.6 /3.8GHz). Not that it applies here, but these systems are very good at 3D visualization, as the Xeon E5-1600 series has very strong single-threaded performance. These can use 64GB of DDR3-1600 ECC, so after adding the drives, any residual budget could go to RAM. I have two z420's, and they have been perfectly reliable over the last couple of years. They are not especially compact, but they are the quietest systems I've ever had and could be lined up on an industrial metal shelving unit.

If you wanted to really optimize these for multi-threaded processing (scientific, database, financial analysis, simulation, etc.), in which core count is more important than single-threaded speed, you could swap the E5-1620's for Xeon E5-2670's, which are 8-core @ 2.6 /3.3GHz and remarkably inexpensive used:

SR0KX INTEL XEON E5-2670 8 CORE 2.60GHz 20M 8GT/s 115W PROCESSOR CPU > "More than 10 available" Buy it Now @ $71.25

In this example, that would give you 10X 8-core /16-thread LGA2011 systems for under $5,800, or, if the E5-1620 was sufficient, about $5,000.
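The arithmetic behind those two figures can be sketched in a few lines of Python. This is only a rough check using the two listing prices quoted above; the function name `fleet_cost` is made up for illustration, and shipping, drives, and any bulk discount are not included.

```python
# Listing prices quoted in this post; shipping and drives are NOT included.
SYSTEM_COUNT = 10
Z420_PRICE = 499.00     # HP Z420 with Xeon E5-1620, per system
E5_2670_PRICE = 71.25   # used Xeon E5-2670, per CPU


def fleet_cost(count, system_price, cpu_upgrade_price=0.0):
    """Total cost of `count` systems, optionally with one CPU swap each."""
    return count * (system_price + cpu_upgrade_price)


stock = fleet_cost(SYSTEM_COUNT, Z420_PRICE)
upgraded = fleet_cost(SYSTEM_COUNT, Z420_PRICE, E5_2670_PRICE)

print(f"10X stock Z420:        ${stock:,.2f}")
print(f"10X Z420 with E5-2670: ${upgraded:,.2f}")
```

At the asking price this comes to $4,990.00 stock and $5,702.50 with the 8-core swap, which is where "about $5,000" and "under $5,800" come from.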

In my view, that is the best cost /performance solution, and it is also far less work than building. Plus, workstations are built like servers in terms of reliability and are designed to be very quiet.

Cheers,

BambiBoom

1. HP z420 (2015) > Xeon E5-1660 v2 (6-core @ 3.7 / 4.0GHz) > 32GB DDR3 1866 ECC RAM > Quadro K4200 (4GB) > Intel 730 480GB (9SSDSC2BP480G4R5) > Western Digital Black WD1003FZEX 1TB> M-Audio 192 sound card > 600W PSU> > Windows 7 Professional 64-bit > Logitech z2300 speakers > 2X Dell Ultrasharp U2715H (2560 X 1440)>
[ Passmark Rating = 5064 > CPU= 13989 / 2D= 819 / 3D= 4596 / Mem= 2772 / Disk= 4555] [Cinebench R15 > CPU = 1014 OpenGL= 126.59 FPS] 7.8.15

2.HP z420 (2013)(Revision 2) > Xeon E5-1620 four core @ 3.6 /3.8GHz > 24GB DDR3 ECC 1600 RAM > Quadro 4000(2GB) > Samsung 840 (250GB) WD Black 1TB > > M-Audio 192 soundcard > Linksys WMP600N WiFi
[Passmark system rating = 3815 / CPU = 8985/ 2D= 767 / 3D= 2044/ Mem= 2523 / Disk= 2986]

3. Dell Precision T5500 (2011) (Revised) > 2X Xeon X5680 (6 -core @ 3.33 / 3.6GHz), 48GB DDR3 1333 ECC Reg. > Quadro K2200 (4GB ) > PERC H310 / Samsung 840 250GB / WD RE4 Enterprise 1TB > M-Audio 192 sound card > Logitech z313 > 875W PSU > Windows 7 Professional 64> HP 2711x (27", 1920 X 1080)
[ Passmark system rating = 3844 / CPU = 15047 / 2D= 662 / 3D= 3550 / Mem= 1785 / Disk= 2649] (12.30.15)
 
Thanks so much for your answer bambiboom! What would you recommend if I were up for building the systems from scratch? Specifically the mobo + CPU? I have considered used systems from eBay and Amazon. My main concern comes down to repairability and reliability. Used is cheaper, but if the mobo fries, I'm stuck with a proprietary board and case (I am assuming the HP doesn't have a standard ATX board and case). By ordering new I have a warranty, and should any component fail I can swap it with something comparable. I built gaming systems in the past, so putting them together is not a big deal. The research to find the right components at the right price I have found to be far more time consuming 😉
 


to4st3r,

First, I'd say that I've had five used workstations since 2007, all of them upgraded with used parts, and have never had one component in any system fail. The one I used the longest, a 2007 Dell Precision T5400 purchased in 2010, had its second CPU, a $25 Xeon X5460, arrive in a plain business envelope. That system was my main system for nearly five years, sometimes running continuously for a couple of weeks at a time. No failures.

To specify a build of this kind, something with a specific use and optimized for a price, it would help to know the software and the kind and size of the projects: whether single-threaded speed or more cores is important, that kind of thing.

If, for example, the objective was processor count, I'd suggest a single system with multiple GPU processors. This consolidates the capabilities with less component duplication, so each part can have a much higher specification: one motherboard, two CPU's, one set of RAM modules, and one power supply (or a redundant pair) instead of ten of each. Fewer but higher-quality, better-performing parts mean a lower probability of failure.

The budget mentioned ranges between 10X $500 and 10X $1,000, so the midpoint is $7,500.

With that budget and assuming that sheer parallel processing power is the objective:

Chenbro RM41300-FS81 No Power Supply 4U Rackmount Server Chassis for Tesla GPU > $146 (Superbiiz)

2X Intel Xeon E5-2630 v3 Eight-Core Haswell Processor 2.4 GHz 8.0GT/s 20MB LGA 2011-v3 CPU, OEM > $1,200 ($600 each)(Superbiiz)

2X Supermicro SNK-P0048AP4 CPU Heatsink For LGA2011 > $64 ($32 each)

Motherboard: Supermicro X10DRG-Q (4X PCIe x16 GPU slots) > $499 (Superbiiz)

128GB (8 X 16GB) Crucial DDR4-2133 16GB/2Gx72 ECC/REG CL15 Server Memory > $800 (Superbiiz)

GPU A: 3X HP NVIDIA TESLA K10 KEPLER ACCELERATOR 8GB GK104 GRAPHICS VIDEO CARD 688982-001 > $3,000 (used, $1,000 each) (with a slot for one more too)

[GPU B: 4X Quadro M4000 (8GB) $780 Each] (This would have to be studied to see if the CUDA processing could be applied to your use.)

Samsung 850 EVO Series 1TB 2.5 inch SATA3 Solid State Drive, Retail (3D V-NAND) > $327

2X Seagate Constellation ES.3 ST3000NM0023 3TB 7200RPM SAS3/SAS 6.0 GB/s 128MB Enterprise Hard Drive (3.5 inch) > $370 ($185 each)

Seasonic SS-1050XP3 1050W 80 PLUS Platinum ATX12V/EPS12V Power Supply w/ Active PFC > $194

________________________________________

TOTAL = $5,881
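As a cross-check, here is a small Python sketch that sums the line items exactly as listed above, using GPU option A. The quantities and unit prices are copied from the list (the heatsink is taken as $32 each); the helper name `build_total` is made up for illustration. Summed this way the listed prices come to $6,600, which is higher than the $5,881 stated, so some of the prices actually tallied may have differed from those shown. Treat both figures as approximate.

```python
# (description, quantity, unit price in USD), as listed above, GPU option A.
parts = [
    ("Chenbro RM41300-FS81 chassis", 1, 146),
    ("Intel Xeon E5-2630 v3", 2, 600),
    ("Supermicro SNK-P0048AP4 heatsink", 2, 32),
    ("Supermicro X10DRG-Q motherboard", 1, 499),
    ("Crucial 128GB (8 X 16GB) DDR4-2133", 1, 800),
    ("Tesla K10 (used)", 3, 1000),
    ("Samsung 850 EVO 1TB", 1, 327),
    ("Seagate Constellation ES.3 3TB", 2, 185),
    ("Seasonic SS-1050XP3 PSU", 1, 194),
]


def build_total(items):
    """Sum quantity * unit price over all line items."""
    return sum(qty * unit_price for _, qty, unit_price in items)


for name, qty, unit_price in parts:
    print(f"{qty}X {name:<36} ${qty * unit_price:>6,}")

total = build_total(parts)
print(f"Sum of listed items: ${total:,}")
```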

This is then a 4U rackmount system with 16 cores /32 threads at 2.4 /3.2GHz, plus 3X GPU's with a total of 24GB of GDDR5 RAM and 9,216 CUDA cores as coprocessors, and 128GB of system RAM. The only used components are the Tesla K10's, but these are of ultra-high quality and are typically replaced because a new model supersedes them, not for reliability reasons. There are supercomputers like TITAN at Oak Ridge that have run thousands of Teslas at full performance for years at a time.

Because of the cost of duplicating components (motherboards, cases, power supplies, CPU's), I believe it would be difficult or impossible to assemble 10 systems for $6,000 that would have the processing power of a single one of this specification. This also avoids the duplication of assembly effort and the complication of the parallel /cluster configuration /integration.

The alternative would be ten $600 systems with Xeon E3's, which would have 40 cores /80 threads and 160GB of RAM in total, but the motherboards would be of lower specification. The other idea that occurred to me was to create a cluster of 100 Raspberry Pi 3's. Much depends on the programs, projects, and dataset size.

Very interesting project.

Cheers,

BambiBoom






 
Thank you so much for providing two scenarios. You made me rethink used systems. And thank you for bringing up a RaspPi cluster. I had also thought about that and will spend some time looking into it. I could find a Raspberry Pi 2 Model B (quad core, 1 GB RAM) for less than $40. Unbelievable.

To answer your question (sorry if my original post wasn't clear), the ideal architecture is having multiple cores/processors/threads. This is purely processing with no need for GPU.

 



to4st3r,

It's an interesting project. Last year a friend doing an aerospace project asked me to recommend a good system for Matlab to run some flight dynamics problems. The resulting system would have been effective, but too expensive at $9,000. In response, I had it in mind to build a cluster of 100 Raspberry Pi's in an acrylic case of 12U rack dimensions, with 5 vertical boards, each with 20 Raspberry Pi's on one side and the network and power connections on the other. A 750W central power supply would be fed by a Powervar power conditioner. To service it, the front panel with fans, power switch, USB K/B, mouse, and input connections would hinge down and the 5 boards would slide forward on top and bottom guide rails.

I found a similar, but more extroverted, solution today that shows the general arrangement and connections using 120 Pi's. I like the touch screen for each Pi. Think of ten four-core systems equaling 40 cores, whereas the 100 Pi's equal 400 cores.

In the end, my friend ended up doing the problems not in Matlab, as there would have been too much specialized coding, but in Excel using my old Dell Precision T5400 (2X Xeon X5460 /16GB / Quadro FX4800)! It was slow work, as a single problem ran for three days.

His device, now being tested, uses a pair of Raspberry Pi's in the test platform. They're so appealing in an inexpensive, modular way, I keep wanting to think of ways to use them. Kitten Komputing. The Pi 3 ($35) looks interesting and runs about 50 OS variants.

However, as fun as a RaspPi cluster would be to do, the single 4U dual Xeon with Tesla GPU coprocessing still seems the strongest solution.


Cheers,

BambiBoom

 
BambiBoom, thanks again for your help. We ended up getting 15 Dell T5400s (dual-processor, quad-core), and in terms of processing per dollar they were a great investment: a total of 8 cores for $200 each. I owe you several rounds of beer! (PM me your address :) I am back here with a follow-up situation. The amount of heat and electricity consumption is pretty high (as was expected). All the machines run headless on Ubuntu Server (terminal only).

Any suggestions to reduce energy consumption + heat dissipation?

My main strategy so far is to replace the 200W Nvidia Quadro cards with cheap $5 Dell FH868 video cards, since the BIOS doesn't allow me to run without a card. The machines are on all the time, so power-saving features such as sleep don't apply.
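For a rough sense of scale, the possible saving from the card swap can be sketched in Python. The 15 machines and the 200W rating come from the posts above; the 20W draw for the replacement card is a hypothetical placeholder, and the function name `annual_kwh_saved` is made up for illustration. A rated wattage is a ceiling, not what a card draws while idle in a headless box, so this is an upper bound and a wall-socket power meter would give the real number.

```python
HOURS_PER_YEAR = 24 * 365  # machines are on all the time


def annual_kwh_saved(machines, old_watts, new_watts, hours=HOURS_PER_YEAR):
    """kWh saved per year if every machine swaps a card drawing
    old_watts for one drawing new_watts, running for `hours`."""
    return machines * (old_watts - new_watts) * hours / 1000


# 200 W is the card rating quoted above; 20 W is a HYPOTHETICAL figure
# for the replacement card, not a measured or published value.
saved = annual_kwh_saved(machines=15, old_watts=200, new_watts=20)
print(f"Upper-bound saving: {saved:,.0f} kWh per year")
```

Multiplying the result by the local electricity rate gives a ceiling on the yearly saving in dollars.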

What do you think?

p.s. This winter we'll be making a RaspPi cluster to see how effective they are with our metrics: 1) $ to core ratio 2) energy consumption 3) heat dissipation
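The first of those metrics can already be sketched from numbers in this thread. The Python below uses the $200 / 8-core T5400 figure and takes $40 as an upper bound for the Raspberry Pi 2 Model B ("less than $40"); the function name `dollars_per_core` is made up for illustration. It says nothing about metrics 2 and 3, which need measured power figures, and it treats all cores as equal, which a Xeon core and a Pi core are not.

```python
def dollars_per_core(price, cores):
    """Purchase price divided by core count."""
    return price / cores


# (price in USD, cores) taken from earlier posts in this thread.
candidates = {
    "Dell T5400 (2X quad-core)": (200, 8),
    "Raspberry Pi 2 Model B": (40, 4),  # "less than $40", so an upper bound
}

ratios = {
    name: dollars_per_core(price, cores)
    for name, (price, cores) in candidates.items()
}

for name, ratio in ratios.items():
    print(f"{name}: ${ratio:.2f} per core")
```

A fairer comparison would divide price by measured throughput on the actual workload rather than by raw core count.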
 



to4st3r,

I very much appreciate your reporting back on what sounds like a very interesting project. Your thanks can be expressed in a PM about your work, which I will keep confidential if you so request.

Performance: For the lead system, consider a PERC H310 RAID controller, which will convert the disk system to 6Gb/s for about $50-60. In the T5500, I added one and, without changes to the Samsung 840 or WD RE4, the Passmark disk rating changed from 1940 to 2694. Also, consider PCIe drives, as you may be able to use M.2 AHCI. I added a Samsung SM951 M.2 AHCI to an HP z420 (E5-1660 v2) and the already fast Intel 730 480GB's Passmark disk score of 4794 became 11559.

Heat: Yes, the Dell Precision T5400 is beautifully made and ultra-reliable, but a bit hotter running, partly due to the DDR2 RAM, which in my T5400 regularly ran up to 80C. The T5500 uses DDR3, which runs somewhat cooler, in the low 60's. Running the cluster without a GPU is a very good idea. My T5400 uses a Quadro FX4800. If the individual systems don't need to run storage HD's, removing them would also reduce heat and power consumption. For continuous running, consider adding a pair of small-diameter (90mm or so) extraction fans to the high end of the back panel, connected to a spare 4-pin Molex, to draw hot air through the chassis and out the back.

Power: I can't think of another way to reduce power consumption except to shorten the processing time as much as possible, which itself takes more power. I'm not sure whether, given the nature of the parallelization, the cluster works better when the systems are more or less identical, but for the lead system, consider adding a Tesla C2075 6GB GPU coprocessor, which will add 448 CUDA cores of processing power and 515 Gflops of double precision to the effort. The C2075 also has a single DVI monitor output, the only Tesla with that feature, so it can run the monitor without another GPU. The last time I was at my local particle accelerator I saw a C2075 on the bench in the server room. Unfortunately it is 225W, so more heat and power, but it might advance the processing of the initial data stream in a way that will be multiplied by the parallelization, increasing performance. If the processing time is shortened, that might also reduce the power consumption and heat to a degree.
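The trade-off in that last point (more watts, but for less time) can be put as a small Python sketch. Energy is power multiplied by time, so adding a 225W card only saves energy if the job speeds up by more than the ratio of new power to old power. The 300W baseline draw below is a hypothetical placeholder, not a measured figure for any system in this thread, and the function names are made up for illustration.

```python
def energy_kwh(watts, hours):
    """Energy used by a constant load of `watts` over `hours`."""
    return watts * hours / 1000


def break_even_speedup(base_watts, added_watts):
    """Speedup at which (base + added) watts over the shortened run
    uses exactly the same energy as base watts over the full run."""
    return (base_watts + added_watts) / base_watts


BASE_WATTS = 300   # HYPOTHETICAL draw of the lead system without the Tesla
TESLA_WATTS = 225  # C2075 rating quoted above

needed = break_even_speedup(BASE_WATTS, TESLA_WATTS)
print(f"Job must run at least {needed:.2f}x faster to save energy")

# Example: a 10-hour job, with and without the card at break-even speedup.
before = energy_kwh(BASE_WATTS, 10)
after = energy_kwh(BASE_WATTS + TESLA_WATTS, 10 / needed)
print(f"Before: {before:.2f} kWh, after: {after:.2f} kWh")
```

Any speedup below that ratio means the card finishes the job sooner but uses more energy overall; the rated 225W is also a ceiling, so the real break-even point is likely somewhat lower.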

At the accelerator facility, they were playing with a Raspberry Pi cluster: 10 of them running an MIT program with a display that converted various physical parameters into colored shapes. I have photos of it and I wish I'd asked more questions. It was not a practical application, but it demonstrated considerable computational density. For the experimental simulations, they use eleven parallel dual 14-core Xeon systems, each having 4X Tesla K20X, and I think that may be tied to Oak Ridge as well, so if they're impressed with a Pi cluster... A friend is running a quite complex aerospace control system off a pair of them.

Well done!

Cheers,

BambiBoom


P.S.>

Current Project:

HP z620 (Original) Xeon E5-1620 (4-core @ 3.6 /3.8GHz) / 8GB (1X 8GB DDR3-1333) / AMD Firepro V5900 (2GB) / Seagate Barracuda 750GB + Samsung 500GB + WD 500GB
[ Passmark System Rating= 2408 / CPU= 8361 / 2D= 846 / 3D = 1613 / Mem =1584 / Disk = 574 ] 7.13.16

Now:

HP z620 (Rev 2) 2X Xeon E5-2690 (8-core @ 2.9 /3.8GHz) / 40GB (4X 8GB +4X 2GB DDR3-1600) / Quadro K2200 (4GB) / Seagate Barracuda 750GB + Samsung 500GB + WD 500GB / 800W > Windows 7 Professional 64-bit >
[ Passmark System Rating= 2589 / CPU= 20703 / 2D= 728 / 3D = 3542 / Mem =2397 / Disk = 587 ] 8.2.16

System: $270
CPUs: $320 ($160 each)
CPU Riser: $150
RAM: $165 (32GB DDR3-1600 ECC)
GPU: From Precision T5500
Disk: HP Z Turbo Drive M.2 256GB: $150 Not yet installed. The Passmark score for this drive is 12602 so that should improve the 587 score of the 2010 Seagate 750GB!