[SOLVED] Data Processing Build Advice

Arzhur

Distinguished
Sep 28, 2009
174
4
18,695
Hi,

I'm a PhD student in astronomy and am planning on building a new system to run simulations and reduce large data sets. Some of the codes that I use for simulations are parallelised efficiently, while others aren't. The data reduction package that I use, CASA, is mainly bottlenecked by memory per core, and hyperthreading reduces its performance. I will be using a Linux OS, either Scientific Linux or Ubuntu. With this in mind, I have put together a list of parts with my reasons for picking them. Any advice or input would be great. I have to stick to my university's registered suppliers, so I'm listing parts from komplett.ie. My budget is about €1500, but I'm trying to keep it lower, if possible.

CPU - AMD Ryzen 7 2700 (€279). This is a solid CPU. When overclocked, its performance is approximately equal to that of an overclocked AMD Ryzen 7 2700X (€359), if I'm correct. It's a bit more work, but should be worth the €80 difference. There could be an argument for an Intel Core i5-9600K (€299) for its single-core performance, but the loss of two cores and hyperthreading doesn't seem like it'll be worth it. I could be wrong though. Threadripper CPUs may be overkill core-wise, because the codes that I use either aren't fully parallel or require so much memory per core that the whole system would become much more expensive.

Motherboard - MSI B450 Tomahawk (€109). The main difference that I can see between B450 and X470 motherboards is SLI support. I won't be using GPUs in this setup, so that's irrelevant. However, I could be overlooking other benefits of getting a more expensive board.

RAM - Corsair Vengeance LPX 32GB (2x16GB) (€249). CASA, one of the codes that I use, recommends a minimum of 4GB per core, so 32GB is the minimum for the 2700's 8 cores. From what I've read, Ryzen CPUs benefit from higher-speed memory, so I selected 3000 MHz memory. I am considering getting a second set of this memory, for a total of 64GB, to avoid any memory bottlenecks.
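The memory-per-core reasoning can be checked with a few lines of Python. This is only a rough sketch: it assumes the 4GB guideline applies to the 8 physical cores of the 2700 (not to its 16 threads), and the function name `gb_per_core` is made up for illustration.

```python
# Rough memory-per-core check for the two RAM options in the post.
# Assumption: CASA's 4 GB per core guideline is counted against the
# 8 physical cores of the Ryzen 7 2700, not its 16 threads.

PHYSICAL_CORES = 8
MIN_GB_PER_CORE = 4  # guideline quoted in the post


def gb_per_core(total_gb, cores=PHYSICAL_CORES):
    """Return the memory available to each core, in GB."""
    return total_gb / cores


for total_gb in (32, 64):
    per_core = gb_per_core(total_gb)
    verdict = "meets" if per_core >= MIN_GB_PER_CORE else "falls below"
    print(f"{total_gb} GB -> {per_core:.1f} GB per core ({verdict} the guideline)")
```

With 32GB the build sits exactly on the guideline, so the 64GB option is what would give any headroom.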

SSD - Intel SSD 660p Series - 1TB (€129). Some of the smaller data sets that I work with will use less than 1TB, so having a fast SSD would be beneficial here. This particular SSD has good reviews, but its main fault is durability. It is covered by a 5-year warranty though, so is this OK, or would you advise something else?

HDD - 3 x Western Digital Red - 4TB in RAID 5 (3 x €130 = €390), or 3 x Seagate IronWolf - 4TB in RAID 5 (3 x €135 = €405), or 1 x Seagate IronWolf - 8TB (€250). Some of the larger raw observations from ALMA are above 1TB and can easily expand to fill the majority of these drives while they are being processed. The RAID options would increase write speeds, but at a cost. Do you know if it's possible to add an additional drive to an already configured RAID array without wiping the array first? This would make it easy to extend storage, if necessary. Do you have a preference between the WD and Seagate drives?
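To compare the three storage options on equal terms, here is a small sketch of usable capacity and cost per usable TB. It relies on the standard RAID 5 rule that one drive's worth of space goes to parity, uses the prices listed above, and ignores filesystem overhead. The helper name `raid5_usable_tb` is just for illustration.

```python
# Usable capacity and cost per usable TB for the storage options above.
# RAID 5 keeps one drive's worth of parity, so usable space is (n - 1) * size.


def raid5_usable_tb(num_drives, drive_tb):
    """Return the usable capacity of a RAID 5 array in TB."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (num_drives - 1) * drive_tb


# (description, usable TB, total price in EUR), prices as listed in the post
options = [
    ("3 x WD Red 4TB, RAID 5", raid5_usable_tb(3, 4), 3 * 130),
    ("3 x IronWolf 4TB, RAID 5", raid5_usable_tb(3, 4), 3 * 135),
    ("1 x IronWolf 8TB", 8, 250),
]

for name, usable_tb, price_eur in options:
    print(f"{name}: {usable_tb} TB usable, EUR {price_eur / usable_tb:.2f} per usable TB")
```

Both three-drive arrays end up with the same 8TB of usable space as the single 8TB drive, so the extra money buys redundancy and speed, not capacity. A fourth 4TB drive would bring a RAID 5 array to 12TB usable.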

PSU - Seasonic Focus Gold 550W (€80). Seasonic PSUs tend to be very solid. 550W should be plenty for this system, as there is no GPU. Would it be much better than the cheaper Corsair VS550 (€49)?

Case - Cooler Master N400 (€69). The only reason that I selected this case is for the 3.5" drive slots. The ability to hold 7 of them will allow for additional expansion, if required.

I've built my own personal desktop, so I have some experience, but it was geared towards gaming. Am I missing anything in this system, or do you have any other advice?

Edit: It seems like I still need to get a GPU, because the 2700 doesn't have one. Before putting one on my list, I'll have a look to see if there are any spare ones lying around.
 

kanewolf

Titan
Moderator
Yes, you will HAVE to have a GPU.
I wouldn't recommend RAID on a home/student system. The single larger drive will simplify things significantly and improve reliability. If you need to write checkpoint or other intermediate data, use your SSD.
If you are going to get 64GB, get it as a 4 DIMM set. Don't get two 2-DIMM sets and just assume they will work together.
 
Solution

Arzhur

Thanks for your replies.

Yes, you will HAVE to have a GPU.
I wouldn't recommend RAID on a home/student system. The single larger drive will simplify things significantly and improve reliability. If you need to write checkpoint or other intermediate data, use your SSD.
If you are going to get 64GB, get it as a 4 DIMM set. Don't get two 2-DIMM sets and just assume they will work together.

Does RAID not introduce additional redundancy in comparison to single drives? Or is it just too messy to set up normally? I've set an array up before, but that was on a server, so all of the hardware and software was already supplied. Is setting one up on a desktop much more difficult?

I'm working from an old recommendation for CASA from 2016, which suggests 25 to 50 MB/s of disk throughput per core. The single 8 TB drive seems to peak at about 230 MB/s, which would give ~29 MB/s per core for the 2700. The suggested CPUs listed there, the E5-1620v3, E5-2620v3 and E5-2640v3, seem to have slightly weaker single-core performance (I'm not sure if that's correct, because I can't find full comparisons), so I assume that I should be aiming closer to 50 MB/s per core.
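The per-core throughput arithmetic can be written out as a short sketch. It assumes the disk's peak sequential rate is shared evenly across the 8 physical cores, which is optimistic for a real workload. Both function names are made up for illustration.

```python
# Disk throughput per core, using the figures quoted in the post.
# Assumption: the peak sequential rate is split evenly across all cores.

CORES = 8  # physical cores on the Ryzen 7 2700


def mb_per_s_per_core(disk_mb_per_s, cores=CORES):
    """Throughput available to each core if the disk is shared evenly."""
    return disk_mb_per_s / cores


def required_disk_mb_per_s(target_per_core, cores=CORES):
    """Total disk throughput needed to hit a per-core target."""
    return target_per_core * cores


print(f"Single 8 TB drive: {mb_per_s_per_core(230):.2f} MB/s per core")
for target in (25, 50):
    print(f"{target} MB/s per core needs {required_disk_mb_per_s(target)} MB/s in total")
```

On these numbers the single drive clears the lower 25 MB/s bound, but reaching 50 MB/s per core on 8 cores would take about 400 MB/s of sustained throughput.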

Also, the 8 TB might not be enough, so I was considering getting a 4th drive (and possibly removing the SSD to keep the cost down). This could easily be added if I was using RAID, but would be messier if I was using an 8 TB drive.

Getting a 4 DIMM 64 GB set could be difficult, as komplett.ie doesn't sell any. They only go up to 32 GB. I could try contacting them to see if they can get 64 GB sets in.

Yes you will need a GPU for that build. If you want a case suggestion I'd recommend the Phanteks Enthoo Pro, it's €30 more and it has tons of drive storage: https://www.newegg.com/global/ie-en/Product/Product.aspx?Item=N82E16811854003&Description=phanteks enthoo pro&cm_re=phanteks_enthoo_pro--11-854-003--Product

Unfortunately I have to stick with komplett.ie and they don't have any Phanteks cases. I do like them though. I just got a Phanteks Enthoo Pro M Tempered for my personal desktop.
 

Arzhur

I contacted the supplier (komplett.ie) and they won't sell sets larger than 32GB. If I get two of them and they fail, returning them will be awkward, because I'll be ordering as a company, not a consumer. However, it turns out that Crucial is a registered supplier for my university (and they are cheaper). They sell several sets that work with the MSI B450 Tomahawk. In the 3000MHz category there are 15-16-16 (€479.69), 16-18-18 (€423.11) and 17-19-19-38 (€423.11) latency sets. I'm planning on getting the 16-18-18 set, as its CAS latency is even (apparently it'll be increased if it's odd) and it's one of the cheaper ones, unless there would actually be a visible difference.
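To get a feel for how far apart the three kits are, the CAS latency can be converted from clock cycles to nanoseconds. This is a minimal sketch using the usual DDR rule of thumb (the memory clock is half the transfer rate, so latency in ns is CL x 2000 / transfer rate). It assumes the "3000MHz" kits run at 3000 MT/s and only looks at the first timing. The function name `cas_latency_ns` is made up for illustration.

```python
# Convert CAS latency from clock cycles to nanoseconds for the three kits.
# Assumption: "3000MHz" means DDR4-3000, i.e. 3000 MT/s on a 1500 MHz clock.

DATA_RATE_MT_S = 3000


def cas_latency_ns(cas_cycles, data_rate_mt_s=DATA_RATE_MT_S):
    """First-word CAS latency in ns: cycles divided by the memory clock."""
    clock_mhz = data_rate_mt_s / 2  # DDR transfers twice per clock cycle
    return cas_cycles / clock_mhz * 1000


# Prices in EUR as listed in the post
kits = {15: 479.69, 16: 423.11, 17: 423.11}

for cas, price_eur in kits.items():
    print(f"CL{cas}: {cas_latency_ns(cas):.2f} ns (EUR {price_eur:.2f})")
```

Each step in CAS latency is worth roughly 0.67 ns at this speed, so the three sets differ only slightly on this one measure.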