Which Xeon CPU? HPC - FEM3D - Parallel Computing

fabio_86

Honorable
Oct 23, 2012
6
0
10,510
Hello,
I have to buy a new workstation to perform heavy 3D finite element analyses. Given my budget, I am wondering which of these options would be better; I could not find any good information online.

1- TWO Intel® Xeon® E5-2620 (six-core, 2 GHz, 15 MB cache, max mem 1333 MHz)

2- ONE Intel® Xeon® E5-2643 (quad-core, 3.3 GHz, 10 MB cache, max mem 1066 MHz)

3- i7 instead of Xeon??

The price of 1 and 2 is the same. The program that I am using can use multiprocessing.

The ideal solution would be TWO 2643 but they are too expensive.
I know there are also E3s, but I thought that dual-processor E5 systems might be the best solution.

I will also need 16GB of RAM and 1GB for the Video.

Do you think that there are other options (e.g. TWO 2690) that could still fit so that I could save some money and buy an SSD or a 10,000 rpm hard disk?

My main difficulty is to compare Processors with different max clock speed but different number of Processors.
Does the Intel® QPI Speed parameter (GT/s) give any overall idea of the global speed of a processor?

Thank you very much, I hope you can help me asap.
 

fabio_86



Do you mean 1 and 2?
Do you exclude 3 or other options?

Can you help me with this

"My main difficulty is to compare Processors with different max clock speed but different number of Processors.
Does the Intel® QPI Speed parameter (GT/s) give any overall idea of the global speed of a processor?"

?

Thanks
 


Yes, I was referring to your first two options. I don't think that QPI is usually very important except maybe in multi-socket and/or extremely high I/O systems.

An i7 would mean no ECC memory support and really wouldn't perform any better, although it might be cheaper.
 
If you can manage to get 2x 2690s somehow for the same price as you are considering for the other stuff, that would certainly be ideal, considering the 2690 is the best processor there is.

Even one of those would be better than 2x 2620s.

The 2x 2620s should thoroughly crush the 1x 2643.

An i7-3930k OCd to 3960X levels would fall about halfway between the 2690 and where 1x a 2620 would be.

Even an i7-3770k will be better than any of the Xeons at stock just based on architecture and it costs only a couple hundred bucks vs 1000+

I would get the highest one on this list that you think you can manage

Xeon 2690
I7-3930k
I7-3770k
Xeon 2620
Xeon 2643

Benchmarks aren't great, but they are the best way to compare processors. Here is one such listing with all the processors mentioned:

http://www.cpubenchmark.net/cpu_list.php

Sort it by processor rank to more easily see things how you want to see them.
 

fabio_86



Thank you,

2690 is way too expensive and it is difficult to find.

On the other hand, I agree that Processors such as i7-3770 are really good. But did you consider that in my case I am considering TWO Xeon2620, not only one?

In the benchmarks (http://www.cpubenchmark.net/cpu.php?cpu=Intel+Xeon+E5-2620+%40+2.00GHz&id=1214) I don't think they are taking this fact into account.

If so, an i7-3770 could help me save a lot of money and buy, for instance, an SSD, which could speed up my calculations when the code writes the solution to the hard disk (more than once during a run).

What do you think?

Best,

Fabio
 

fabio_86



I think that the main difference between E3 and E5 is that you cannot use two processors with E3, only with E5.
Therefore, considering only one processor, I agree that E3s are a great possibility because they are much cheaper (even if less "expandable").

Another disadvantage is that they have less cache: 8MB vs 15MB for the E5-2620 (though I cannot say how much it would affect my numerical calculations).

From a practical point of view, at the moment I will probably buy a DELL workstation because in my institution we can save a lot of money through them, and unfortunately I can't find any DELL with an E3 (I might be wrong).

I'm still doubtful.. :??:
 

InvalidError

Titan
Moderator

The two 2620s do indeed give you more "core-Hz" than any other single-socket option.

To match that with a 3930k/3960X, you would need to overclock it to 4GHz at which point those CPUs might outperform the 2620 pair since single-socket systems eliminate the need for costly snooping across CPUs.
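
The "core-Hz" comparison above is simple arithmetic, and a minimal sketch of it looks like this. The helper name `core_ghz` is made up for illustration, and the 3.2 GHz stock clock for the 3930K is an assumed nominal base frequency. The metric ignores turbo, per-clock differences between architectures, memory bandwidth, and any cross-socket overhead, so treat it as a rough first pass only.

```python
# Rough "core-GHz" metric: sockets x cores per socket x base clock.
# Ignores turbo, IPC differences, memory bandwidth and cross-socket overhead.

def core_ghz(sockets, cores_per_socket, base_ghz):
    return sockets * cores_per_socket * base_ghz

options = {
    "2x E5-2620": core_ghz(2, 6, 2.0),
    "1x E5-2643": core_ghz(1, 4, 3.3),
    # 3.2 GHz is an assumed nominal stock clock for the six-core i7-3930K
    "1x i7-3930K (stock)": core_ghz(1, 6, 3.2),
}

# Print the options from highest to lowest core-GHz
for name, value in sorted(options.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.1f} core-GHz")

# Clock a single six-core CPU would need to match the 2620 pair on this metric
required_ghz = options["2x E5-2620"] / 6
print(f"six-core clock needed to match the pair: {required_ghz:.1f} GHz")
```

On this metric the 2620 pair comes to 24 core-GHz, which is where the 4GHz figure for a six-core CPU comes from.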
 


I don't know of any application where the extra cache of the LGA 2011 CPUs makes a serious difference. There are some where the quad-channel memory controller can help, but IDK about the cache. I've noticed that even between little Celerons/Pentiums with only 1MB, 2MB, or 3MB of cache, their performance per core per MHz of the CPU frequency is pretty much the same as even the i7s with 8MB of cache. L3 cache capacity doesn't seem like a big deal with Intel's current CPUs. Heck, looking at AMD's new Vishera CPUs and Trinity, it becomes obvious that the 8MiB of L3 really didn't make much of a performance difference versus no L3 at all, granted AMD's L3 is much, much slower than Intel's (about four times higher latency measured in clock cycles) and might not be a good point of comparison.

For example, comparing the LGA 2011 CPUs with the LGA 1155 CPUs in gaming performance (games are notorious for eating up cache capacity and performance), cache capacity really doesn't seem to be a big deal between the 6MiB on the LGA 1155 i5s and the 15MiB on the i7-3960X because they still have similar gaming performance (and what advantage the i7-3960X does have can easily be attributed to its additional two cores and Hyper-Threading).

Point is that I don't think that cache capacity is really a big deal after around 6MB or 8MB, at least not for these CPUs. Maybe I'm wrong, but from what I've seen, this is correct.

As for the loss in expansion capability, I can see that being a valid concern, but when it's time to upgrade, would you rather sell the current system and replace it or buy a new vastly overpriced CPU (and motherboard if you need another board for it)? I'd just sell and replace it because that'd mean saving around $1000 (or even much more depending on what you actually buy) on the first system and not needing to throw that much money away again just to get a second CPU that won't even scale performance fully (two quad-core CPUs don't scale as well as an otherwise identical eight-core CPU; this is a known problem with multi-CPU systems).

I'm pretty sure that HP has a great computer that has your choice of E3 Xeon. I'll take a look for it.
 


Scaling isn't perfect in multi-CPU systems, so it's unlikely that a 3930K/3960X would need to be at 4GHz to match it. They can probably match it at stock, at least for the 3960X.
 

fabio_86



Are you saying that it is not sure that the TWO 2620s perform better than any other single-socket option?

I am concerned because, of course, not all processes can be parallelized, so I might not use all the cores fully during my analyses. Fewer threads at a higher clock speed may therefore be a "safer" solution, don't you think?

Thank you blazorthon and InvalidError for your precious help!!
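
The worry about work that cannot be parallelized is usually framed with Amdahl's law. Below is a minimal sketch, assuming the only things that matter are clock speed, core count, and the fraction of run time that parallelizes; the function names are illustrative, and real FEM codes also depend on memory bandwidth and solver details that this ignores.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def relative_throughput(parallel_fraction, cores, ghz):
    # Scale the Amdahl speedup by clock speed; assumes equal work per clock.
    return ghz * amdahl_speedup(parallel_fraction, cores)

for p in (0.50, 0.80, 0.95, 0.99):
    dual_2620 = relative_throughput(p, 12, 2.0)    # 2 x 6 cores at 2.0 GHz
    single_2643 = relative_throughput(p, 4, 3.3)   # 4 cores at 3.3 GHz
    winner = "2x E5-2620" if dual_2620 > single_2643 else "1x E5-2643"
    print(f"p={p:.2f}: 2x2620={dual_2620:.2f}  1x2643={single_2643:.2f}  -> {winner}")
```

Under these assumptions the higher-clocked quad-core wins when a large share of the run is serial, and the twelve slower cores pull ahead once roughly 85% or more of the run time is parallel.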
 


If you're doing work that's not highly threaded, then getting highly-threaded CPUs probably isn't going to help.

The scaling problem is because of the distance between the CPUs AFAIK. It gives them a fairly high latency connection and highly threaded tasks that can stress both CPUs and need to share data between them really don't work all that great with such connections (the bandwidth of the connections may also be an issue in at least some cases). Sure, you can expect fairly good scaling, but it isn't even near 100% last I checked. It's something like 50-75% IIRC.
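
To see what a 50-75% scaling figure would mean here, the sketch below treats it as the fraction of one CPU's throughput that the second socket adds. That interpretation, the helper name, and the 3.3 GHz six-core comparison point (an assumed nominal stock clock for a 3960X) are all assumptions for illustration, not measurements.

```python
def effective_core_ghz(cores, ghz, second_socket_gain=0.0):
    # second_socket_gain: fraction of one CPU's throughput added by a second CPU
    # (0.0 means single socket, 1.0 would be perfect dual-socket scaling)
    return cores * ghz * (1.0 + second_socket_gain)

for gain in (0.50, 0.75, 1.00):
    pair = effective_core_ghz(6, 2.0, gain)
    print(f"2x E5-2620 at {gain:.0%} second-socket scaling: {pair:.1f} core-GHz")

single = effective_core_ghz(6, 3.3)
print(f"six cores at 3.3 GHz, single socket: {single:.1f} core-GHz")
```

With these numbers a stock six-core at 3.3 GHz lands between the 50% and 75% cases for the 2620 pair, so which system comes out ahead depends on how well the workload actually scales across sockets.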
 

InvalidError


That depends entirely on the nature of the problem being solved and how much effort the programmers are willing to put into optimizing their code.

Finite Element Analysis (what OP wants to do) is what supercomputers with hundreds of thousands of cores (top500's #1 has 1.5 million cores) are built to do. It is one of few areas where performance scaling can be close to perfect.
 


I had considered that, but a good system with a 3930k in it will be about equal to 2x 2620s and it will cost a whole lot less, leaving room for SSDs etc and a good case, PSU, and processor heat sink.

Also, have you thought about just buying 2 whole 3770k PCs? With some of the figures we are talking about here you could get two PCs and possibly come out ahead in processing power for the same amount of $.
 

fabio_86



Actually there is http://premierconfigure.euro.dell.com/dellstore/config.aspx?cs=RC1084457&oc=WT16501 which goes up to the E3-1290V2:

4 cores - 8 threads
3.7 GHz clock speed

How do you think this compares with the other options?

I see that in this case I have 32GB of RAM max, but I was planning to go for 16 and I hope it will be enough.
8 MB of cache, but you said that is not so relevant.
Saving money will allow me to spend a little more on storage, probably going for an SSD, which should speed up my analyses in the writing phase.


(Yes, FEM codes can exploit parallel processing quite well, but it depends on the code and they are far from perfect... Unfortunately there is still a lot of work to do on this topic!)
 


alright then.



Like I said earlier, do remember that i7s mean that you can't use ECC memory. That might not be a feature that you'd mind not having for a computer doing such a task.
 

KenwoodGT

Honorable
Jul 31, 2012
90
0
10,660
Just throwing it out there - The Xeon E3-1230 V2 is a slightly lower clocked version of the i7-3770 without integrated graphics, supports ECC ram, and is a good deal cheaper. I can get one on Newegg for less than $250. Something to think about if you're trying to save money.
 
The reliability of non-ECC RAM is so high these days (assuming you don't receive flawed sticks, that can happen with ECC too) that ECC RAM has no practical advantage over non-ECC RAM.

It does, however, have a practical disadvantage. The parity check causes overhead not present in non-ECC RAM, so performance will be slightly worse across the board. Maybe not by a huge amount, but it's a quantifiable downside vs. no quantifiable upside.

Being that this is science/math work, the results will probably need to be verified anyway regardless whether the computing PC is using ECC RAM or not.
 

InvalidError


While RAM's error rate may be low, when computations take days/weeks to run, it is quite possible that a single-bit error once a week could be enough to prevent results from converging. Having to run the same computation a couple more times due to doubting the RAM costs a lot more than simply using ECC RAM and accepting whatever performance loss is associated with it.
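
To put the "once a week" figure in perspective, here is a minimal sketch that treats bit errors as a Poisson process. The rate of one error per week is the hypothetical number from the post above, not a measured value, and the function name is illustrative.

```python
import math

def prob_at_least_one_error(errors_per_week, run_days):
    """Poisson model: chance of one or more errors during a run of run_days days."""
    expected_errors = errors_per_week * run_days / 7.0
    return 1.0 - math.exp(-expected_errors)

# Hypothetical rate of one single-bit error per week, as in the post above
for days in (1, 7, 14):
    p = prob_at_least_one_error(1.0, days)
    print(f"{days:2d}-day run: P(at least one bit error) = {p:.0%}")
```

Under that assumed rate, a week-long run has roughly a 63% chance of seeing at least one error, so the real question is what the error rate actually is on a given machine.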
 
Sounds like a lot of assumptions that may or may not reflect reality.

I would rather see some results from people that tried running these programs on non-ECC and to hear about their real world experiences before I jump on the ECC train.
 
