Intel Xeon and AMD Opteron Battle Head to Head

Scenarios are the thing.
There are so many different scenarios for server use these days.
For render farms, these new Intel chips are great.
With a lot of the scientific programs learning to use SSE, it's another win for Intel.
When it comes to pure ALU/FPU work, the Opterons still do quite well. They really shine where a database fits into available memory.
We will soon see how the quad cores from Intel do.
The next thing is to get those chips certified for all the different programs those different scenarios patch together.
 
What's going to be interesting is the 8-socket Opteron line-up.
Most companies I currently deal with (all Fortune 500 companies) are looking seriously into server consolidation, so they are looking at 8+ sockets and using VMware or Solaris containers, etc., for virtual servers.
Come this time next year, with quad-core Opterons and 8-socket servers, you have an instant 32-way system.
Intel's FSB will just not scale to this kind of system. Heck, the Itanium 2 does not scale well in this scenario.
If you check out some true real-world server benchmarks at http://www.sap.com/benchmark/sd2tier.asp (look for the SAPS column for platform performance), you will see that Opteron in 4 sockets beats any other 4-socket server (Itanium, Xeon, SPARC and even POWER).
It is this medium to high end where the big bucks are, and Intel has nothing at the moment that will compete.
If I were looking at purchasing x86 64-bit servers in the next year, when 4 cores are available, I'd be looking at the following:

2 socket - Intel (4 to 8 cores)
4-8 socket - Opteron (8 to 32 cores)

After that you have to look at the proprietary systems (Itanium, POWER, SPARC). No idea when the Itanium will go 4-core.
You'd need at least a 24-socket server for Itanium to perform anything like an 8-socket quad-core Opteron.
 
I would have liked to see a test performed on a MySQL database and other server-specific apps.
Video and audio encoding are nice benchmarks, but not something the real world would use frequently on a server.
 
> If you check out some true real-world server benchmarks at http://www.sap.com/benchmark/sd2tier.asp (look for the SAPS column for platform performance), you will see that Opteron in 4 sockets beats any other 4-socket server (Itanium, Xeon, SPARC and even POWER).

Xeon-based HP DL580 G4: 2,127 users, 10,650 SAPS
Opteron-based HP DL585 G2: 1,978 users, 9,920 SAPS

The arrival of Tulsa now means Intel has regained the performance crown in key enterprise benchmarks at 4S, and unlike Opteron, scales well past 4S.
 
Meh, nothing in that article that we haven't already read about or aren't already aware of...

Where are the Socket F Opterons? They have (theoretically) been available for 2-3 months now, and they use lower-voltage memory...
I thought it curious too that they chose 2xx-series Opterons rather than the Socket 1207 22xx series... I would like to have seen how a 22xx Opteron with DDR2-800 ran against the Xeon, rather than the 2xx with DDR-400 used in the test... then maybe it would have been an interesting article... maybe...
 
They would have done no better with those applications; both architectures have already been benchmarked over and over and over.
What would have been interesting is seeing DDR2 Opterons vs. Xeons with server applications.
 
Well, bear in mind that the "peak" performance achieved by hand-tuned code won't apply to your daily work. The programs you use daily are generated by compilers, with a couple of critical subroutines hand-coded at most. Using code hand-tuned for one architecture on a different architecture might hamper the latter due to different pipeline lengths, etc.
 
Are you so sure? Do some simple calculation: if, say, the CPU can perform one instruction every clock cycle, then how many bytes does it need per second in, say, a 3.2 GHz CPU? What's the bus speed on a Core 2 Duo? Will that sustain two hungry CPUs? What will happen if you have 4?
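
For a rough sense of those numbers, here is a minimal back-of-envelope sketch of that question; the one-64-bit-operand-per-instruction figure and the 1066 MT/s, 8-byte-wide FSB are assumptions for illustration only, and the point is just that without caches the bus could never keep up:

```c
/* Back-of-envelope sketch only: per-instruction operand size and the
 * 1066 MT/s FSB figure are illustrative assumptions, not measurements. */
#include <stdio.h>

int main(void)
{
    double clock_hz        = 3.2e9;  /* hypothetical 3.2 GHz core            */
    double bytes_per_instr = 8.0;    /* assume one 64-bit operand per instr  */
    double cores           = 2.0;    /* two "hungry" CPUs sharing one bus    */

    /* Demand if every instruction had to fetch its operand from memory
     * (i.e. with no caches at all). */
    double demand = clock_hz * bytes_per_instr * cores;

    /* An 8-byte-wide front-side bus at 1066 MT/s moves about 8.5 GB/s. */
    double fsb_bw = 1066e6 * 8.0;

    printf("Demand: %.1f GB/s  FSB: %.1f GB/s  Ratio: %.1fx\n",
           demand / 1e9, fsb_bw / 1e9, demand / fsb_bw);
    return 0;
}
```

With these assumptions the two cores would want roughly 51 GB/s against about 8.5 GB/s of bus bandwidth; caches are what bridge that gap, and four cores only widen it.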
 
New technology beats old ones...
That's not necessarily true; remember the P3 and P4: at the same clock, the P3 is faster.
Well, P4 was the WORST processor EVER period.
 
Well the code does not necessarily need to be "hand-tuned".

Intel has had their hands smacked recently for writing "bugs" into their compiler so that it does not properly read the supported features of AMD processors and fails to use them.

There are standard CPU feature flags (reported by the CPUID instruction) that are supposed to be used to determine which features, such as SSE and SSE2, a CPU supports.
Intel has repeatedly been caught adding checks to its compilers so that these features only function on Intel CPUs and not on AMD CPUs. A few quick hacks to the compilers, so that the compiled program checks non-Intel CPUs as well, have shown strong improvements on AMD CPUs.
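
As a rough illustration of what a vendor-neutral check looks like, here is a minimal sketch that reads the CPUID feature flags through GCC/Clang's <cpuid.h> helper; the program itself is illustrative and not taken from any Intel or AMD tool:

```c
/* Minimal sketch: dispatch on CPUID feature flags rather than the vendor
 * string. Assumes x86 and a GCC/Clang toolchain providing <cpuid.h>. */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        printf("CPUID leaf 1 not supported\n");
        return 1;
    }

    /* Feature bits from CPUID leaf 1: vendor-neutral, valid on Intel and AMD. */
    int has_sse  = (edx >> 25) & 1;
    int has_sse2 = (edx >> 26) & 1;
    int has_sse3 = ecx & 1;

    printf("SSE: %d  SSE2: %d  SSE3: %d\n", has_sse, has_sse2, has_sse3);

    /* A dispatcher keyed on the vendor string (CPUID leaf 0: "GenuineIntel"
     * vs. "AuthenticAMD") can fall back to slow code paths on AMD even when
     * these bits say the features are present. */
    return 0;
}
```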

Intel has often "addressed" the fixes, but slipped another error in elsewhere in the logic so that it breaks there for non-Intel CPUs.

So the poster is not necessarily asking for AMD-tuned software, but for software written by somebody other than Intel, which is both tweaked for Intel and has code added to reduce the performance of AMD.

All of that being said, Intel has the faster and more efficient processor at this point in time. AMD may balance things in scalability.

Those who question the article are mostly concerned with examining the "system" as a whole, and scalability.
 
> What's going to be interesting is the 8-socket Opteron line-up. Intel's FSB will just not scale to this kind of system. Heck, the Itanium 2 does not scale well in this scenario.
>
> If I were looking at purchasing x86 64-bit servers in the next year, when 4 cores are available, I'd be looking at the following:
>
> 2 socket - Intel (4 to 8 cores)
> 4-8 socket - Opteron (8 to 32 cores)

Well, the 51xx series is only for 2-socket systems. Any Intel systems with more sockets still use the older NetBurst-architecture Xeon MP processors. And yes, we all know that NetBurst cannot compete.

So it is too early to tell whether Intel systems with more than 2 sockets are FSB-bound or CPU-architecture-bound - we will see when the "Tigerton" 4-socket Core-architecture CPU/platform debuts.
From anything I have seen, the memory controller and FSB architecture do not have any real-world impact in slowing down the Intel Core platforms. Yes, FB-DIMM introduces a lot of latency, but that is not because of the placement of the memory controller in the chipset.
 
> New technology beats old ones...
> That's not necessarily true; remember the P3 and P4: at the same clock, the P3 is faster.
> Well, P4 was the WORST processor EVER period.
I don't understand why people keep saying that the P4 was trash, or the worst processor ever, etc.
The P4 was much more competitive with the Athlon 64 than the Athlon 64 is now with Core 2.
And there have always been tasks like media encoding where it was very strong. Sure, the Athlon 64 was a bit better overall, but the P4 was a good CPU.
IMO people on these forums take positions on CPUs that are far too black and white, e.g.:
Athlon 64 ownz everything!!!
Core 2 rocks my socks lolz!!11!!
 
Where have you seen benchmarks of the new Xeons past 4S? I just think it would be FSB-bound, unlike the Opteron. If you could link the site showing how it scales past 4S, I would appreciate it.

Thanks
wes
 
I am sorry, but these benchmarks are incomplete. Without Quake 4 results, it's impossible to say which CPU is better for server operations. Quake 4 is second only to the most common server task, DivX encoding. Thank god that was included in the test.
 
I think the P4's bad name is because Intel clearly chose a bad design versus the P3.

If Intel had put the effort into maintaining the P3 that they put into the P4, they would have made cooler CPUs with more processing power than the matching P4s.

The P4 was simply an attempt to get high clock speeds without gaining performance.

At some point Intel realized their mistake but chose not to address it.
They could have been releasing Pentium Ms with much higher power budgets than the laptop versions, giving us desktop horsepower at much lower power than the P4s were using.

Eventually this caught up with them, and they started losing serious market share to AMD, which was making a better chip than what Intel was selling. Not better than what Intel could produce, but better than the one Intel was sticking to, for what were likely internal political reasons.

So when people say the P4 was a bad chip, I don't think they would have preferred an 8086 in the PC instead. Rather, they are saying that for years Intel kept shipping chips that were not the best it could do, and it knew it.

Of course I don't have any secret internal docs to support this; rather, it is my opinion based on what I do know.
 
You do have some points here.
But actually, I'm not really sure that a Pentium M for the desktop would have been better than the Pentium 4.
Too many people think that Core 2 is a Pentium M on steroids, but my opinion is, if Core 2 is not a new core architecture, then I really don't know what a new core architecture is.
The problem with the Pentium 4 is that it hit a wall on clock speed due to thermal density issues; otherwise, today we'd have a 7 GHz Prescott (and 10 GHz on the horizon), and such a beast would "own" pretty much everything.
The pipeline was designed for high clock frequencies, sure, but high clock frequency is one way to obtain high performance. Perhaps their concept was a bit extreme, but then again, for a long time the NetBurst architecture was successful and even outperformed the competition.
My belief is that the P4 was as good as it could get for Intel for a long while; then of course they saw that their car was fast but heading down a dead-end road... so they changed direction.
 
> Well the code does not necessarily need to be "hand-tuned".

I don't know about the other ones, but this Linpack benchmark sounds suspicious; I will leave it to the author to clear that up. When you hand-code it, it's easy to do all kinds of tricks, such as prefetching, software pipelining, etc., so that your data sits comfortably in Intel's relatively large second- (or even third-) level cache and can feed the CPU fast enough to get high FLOPS. Most current compilers are not there (yet).
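
As a rough illustration of the kind of hand-tuning being described, here is a minimal cache-blocking sketch for a matrix multiply; the block size, layout and function name are illustrative assumptions, and real Linpack-class kernels go much further (prefetching, software pipelining, SIMD):

```c
/* Minimal cache-blocking sketch: BLOCK and the row-major layout are
 * illustrative choices, not taken from any particular Linpack kernel. */
#include <stddef.h>

#define BLOCK 64  /* tune so three BLOCK x BLOCK tiles fit in the L2 cache */

/* C += A * B for n x n row-major matrices (C zero-initialized by the
 * caller), processed tile by tile so the working set stays cache-resident
 * instead of streaming over the FSB for every element. */
void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK)
                for (size_t i = ii; i < ii + BLOCK && i < n; i++)
                    for (size_t k = kk; k < kk + BLOCK && k < n; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < jj + BLOCK && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```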
 
Well, it's not even that honest.
It's not that the compiler is written to use the features of Intel chips.

The compiler is written to shut down features on AMD chips but use them if the CPU is Intel.

http://www.swallowtail.org/naughty-intel.html

Now, while this article does not discuss every compiler Intel produces, it does give you the distinct impression that Intel's software division is doing all it can to slow down AMD chips.
 
> I am sorry, but these benchmarks are incomplete. Without Quake 4 results, it's impossible to say which CPU is better for server operations. Quake 4 is second only to the most common server task, DivX encoding. Thank god that was included in the test.
Nice... the poor THG guy who wrote this article is probably crying right now :lol:
 
Are you serious? They ran synthetics and power consumption. Wow.

This article is a waste of bandwidth.

This is a server comparison. Show some database and web server metrics. This article tells us what we already know: Woodcrest has higher GFLOPS and AMD's HyperTransport slaughters the memory benchmarks. Whoopdie-doo. OMG, stop the presses :roll:

Too bad this doesn't mean much when you're talking 100+ active threads and mitigating heavy I/O.

When I get my next server, I want to see what's going to give me the most bang for my buck. This doesn't show me anything.

Worthless. :evil:
 
I found it a good article. Yes, it is not a server benchmark, nor does it purport to be one. It was a clear CPU test, using benchmarks that are CPU-bound, to compare CPUs as closely as possible on platforms as comparable as possible. It does not say anything about I/O, which is important in servers, nor graphics, which matter for workstations (remember, Opterons and Xeons are used in workstations big time 😀 ). It compares CPUs and platform power usage. That's all, and I think it does it well. Should you base your server buying decisions solely on this article? Of course not, but it is an interesting and well-done article nonetheless.
 
The more I read, the more entertaining it gets :)

> Under load, however, the Woodcrest uses less power; this can probably be attributed to the micro/macro-op fusion mechanism used in the Core 2 architecture, which significantly increases the efficiency of each core.
They don't realise that micro/macro-op fusion "improves core efficiency" and consequently keeps the calculation units busy, which effectively increases power consumption.

Jeez!
 
> The more I read, the more entertaining it gets :)
>
> Under load, however, the Woodcrest uses less power; this can probably be attributed to the micro/macro-op fusion mechanism used in the Core 2 architecture, which significantly increases the efficiency of each core.
> They don't realise that micro/macro-op fusion "improves core efficiency" and consequently keeps the calculation units busy, which effectively increases power consumption.
>
> Jeez!

Missed it during my first read. That is certainly absolute nonsense :roll:
While op fusion certainly can improve performance, it has nothing to do with power usage... 65 nm production and a lower clock have more to do with the lower power usage under load.
 
Surely, it's 65 nm vs. 90 nm... But words like micro/macro op fusion are much sexier than numbers like 65 and 90. They sound "nuclear". :)