With all the hoopla surrounding the current Intel Developer Forum, I would like to offer my opinions.
First off, I am very happy that Intel has changed its focus and done a complete 180-degree turn from marketing initiatives to technical initiatives.
1. My personal preference between companies
I personally own an AMD K7 system and an Intel NetBurst system. For my most recent system I chose the Intel Pentium 4 550 at 3.4GHz, mainly because I got a very good deal on it. I am not unhappy with its performance, since I like to multitask and run a lot of compression/decompression work. You really notice the difference between an AMD Venice-based processor and an Intel Prescott processor when you're pumping a 1.6GB DivX movie through the CPU.
However, I also do a lot of gaming, and there the performance is definitely worse than it could be.
As a personal preference I will always prefer Intel over AMD so long as they continue to promote and advance new technology as they have been. I gained a lot of respect for AMD when they designed the K8 architecture. I lost most of that respect when they flatly said they had no plans to upgrade their technology because it was already superior to Intel's. They should have considered improving their SSEn floating-point units to make more cost-effective workstations, and investing in 65nm process technology, which would improve the thermal envelope of their processors, improve overclocking headroom, allow larger caches, and raise manufacturing yields to lower costs.
However, from a business standpoint I understand why AMD is taking its time. It's the same reason Intel stuck with its NetBurst technology for so long (which started out strong, but by the end Intel was more concerned with marketing spin than real performance in order to maintain market share). The market for personal computers slows down every year, and R&D and manufacturing costs need to be recouped before either company can afford to push new advancements.
I am glad that the microprocessor market is as competitive as it is, and while the lead will swing back and forth between Intel and AMD every few years, it's the competition that drives the technology.
One of the reasons I admire Intel is that, as a developer, I was able to order the 80x86 developer manuals online for free: five full textbook-sized volumes documenting the 80x86 instruction set that Intel pioneered and AMD adopted for compatibility. They were shipped to me free of charge, I did not even have to pay shipping, and they arrived less than a week later.
2. Why it's good to have a choice
The other reason I admire Intel is that, as a system designer, Intel provides extensive resources for designing systems (from low-end home PCs to office PCs, workstations, and servers), including tested memory lists, tested controller cards, and in-depth datasheets for everything. From a design and engineering standpoint, Intel is like the open source of hardware.
When I want to build, say, 25 office systems for a business, it is much easier warranty- and support-wise to pick an Intel processor and an Intel motherboard with an Intel chipset and southbridge, use Intel graphics and an Intel network card, and go through Intel directly for all support, than it is to buy all those semiconductor components from four to six different companies for an AMD system and hope they work together.
However, that's just the medium-business market segment. Although I personally buy Intel hardware for my own use, I gladly design systems with AMD processors for other people: friends, family, and customers. At this time, and for a while now, I have simply been able to build faster systems for less money using AMD Semprons (Socket 754) for entry-level systems and Athlon 64s for mid-range systems. On the high end it goes either way, depending on usage.
3. Integrated Memory Controller vs. Memory Controller Hub
One of the biggest differences between Intel and AMD from a design standpoint is the memory controller. As we all know, AMD's is integrated into the processor itself, while Intel's sits in the chipset on the motherboard.
Many people believe the integrated solution is completely superior from a performance standpoint, but this is not true.
There are two main types of system utilization:
The first is sporadic CPU-to-memory usage, which is what happens in any open-ended, user-driven application: Windows itself, Office and productivity software, and all games. The bandwidth of even DDR400 memory is rarely saturated in this type of use. What matters is having a large amount of memory and low latency.
Memory latency is not the only latency that matters for this kind of processing (memory is always accessed in bursts, or often a series of bursts, and the latency differences there are very minor). The others include the L1 and L2 cache latencies, plus the fetch, decode, execute, and retire latencies that make up the processor's pipeline (which, in a K8 processor, is less than half the length of the NetBurst pipeline).
In this usage pattern, hard drive access is rare, and most of the data is fed directly from I/O to video memory over the HyperTransport bus. So for this type of use, there is little left to be desired.
The other type of system utilization is sustained I/O while processing data. This is what happens when you do a lot of content creation such as audio and video editing (especially HD video), engineering and CAD, 3D animation, and so on. Systems with higher-bandwidth I/O setups, such as buffered SCSI RAIDs (especially with multiple controllers on PCIe or PCI-X interfaces), push extreme amounts of data from I/O to memory, from I/O to video memory, and from CPU to memory. With that much data flowing into system memory, memory bandwidth becomes critical. On a K8 system, since only the processor can access the memory, and the microarchitecture is limited to DDR400 without overclocking, this quickly becomes a bottleneck.
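To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The peak figures follow from transfers per second times a 64-bit channel width; the 1GB/s I/O stream and 4GB/s CPU read/write figures are made-up workload numbers purely for illustration, not measurements.

# Theoretical peak memory bandwidth vs. a hypothetical sustained-I/O workload.
def peak_bandwidth_gb_s(transfers_per_sec, bus_width_bits=64, channels=1):
    # peak = transfers/sec * bytes per transfer * channels (1 GB = 10^9 bytes)
    return transfers_per_sec * (bus_width_bits / 8) * channels / 1e9

ddr400_dual   = peak_bandwidth_gb_s(400e6, channels=2)   # ~6.4 GB/s
ddr2_800_dual = peak_bandwidth_gb_s(800e6, channels=2)   # ~12.8 GB/s

io_stream   = 1.0   # assumed GB/s streaming off a RAID array (illustrative)
cpu_traffic = 4.0   # assumed GB/s of CPU reads/writes (illustrative)
demand = io_stream + cpu_traffic

for name, peak in (("DDR400 dual channel", ddr400_dual),
                   ("DDR2-800 dual channel", ddr2_800_dual)):
    print(f"{name}: {peak:.1f} GB/s peak, workload uses {demand / peak:.0%}")

With these assumed numbers, the same workload eats most of dual-channel DDR400's headroom while leaving DDR2-800 plenty of room, which is the point of the paragraph above.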
Intel chipsets use DMA (Direct Memory Access), which lets I/O devices transfer data straight into memory through the chipset without bogging down the processor's bus interface.
Even though these systems will still be less responsive than an AMD counterpart because of the Memory Controller Hub's extra latency, the MCH is a clear advantage in high-bandwidth setups.
I have not read anything that would lead me to believe Intel is planning to implement an integrated memory controller, and I am glad, because in my opinion it is a bad idea. My whole point with this in-depth description is that the integrated memory controller is not the prime constituent of the AMD advantage; it is the other aspects of the processor core microarchitecture. I believe that if we compared two versions of a K8 processor, one with an integrated controller and one with a separate chipset controller, the decrease in performance would be minor in some areas while other areas would actually gain performance.
However, the fact that AMD has to come up with a whole new socket and a new lineup of processors in order to change memory support is a big drawback of the design, and it forces unnecessary upgrades. I feel sorry for all the people who bought good Socket 754 AMD processors and were quickly abandoned on all fronts, with no upgrade path, because AMD had to change its entire socket and processor design for dual-channel memory support. All you need on an Intel system is a new board. For people who invest $500 or more in a top-of-the-line chip, this is a big deal.
Once DDR2-800 comes out, what next? Quad-channel DDR2? Rambus XDR or something new from them? Even DDR3? If AMD adopts new memory too soon, they screw over any customers who recently bought a high-end chip hoping for a good upgrade path. If they take too long to adopt it, they give up a bandwidth advantage.
4. DDR vs. DDR2
There are design advantages to DDR2 from an engineering and manufacturing standpoint. But one thing I don't understand is why people always refer to DDR2 as being higher latency than DDR. It's the SAME LATENCY! CAS latency is expressed in clocks, not time. DDR2-800 runs at an effective 800 million cycles per second, while DDR400 runs at only 400 million, so each DDR2-800 cycle takes half as long as a DDR400 cycle. The effective fetch time for a single value is therefore the same, and DDR2 actually comes out ahead in both latency and bandwidth when data is requested in bursts.
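As a quick sketch of that clocks-versus-time point, here is the arithmetic in Python, using the same effective cycle times as the timelines below (2.5ns per DDR400 cycle, 1.25ns per DDR2-800 cycle):

# CAS latency converted from clocks to nanoseconds, using the effective
# cycle times assumed by the timelines below (not JEDEC command clocks).
ddr400_cycle_ns   = 1e9 / 400e6   # 2.5 ns per DDR400 cycle
ddr2_800_cycle_ns = 1e9 / 800e6   # 1.25 ns per DDR2-800 cycle

print("DDR400  CL2:", 2 * ddr400_cycle_ns, "ns")     # 5.0 ns
print("DDR2-800 CL4:", 4 * ddr2_800_cycle_ns, "ns")  # 5.0 ns -- same absolute wait

Twice the clocks at twice the speed is a wash, which is exactly the point.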
CAS latency (CL) is the number of memory clocks it takes for requested data to appear on the data bus after a read command is issued. On a timeline, it basically works like this:
In DDR400 memory running at a CL of 2, requesting 4 QWORD (64-bit) values looks like this...
Clock #1 - CPU requests DATA1: a READ request is placed on the control bus, the address of DATA1 is placed on the address bus, and the data bus is ignored.
Clock #2 - CPU requests DATA2: READ is requested, address is given, data bus is ignored.
Clock #3 - CPU requests DATA3: READ is requested, address is given, data bus contains DATA1.
Clock #4 - CPU requests DATA4: READ is requested, address is given, data bus contains DATA2.
Clock #5 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA3.
Clock #6 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA4.
Therefore, the total number of memory clocks needed to retrieve those 4 QWORD values is 6, or n+CL, where n is the number of requests. At 2.5ns per clock, that works out to 15ns (0.000000015 seconds) in real time.
The same timeline with DDR2 memory running at 800MHz @ CL4:
Clock #1 - CPU requests DATA1: a READ request is placed on the control bus, the address of DATA1 is placed on the address bus, and the data bus is ignored.
Clock #2 - CPU requests DATA2: READ is requested, address is given, data bus is ignored.
Clock #3 - CPU requests DATA3: READ is requested, address is given, data bus is ignored.
Clock #4 - CPU requests DATA4: READ is requested, address is given, data bus is ignored.
Clock #5 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA1.
Clock #6 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA2.
Clock #7 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA3.
Clock #8 - CPU stops requesting: no command is put on the control bus, no address is put on the address bus, data bus contains DATA4.
Therefore, receiving 4 QWORDs from memory takes 8 memory clock cycles, so the n+CL equation still holds. However, because the memory bus is running twice as fast, the real time to receive the data is only 10ns. This makes sense: 8 DDR2-800 clocks equal 4 DDR400 clocks in real time, which is why the 6 DDR400 clocks take 50% longer.
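The whole n+CL timeline is small enough to capture in a few lines of Python, again using the simplified model above (one request or transfer per effective clock); it reproduces the 15ns and 10ns totals:

# Total time to fetch n QWORDs under the simplified n + CL timeline above.
def burst_time_ns(n_requests, cas_latency, clocks_per_sec):
    total_clocks = n_requests + cas_latency   # n + CL
    return total_clocks / clocks_per_sec * 1e9

print("DDR400  CL2, 4 QWORDs:", burst_time_ns(4, 2, 400e6), "ns")   # 15.0 ns
print("DDR2-800 CL4, 4 QWORDs:", burst_time_ns(4, 4, 800e6), "ns")  # 10.0 ns

The longer the burst, the more the doubled clock rate pays off, since the fixed CL penalty is amortized over more transfers.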
Of course, this comparison does not factor in additional latencies such as the MCH and caching latencies, but those are separate from a strict DDR-to-DDR2 comparison. There is and was nothing wrong with DDR2 in terms of real latency; it is the NetBurst architecture that limited Intel's performance in games, which is why I believe Conroe will show a significant improvement in games versus the current AMD architecture. However, AMD should have its AM2 processors out by the time Conroe arrives, if not sooner, so it will be a more apples-to-apples comparison then.