AMD or Intel for the highest CPU-Memory Bandwidth?

joelbaby

Distinguished
May 18, 2007
15
0
18,510
Hi,

I am looking for the highest possible native bandwidth between CPU and Memory (latency not important).

I need to read only about 250 MB of data from Memory to CPU, but it needs to churn in and out very rapidly. The data itself does not change. I continuously loop through the whole 250 MB data set doing a string of mathematical computations.

What is a good CPU/Mobo/RAM combination to use for this task without overclocking?

From my investigations - a Core2Duo would suit 533MHz DDR-2. I heard that AMD currently supports 800MHz DDR2 - is this correct?
To me - this would indicate that AMD is the 'faster' choice for my limited scenario. I don't do gaming.

Should I seriously consider overclocking - could I double the bandwidth?

I realise that I could wait for an AMD chip with HyperTransport 3.0 to be released - but when will these actually be released? Will HT3.0 work with DDR2 chips or will it require DDR3?

Thanks,

Joel
 

Mondoman

Splendid
It would help if you supplied more details. For example, how do you know your app is memory bandwidth-limited? If it's extremely specialized, it may be worth looking into running it on a graphics processor, as they tend to be specialized for high-bandwidth operations.
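One rough way to get a baseline for that question is to time a plain bulk copy of a buffer the same size as the data set. The sketch below is only an illustration in Python: the function name `measure_copy_bandwidth` and the 250 MB default are assumptions, and a bulk copy is just a stand-in for the real read loop, so treat the number as a ballpark and not as a proper benchmark.

```python
import time


def measure_copy_bandwidth(size_bytes=250 * 10**6, passes=5):
    """Return the best observed bulk-copy throughput in MB/s (1 MB = 10**6 bytes)."""
    # Fill the buffer with non-zero bytes so its pages are really allocated.
    src = bytearray(b"\x01") * size_bytes
    best = 0.0
    for _ in range(passes):
        start = time.perf_counter()
        dst = bytes(src)  # one bulk copy: reads all of src, writes all of dst
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid dividing by zero
        # Count the bytes read plus the bytes written.
        best = max(best, 2 * len(dst) / elapsed / 1e6)
    return best
```

Calling `measure_copy_bandwidth()` with the defaults needs roughly 500 MB of free RAM. If the real application moves far fewer bytes per second than this figure, it is more likely limited by computation than by memory bandwidth.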
 

Track

Distinguished
Jul 4, 2006
1,520
0
19,790
Hi,

I am looking for the highest possible native bandwidth between CPU and Memory (latency not important).

Why?

From my investigations - a Core2Duo would suit 533MHz DDR-2. I heard that AMD currently supports 800MHz DDR2 - is this correct?
To me - this would indicate that AMD is the 'faster' choice for my limited scenario. I don't do gaming.
Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
And Core 2 Duo supports any memory, there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.
Should I seriously consider overclocking - could I double the bandwidth?

Should you consider overclocking? Of course. You don't have to buy DDR2-1265 memory; you can buy DDR2-1000 memory and overclock it to DDR2-1265.

I realise that I could wait for an AMD chip with HyperTransport 3.0 to be released - but when will these actually be released? Will HT3.0 work with DDR2 chips or will it require DDR3?

I wouldn't wait for HyperTransport 3.0 if I were you. AMD probably isn't going to release it any time soon, if at all.
 

Mondoman

Splendid
...
Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
Now this is just silly. :wink:

...
And Core 2 Duo supports any memory, there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.
Come now. Since the memory bus feeds into the FSB, the max useful memory bandwidth is pretty much limited to the FSB bandwidth, as the OP implied.

Should I seriously consider overclocking - could I double the bandwidth?
There's a good chance you could increase the bandwidth by 50% (to 400 MHz FSB). Newer C2Ds don't overclock as well as the originals, so YMMV.
 

JonathanDeane

Distinguished
Mar 28, 2006
1,469
0
19,310
lol this thread is just lame. I usually don't say things like this, but someone who knows enough to say "I have such an amount of data and it runs in a loop" should know enough about CPUs and bandwidth to know what's going to work best for them...


Edit: Just a thought: if the data set isn't changing, then why bother running it? Just save it to a disk and be done with it... unless you're just interested in a benchmark program, in which case AMD would be the winner.
 

quartzlock

Distinguished
Apr 16, 2006
18
0
18,510
Well, I assume you are programming the app yourself and you have enough understanding of the caching scheme of both AMD and Intel systems, so you should decide based on the raw data:

Intel's FSB goes to 1333 MHz @ 64 bits = 10.6 GB/s
AMD's IMC @ DDR2-800 dual channel = 12.8 GB/s

Of course these are theoretical limits, and keep in mind the much lower latency of the AMD system. With the Intel system it makes no sense to use modules faster than DDR2-667 because the system is bottlenecked by the FSB, but you can use faster modules and choose a lower latency still at 667 to gain some speed by keeping a 1:1 FSB/RAM ratio (seems to work).
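The two figures above come from the usual peak formula: transfers per second times bus width in bytes times number of channels. Here is a minimal sketch of that arithmetic, assuming a 64-bit bus in every case and 1 GB = 10^9 bytes; the helper name `peak_bandwidth_gbs` is made up for illustration, and the dual-channel DDR2-667 line is an assumed pairing for the FSB 1333 system.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, bus_bits=64, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_bits // 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


fsb_1333 = peak_bandwidth_gbs(1333)                   # Intel FSB, 64 bits wide
ddr2_800_dual = peak_bandwidth_gbs(800, channels=2)   # AMD IMC, dual channel
ddr2_667_dual = peak_bandwidth_gbs(667, channels=2)   # assumed 1:1 pairing for FSB 1333

# On an FSB platform all memory traffic also crosses the FSB,
# so the usable peak is whichever of the two links is slower.
intel_usable = min(fsb_1333, ddr2_667_dual)

print(f"FSB 1333:              {fsb_1333:.2f} GB/s")
print(f"DDR2-800 dual channel: {ddr2_800_dual:.2f} GB/s")
print(f"Intel usable peak:     {intel_usable:.2f} GB/s")
```

The `min()` line is the FSB bottleneck in code form: faster modules raise the memory-side figure but not the usable peak.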

With these facts you would say AMD will win: maybe not...
If your app is only reading the data and not modifying it in an iterative way, it becomes important whether you are doing this with SSE2 or not. AMD K8 doesn't load 128 bits the same way as C2D does, and if you are using SSE2 assembly, the whole gain of the AMD system could get lost (bottlenecked) by AMD's inferior SSE2 performance compared to C2D.

More specifics about the algorithm would shed some light on that last point.

Salud!!!!
 

quartzlock

Distinguished
Apr 16, 2006
18
0
18,510
It would help if you supplied more details. For example, how do you know your app is memory bandwidth-limited? If it's extremely specialized, it may be worth looking into running it on a graphics processor, as they tend to be specialized for high-bandwidth operations.

In fact the graphics engine could give many times better performance, depending on whether the algorithm he is running can be implemented on a GPU. Viva GPGPU!!

Edit: Just a thought: if the data set isn't changing, then why bother running it? Just save it to a disk and be done with it... unless you're just interested in a benchmark program, in which case AMD would be the winner.

Maybe the data (source) is not changing but the result does... I guess he is iterating à la FFT or something similar.
 

joelbaby

Distinguished
May 18, 2007
15
0
18,510
Hi,

thanks for the answers. As quartzlock says.. I am iterating through the same data with varying calculations to arrive at different results.

I am not the programmer, but if I was I would spend time trying to implement the GPGPU solution.

In answer to jonathan - Just because I know what I want to do.. doesn't mean I know about the ins-and-outs of current processors. Henry Ford said he could find the answer to any question - and he did this by surrounding himself with experts in different fields. That's one of the great advantages of this forum.

I have decided to go the Intel route. I have not overclocked before - but from the information I read on the different sections of this forum - with a decent cooling fan, and slightly faster rated memory... I can experiment with increasing the FSB speed.


Thanks,

Joel
 

ninjaquick

Distinguished
Jun 22, 2006
215
0
18,680
AMD hands down has the highest speed connection with the RAM. Thanks to the IMC, data has less distance to travel from the memory to the CPU. Intel may have an FSB of 1333, but it still can't compare to AMD's memory bandwidth.
 

turboflame

Distinguished
Aug 6, 2006
1,046
0
19,290
Hi,

I am looking for the highest possible native bandwidth between CPU and Memory (latency not important).

Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
And Core 2 Duo supports any memory, there is no limit.
If you want, you can buy DDR2-1265 memory and have 10,100 MB/s of bandwidth.

I wouldn't wait for HyperTransport 3.0 if I were you. AMD probably isn't going to release it any time soon, if at all.

*sigh* :roll:
 

Track

Distinguished
Jul 4, 2006
1,520
0
19,790
Everything you post is trash; we should nickname you the forum garbage man. :lol:

No, the garbage man is the person who takes out the trash, not the one who brings it in.
Then again, you're still here, so I must not be doing a very good job.. :lol:
 

kwalker

Distinguished
May 3, 2006
856
0
18,980
If you’re interested, there is a new chipset in town: the Intel G35/P35. Although it is not new to this forum or other sites, vendors have been slow to stock motherboards based on it.
http://www.asus.com/products.aspx?l1=3&l2=11&l3=534&l4=0&model=1646&modelmenu=1
I can’t do the reviews justice so I'll just link a site for now.
The bandwidth increases proportionally with the FSB, eliminating the walls and straps seen on its predecessors.
With DDR3 on the horizon, bandwidth bottlenecks will most definitely not be an issue.
http://www.xtremesystems.org/forums/showthread.php?t=144199
http://www.xtremesystems.org/forums/showthread.php?t=143133
 

Track

Distinguished
Jul 4, 2006
1,520
0
19,790
...
Doesn't matter what you do, buying an AMD CPU will give you horrible performance in any case.
Now this is just silly. :wink:


Why is it silly? You're not going to get Core 2 Duo performance with an Athlon 64 X2 no matter how fast your memory is.
 

quartzlock

Distinguished
Apr 16, 2006
18
0
18,510
I guess you should analyze the app deeply and find the real hotspot. For pure CPU performance, C2D is the answer at the moment, but if the hotspot is memory bandwidth you should go AMD without hesitation. Barcelona is around the corner anyway, which means you can invest in an AM2/AM2+ board in advance. The whole latency issue is really important here for getting system bandwidth closer to the theoretical limit.
 

shadowmaster625

Distinguished
Mar 27, 2007
352
0
18,780
What you really need to do is run your application on a C2D system and an X64/X2 system. If you can do that and post some times here, we can then tell you which system it runs better on.

If you're not concerned about cost, the P35 would most likely give the best results (7000~8000 MB/s in Sandra). But if you want a good cheap $50 mobo that is still going to perform admirably (6300 MB/s in Sandra), go with the AMD. AMD's design is much more cost effective.
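To put those Sandra numbers in terms of the OP's workload, here is a quick back-of-the-envelope sketch in Python of how long one pass over a 250 MB data set would take at each figure. It assumes the loop is purely bandwidth-bound and that the application actually reaches the Sandra figure, which a real program may not; `ms_per_pass` is just an illustrative helper name.

```python
def ms_per_pass(data_mb, bandwidth_mb_s):
    """Milliseconds needed to stream data_mb megabytes at bandwidth_mb_s MB/s."""
    return data_mb / bandwidth_mb_s * 1000.0


DATA_SET_MB = 250  # size of the OP's data set

for label, mb_s in [
    ("AMD, 6300 MB/s", 6300),
    ("P35, 7000 MB/s", 7000),
    ("P35, 8000 MB/s", 8000),
]:
    print(f"{label}: {ms_per_pass(DATA_SET_MB, mb_s):.1f} ms per pass")
```

That works out to roughly 40 ms per pass on the AMD board versus roughly 31 to 36 ms on the P35, so the gap only matters if the math done on each pass is cheap enough for memory to be the limit.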
 

Guest

Guest
Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.
 

flasher702

Distinguished
Jul 7, 2006
661
0
18,980
Keep in mind that latency will affect bandwidth. Since the CPU can't hold 250 MB of data, the read is actually broken up into pieces. In between each transaction you take a penalty for latency (even if it's a "sequential" read, I think, because RAM is a matrix, not a linear storage medium like a disk drive).

In very general terms the CPU will pull a few MB from the RAM, process it, possibly write some results back to the RAM, and then read the next few MB. If there is a write transaction, then you take a 2*latency penalty to bandwidth.
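A toy model of that effect, sketched in Python: if every chunk pays one fixed latency before its data streams at the peak rate, the effective bandwidth is chunk size divided by (latency + transfer time). This deliberately ignores hardware prefetching and overlapping requests, which hide much of the latency on real systems, and the chunk sizes, 60 ns latency and 12,800 MB/s peak below are made-up illustration values, not measurements.

```python
def effective_bandwidth_mb_s(chunk_bytes, latency_ns, peak_mb_s):
    """Effective MB/s when each chunk pays latency_ns before streaming at peak_mb_s."""
    transfer_s = chunk_bytes / (peak_mb_s * 1e6)   # time to stream the chunk itself
    total_s = latency_ns * 1e-9 + transfer_s       # add the fixed per-chunk latency
    return chunk_bytes / total_s / 1e6


# Hypothetical numbers: 60 ns latency, 12,800 MB/s peak.
for chunk in (64, 4096, 1024 * 1024):
    mb_s = effective_bandwidth_mb_s(chunk, latency_ns=60, peak_mb_s=12800)
    print(f"{chunk:>8} byte chunks: {mb_s:8.0f} MB/s effective")
```

In this model small transfers are dominated by latency while large ones approach the peak, which is the point about latency eating into bandwidth.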

AMD has the fastest peak theoretical memory bandwidth, which is separate from its HT bus, but it doesn't sound like the HT bus will affect your application. At stock, the AMD solution with some very nice DDR2-800 should provide the best memory performance. However, C2D systems have been known to clock up to 500 MHz FSB and run with DDR2-1000, which would put a stock AMD system to shame. With reliability and accuracy obviously being very important to you, though, this might not be the best thing to bank on.

Depending on how much pull your company has you should ask some OEMs to send you some evaluation units of different configurations and benchmark your application on them. If you could get Dell and HP to each send you 1 AMD and 1 Intel system then you could tell US the answer to your question and we would all appreciate it ;)

AMD isn't in any hurry to roll out DDR3 and has no plans to implement it "until OEMs are asking for it". They will let Intel plow the way as they did with DDR2. The first round of DDR3 products will be expensive, have no performance advantage, and use less electricity. Think of it as a repeat of DDR2. Also, I don't think HT3 will benefit your application. It's more for multi-socket and/or IO-intensive systems. So I wouldn't worry about it if I were you.

If you are sure that a particular AMD chip will provide enough crunching power and that memory bandwidth is the biggest determining factor of performance for your application AND you can't afford an OEM server-class rig, then I would get the AMD chip and slightly OC it if I were you. Outside of the latest (really expensive) Intel hardware and/or extreme OCing, it should be the fastest.
 

antikristuseke

Distinguished
Jun 15, 2006
66
0
18,630
Yes, that's why C2D runs faster than AMD, lol. C2Ds are better than any AMD right now, until AMD comes out with something new.

I was only referring to memory bandwidth, not overall processing power. Right now, for pure processing power, C2D has the upper hand.
 

Guest

Guest
Okay, then what would you do with memory bandwidth if your processor can't crunch the data?