76.8GB/s of memory bandwidth in 2004

Ncogneto · Oct 24, 2001

I never actually stated it would be available with the current Pentium 4 in 2004. You just assumed this. It will definitely be available to companies that want it, but no one yet knows who that will be. I certainly hope Intel reaches for some of this bandwidth, but there are never any guarantees.

Please re-read the thread. I never made any such assumption; I merely pointed out that such bandwidth would be pointless for the P4. You then took great pains to try to demonstrate how the P4 could actually use this bandwidth, which is what I took issue with.

I do believe I clearly stated that to utilize that amount of bandwidth Intel would need a completely different core design, but then I do believe Intel has plans for the P4 to ramp to at least 10 gig and well into 2004.

To properly use all 76.8GB/s of memory bandwidth would require a 64-bit FSB with an effective rate of 9.6GHz. The Pentium 4 has a 64-bit FSB that currently operates at 400MHz. If a divider were implemented instead of a multiplier, we could set the CPU's divider to 3 and have it running at 3.2GHz on a 9.6GHz FSB. Eventually the Pentium 4's core is expected to scale beyond 10GHz, so the divider may be unnecessary, depending on how long it takes to get there. This would unlock the full potential of 76.8GB/s of memory bandwidth.
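For anyone who wants to check the arithmetic in the quoted paragraph, here is a minimal sketch in Python. The function name `bus_bandwidth_gb_per_s` is made up for illustration, and it assumes 1GB = 10^9 bytes and one transfer per effective clock.

```python
def bus_bandwidth_gb_per_s(width_bits: int, effective_rate_mhz: int) -> float:
    """Peak bandwidth of a parallel bus in GB/s (assuming 1 GB = 10^9 bytes)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * effective_rate_mhz / 1000


# 64-bit FSB at an effective 400MHz, as quoted for the Pentium 4.
fsb_today = bus_bandwidth_gb_per_s(64, 400)

# 64-bit FSB at an effective 9.6GHz, the rate the post says would be needed.
fsb_needed = bus_bandwidth_gb_per_s(64, 9600)

# A hypothetical divider of 3 on a 9.6GHz FSB gives the core clock in MHz.
core_clock_mhz = 9600 // 3

print(fsb_today, fsb_needed, core_clock_mhz)
```

The gap between the two FSB figures is a factor of 24, which is the whole point of the disagreement above.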

Video editing?? Ha, I don't even own a camera!

Ncogneto · Oct 24, 2001

Now is an interesting time to bring up a solution to the problem (that is, assuming that amount of bandwidth would be available at that time). How about a 128-bit path with the memory controller on die at full CPU clock? (Hammer, anyone?) What a bizarre twist of fate that could be.

Video editing?? Ha, I don't even own a camera!

Raystonn · Oct 24, 2001

"... why didn't Rambus just announce what you posted?"

I believe they did.

This is what Rambus had to say:

Yellowstone operates at Octal Data Rates (ODR), transferring 8 bits of data per clock. ODR enables 3.2GHz data rates with a 400MHz clock and provides a scalable path to over 6.4GHz as bandwidth needs increase.

Note that they are indicating PC3200 (3.2GHz) on a 400MHz clock. This scales to PC4800 (4.8GHz) on a 600MHz clock, which is planned for 2004 according to their current roadmap.

A 64-bit PC4800 channel clearly has 38.4GB/s of bandwidth, with dual channels bringing 76.8GB/s.

Note that they were announcing 3.2 - 6.4GHz, not 3.2 - 6.4GB/s. There is a big difference there. One is 64-bit PC6400 RDRAM. The other is either 64-bit PC800, 32-bit PC1600, or 16-bit PC3200 RDRAM. These last examples are quite pathetic in comparison and are not what they announced.
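A quick sketch of the numbers in this post, in Python. The helper names are invented for illustration; the only inputs are the clock rates, the 8 transfers per clock for ODR, and the channel widths mentioned above, with 1GB taken as 10^9 bytes.

```python
ODR_TRANSFERS_PER_CLOCK = 8  # octal data rate, per the Rambus quote


def data_rate_mhz(clock_mhz: int, transfers_per_clock: int = ODR_TRANSFERS_PER_CLOCK) -> int:
    """Effective per-pin data rate in MHz for a given base clock."""
    return clock_mhz * transfers_per_clock


def channel_bandwidth_gb_per_s(width_bits: int, rate_mhz: int, channels: int = 1) -> float:
    """Peak bandwidth in GB/s for `channels` channels of `width_bits` each."""
    return channels * (width_bits // 8) * rate_mhz / 1000


pc3200_rate = data_rate_mhz(400)  # 400MHz clock
pc4800_rate = data_rate_mhz(600)  # 600MHz clock

single_channel = channel_bandwidth_gb_per_s(64, pc4800_rate)
dual_channel = channel_bandwidth_gb_per_s(64, pc4800_rate, channels=2)

# The three configurations that would only reach 6.4GB/s.
slow_configs = [(64, 800), (32, 1600), (16, 3200)]
slow_results = [channel_bandwidth_gb_per_s(w, r) for w, r in slow_configs]

print(pc3200_rate, pc4800_rate, single_channel, dual_channel, slow_results)
```

This only checks that the multiplication holds together. It says nothing about whether such parts will actually ship.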

-Raystonn

= The views stated herein are my personal views, and not necessarily the views of my employer. =

Guest · Oct 24, 2001

I don't understand why integration of the memory controller hasn't been done already. Not enough room perhaps, or maybe since up until now it hasn't been necessary due to the relatively poor memory performance from stuff you can actually buy.

I tend to like to take this thought one step further and imagine that someday the RAM will be on die too. That could sure make things simpler, not to mention faster.

I think 128-bit memory interfaces have been avoided due to routing considerations, besides the fact that this requires adding 64 more pins to your CPU package. It all sounds nice in theory, but I sure would hate to be the guy trying to route the traces to a CPU like that; there are already way too many pins tightly packed in there. Say goodbye to 4-layer PCBs and hello to the associated costs of increasing layer counts.

somerandomguy · Oct 24, 2001

I tend to like to take this thought one step further and imagine that someday the RAM will be on die too.

I agree, but I think that's much further off than even integrated graphics. Maybe in 20+ years when we all have quantum computers (quantum RAM?).

"Ignorance is bliss, but I tend to get screwed over."

Guest · Oct 24, 2001

I think it was Infineon I was reading about (others most likely have this as well) that was selling their fab services for embedded DRAM at pretty decent densities, like ~32MB or so on die with custom designs. I don't recall if it was run at full speed, but I just thought that it was an interesting approach, and perhaps an inkling of what the future holds. I imagine that the fact that it was DRAM would allow for excellent densities. I just can't imagine that it was running at full speed though, due to the huge power requirements of DRAM. Maybe if there were some major breakthroughs with regards to thermal management, then we would have something really interesting.

I could be way off in my predictions however.

Guest · Oct 24, 2001

What in the world is CRAM? I've never heard of it. Do you have any links?

munkey · Oct 24, 2001

Back to the latency issue for a bit. I’m just trying to make sure I’m clear on this. With SDRAM the latency increases linearly with the clock speed, effectively meaning that while the clock speed increases the latency within a given time frame does too. But, the latency per X number of clock cycles does not change.

Yet with RDRAM the amount of latency stays static. Meaning that as the clock speed is increased the latency stays the same so that you get less latency within a given timeframe, or a given amount of clock cycles.

<A HREF="http://www.textfiles.com/100/actung.hum" target="_new">Relaxen und watch das blinkenlights...</A>

mala · Oct 24, 2001

I read the first 5 pages and the last 2, so pardon me if I'm repeating something from the other posts.
All messages in this thread are based on the assumption that by 2004 we will have 76.8GB/s of bandwidth.

That is not what Rambus said, is it?
AFAIK four different technologies are said to be available by 2004.

double channel RDRAM
RDRAM ODR technology
150Mhz FSB
64-bit channels

Raystonn multiplies all the numbers together and gets 76.8GB/s.
Just because the technologies exist one by one doesn't mean that all of them will be used in the same system.

I'm thinking more like.

today: 2 channels x 16bit x 2x memory data burst x 4x bus x 100MHz fsb
tomorrow: 2 channels x 32bit x 2x memory data burst x 4x bus x 133MHz fsb (faster but more expensive)
after that: 2 channels x 64bit x 2x memory data burst x 4x bus x 133MHz fsb (even faster, very expensive)
2004: 2 channels x 16bit x 8x memory data burst x 4x bus x 100MHz fsb (a bit slower, very cheap)

Or something like that. That was just an example of combining the factors.
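To put numbers on the four combinations listed above, here is a rough Python sketch. It simply multiplies the factors exactly as they are written in the list; whether they really compose this way in hardware is the list's own assumption, and the function name is invented for the example.

```python
def combined_bandwidth_mb_per_s(channels: int, width_bits: int, memory_burst: int,
                                bus_multiplier: int, base_clock_mhz: int) -> int:
    """Multiply the listed factors and convert megabits/s to MB/s (10^6 bytes)."""
    megabits_per_s = channels * width_bits * memory_burst * bus_multiplier * base_clock_mhz
    return megabits_per_s // 8


# (channels, width in bits, memory burst, bus multiplier, base clock in MHz)
scenarios = {
    "today": (2, 16, 2, 4, 100),
    "tomorrow": (2, 32, 2, 4, 133),
    "after that": (2, 64, 2, 4, 133),
    "2004": (2, 16, 8, 4, 100),
}

for name, factors in scenarios.items():
    print(f"{name}: {combined_bandwidth_mb_per_s(*factors)} MB/s")
```

Under that multiplication, the "2004" line comes out below the "after that" line, which matches the "a bit slower, very cheap" remark.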

The maximum theoretical bandwidth of today's processors is about 30 bytes/cycle (I didn't count, but something like that). So, in theory you could starve the 76.8GB/s memory bandwidth described in post #1 with a 10GHz+ processor. But in practice you would have to write extremely stupid handwritten assembly code to make it happen. Most of the time you move a block of memory into the cache, work with it for a while (millions of clock cycles), and then flush it out to main memory before proceeding to the next block.

So; I think that any memory subsystem that can deliver a couple of bytes/clock would be sufficient.
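A small Python sketch of the bytes-per-cycle comparison, for what it is worth. The 30 bytes/cycle figure is the rough estimate from this post, the 10GHz clock is the speculative figure from the thread, and the function names are made up for illustration.

```python
def supply_bytes_per_cycle(memory_gb_per_s: float, core_ghz: float) -> float:
    """Bytes the memory system can deliver per CPU clock (GB/s divided by GHz)."""
    return memory_gb_per_s / core_ghz


def peak_demand_gb_per_s(bytes_per_cycle: float, core_ghz: float) -> float:
    """Bandwidth the core could consume if it moved bytes_per_cycle every clock."""
    return bytes_per_cycle * core_ghz


# 76.8GB/s feeding a hypothetical 10GHz core: roughly 7.68 bytes per clock.
supply = supply_bytes_per_cycle(76.8, 10)

# Worst-case demand of the same core at the estimated 30 bytes/cycle.
demand = peak_demand_gb_per_s(30, 10)

print(round(supply, 2), demand)
```

So the worst case does exceed 76.8GB/s on paper, while "a couple of bytes/clock" sits well under what that memory could supply.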

/Markus

LoveGuRu · Oct 24, 2001

lol, mala so many posts here with lots of BS and i finally got it after your post.
Thx a'lot.


*******
*K.I.S.S*
*(k)eep (I)t (S)imple (S)tupid*
*******

Guest · Oct 24, 2001

You don't even suggest any technical reasons why there is a problem with what has been said.

The figure given in this thread may not be quoted explicitly in the manufacturer's announcements, but it's speculation based on what was announced combined with what has been done many times in the past, with a supporting argument behind each assumption clearly spelled out. I for one have the opinion that the poster made fairly reasonable assumptions, and therefore a fairly believable argument.

Which part of what the poster assumed do you have a problem with? You don't think that RDRAM will allow a 64-bit interface? You don't think RDRAM can have increased clock rates? You don't believe the manufacturer's claims that an ODR clocking technique can be employed? Dual channels are already something done by the controller and have nothing to do with the memory technology per se; besides, that is already implemented.

I really don't have any problems with the fact that you may think the poster's figures are complete BS, or whatever. What I take issue with is the dig you take at the argument without attempting to make counterpoints.

You did make some good semi OT points in your post. However, either due to my own ignorance, or perhaps my superior intelligence, I take issue with some of them. See below.

I think your argument was: we have cache to use, or something, so we won't be needing big memory bandwidth. I can tell you that the only way to guarantee optimum performance is to have a memory subsystem that can provide the CPU with the bandwidth it requires to have data available for every execution cycle (you said 30 bytes/cycle, and that sounds reasonable, but I wouldn't know). By their very nature, the caching techniques in place today can't fit that bill, because at some point a RAM access will be required. When that happens your CPU waits.
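One way to put a rough number on "your CPU waits" is the usual average-memory-access-time estimate: hit time plus miss rate times miss penalty. The Python sketch below uses that textbook formula; the cycle counts and miss rates are invented purely for illustration and are not measurements of any real CPU.

```python
def average_access_cycles(hit_cycles: float, miss_rate: float, miss_penalty_cycles: float) -> float:
    """Average cycles per memory access: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical: 1-cycle cache hit, 200-cycle trip to RAM on a miss.
mostly_hits = average_access_cycles(1, 0.02, 200)
more_misses = average_access_cycles(1, 0.10, 200)

# Same miss rate, but RAM that answers in half the cycles.
faster_ram = average_access_cycles(1, 0.10, 100)

print(mostly_hits, more_misses, faster_ram)
```

Even a small miss rate dominates the average once the miss penalty is large, which is why both the cache and the memory behind it matter.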

I agree that absolute maximum bandwidth will very rarely be needed with good coding and sufficient caching. I also think that as CPUs continue to get faster, we have to have memories keeping up speedwise or cache size is going to get out of hand. Besides, if it was available, wouldn't the best case scenario of maximum bandwidth be best, without the need for no stinking cache?

Kevin

bum_jcrules · Oct 24, 2001

Computational RAM

<A HREF="http://www.ee.ualberta.ca/~elliott/cram/" target="_new">http://www.ee.ualberta.ca/~elliott/cram/</A>

<A HREF="http://citeseer.nj.nec.com/elliott92computational.html" target="_new">http://citeseer.nj.nec.com/elliott92computational.html</A>

Did you see a little naked man running around with $100? - The Golden Child :lol:

Raystonn · Oct 24, 2001

Back to the latency issue for a bit. I’m just trying to make sure I’m clear on this. With SDRAM the latency increases linearly with the clock speed, effectively meaning that while the clock speed increases the latency within a given time frame does too. But, the latency per X number of clock cycles does not change.

Yet with RDRAM the amount of latency stays static. Meaning that as the clock speed is increased the latency stays the same so that you get less latency within a given timeframe, or a given amount of clock cycles.

For SDRAM, the number of clocks of latency increases as you scale it up in speed. Increased speeds means each clock takes less time to complete, but there are so many more clocks of latency that real-time (measured in nanoseconds) latency actually still increases.

For RDRAM, the number of clocks of latency remains static as you scale it up in speed. Increased speeds means each clock takes less time to complete, so real-time (measured in nanoseconds) latency decreases.
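The relationship between clocks of latency and real-time latency can be sketched in Python as below. The cycle counts and clock rates are made-up placeholders chosen only to show the two trends described above; they are not actual timings for any SDRAM or RDRAM part.

```python
def latency_ns(latency_clocks: int, clock_mhz: int) -> float:
    """Real-time latency in nanoseconds for a given number of clocks."""
    return latency_clocks * 1000 / clock_mhz


# Case 1 (hypothetical): clocks of latency stay fixed while the clock speeds up.
fixed_slow = latency_ns(40, 400)
fixed_fast = latency_ns(40, 533)

# Case 2 (hypothetical): clocks of latency grow faster than the clock speeds up.
growing_slow = latency_ns(2, 100)
growing_fast = latency_ns(3, 133)

print(fixed_slow, round(fixed_fast, 1), growing_slow, round(growing_fast, 1))
```

In the first case the nanosecond figure drops as the clock rises; in the second it creeps up even though each clock is shorter.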

-Raystonn

= The views stated herein are my personal views, and not necessarily the views of my employer. =

LoveGuRu · Oct 25, 2001

why would that be the case with RDRAM, doesnt it use the same technology to increase speed?


*******
*K.I.S.S*
*(k)eep (I)t (S)imple (S)tupid*
*******

Raystonn · Oct 25, 2001

why would that be the case with RDRAM, doesnt it use the same technology to increase speed?

RDRAM uses completely different technologies than SDRAM for various functionality, such as signaling, etc. It was designed from the beginning to scale to extremely high speeds. SDRAM was designed from the beginning to deliver low latency at a certain level of bandwidth. That bandwidth has now been surpassed and, because of this, the technology behind it is now obsolete.

[opinion]Standards committees do not have the best track record for developing technologies that will scale well far into the future. Private companies are best at this, as it benefits them to make the most of their R&D dollars. Standards committees benefit by having to constantly revisit old standards to update them, in much the same way that any bureaucracy functions: the first goal of any bureaucracy is to increase its own size and ensure its own survival. After this they worry about that for which they were created.[/opinion]

-Raystonn

= The views stated herein are my personal views, and not necessarily the views of my employer. =

FatBurger · Oct 25, 2001

Nope, RDRAM and SDRAM are very different technologies.

I post so you don't have to!
9/11 - RIP

somerandomguy · Oct 25, 2001

Just looking at Rambus' latest roadmaps, it looks like 76.8GB/s is more likely by 2006. Just IMO.

"Ignorance is bliss, but I tend to get screwed over."

LoveGuRu · Oct 25, 2001

ok suppose i understood that, then what you're saying is there isn't actually any technology to compete with RDRAM?


*******
*K.I.S.S*
*(k)eep (I)t (S)imple (S)tupid*
*******

Raystonn · Oct 25, 2001

A couple of years down the road, nothing that is out today or has even been mentioned/announced so far will be able to compete with what is on Rambus's roadmap. There may be technology that no one has yet publicly announced. But until something else is announced, no, there is no competition.

-Raystonn

= The views stated herein are my personal views, and not necessarily the views of my employer. =

LoveGuRu · Oct 25, 2001

what about QDR or more advanced technology?


*******
*K.I.S.S*
*(k)eep (I)t (S)imple (S)tupid*
*******

BrotherJohn · Oct 25, 2001

Hi Raystonn

I have been following this thread with much interest. Unfortunately my knowledge of memories is not in-depth enough to understand it all. Could you post some links that would help me get up to speed?
Thanks

It'll be ready tomorrow

bum_jcrules · Oct 25, 2001

Good morning everyone!

What is this...day 3 or 4 of this discussion?

Anyway...

LGR(LoveGuRu)...

QDR has more connections. Each time you add more bandwidth you have to add more connections to the other chips. Parallel versus series. In series it goes in one side and out the other, one in and one out. Compare that to SDRAM/DDR/QDR/ODR SDRAM, where there are always at least two connections in and out.

Right now the total number of latencies in RDRAM is higher than SDRAM and DDR. SDRAM 2-2-2 aka CAS2/CL2 and DDR 2-2-2-2 CAS2/CL2. As you see there is one more latency added each time you scale up. So if we go to QDR there should be another one added.

Ray or FB, correct me if I'm wrong... or fix my inaccuracies.

BTW I forget how many latencies are in RDRAM... anyone have the # of steps it takes to complete one cycle?

He's 6'4", 6'9" with the afro - Fletch :lol:

bum_jcrules · Oct 25, 2001

I found one of the links for DDR II aka EDDR...or last that I knew it was going to be called. I don't know if this is still the next progression on the DDR roadmap. Next step might be 333MHz standard or maybe embedded SRAM like this link shows.

<A HREF="http://www.lostcircuits.com/memory/eddr/6.shtml" target="_new">http://www.lostcircuits.com/memory/eddr/6.shtml</A>

JEDEC just set up some guidelines for DDR II and DDR III. Don't ask me what those standards are... I have no idea what they are. Anyone out there know???

<A HREF="http://www.lostcircuits.com/memory/eddr/8.shtml" target="_new">http://www.lostcircuits.com/memory/eddr/8.shtml</A>

This is a few pages further in the article where it is called DDR II...

He's 6'4", 6'9" with the afro - Fletch :lol: Edited by Bum_JCRules on 10/25/01 10:18 AM.

FatBurger · Oct 25, 2001

Let me add to what you said by just saying that the crossing point (according to Raystonn, I don't know personally) is PC2100/PC1066. That's when the latencies are the same, and RDRAM starts to pull ahead. Which, coincidentally, is right now.

And yes, this is day 4 of this discussion. This thread exploded, I think it had like 50 posts before I even got here

I post so you don't have to!
9/11 - RIP

charliec2uk · Oct 25, 2001

How can you actually calculate latency? DDR 2100 has lower latency than PC800 RDRAM, okay, but RDRAM has far more bandwidth per pin: c. 100 Mbytes/s compared to around 33 Mbytes/s. So what is the latency of PC1066 compared to PC2700?

Charlie

Democracy Bernad, it must be stopped!
