Guest
Archived from groups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel
In comp.sys.ibm.pc.hardware.chips George Macdonald wrote:
> On Thu, 07 Apr 2005 00:01:31 GMT, Robert Redelmeier wrote:
>>Frankly, I'm a little surprised no-one runs any latency
>>benchmarks on RAM. A little pointer-chasing exercise isn't
>>hard to write, and would be very revealing.
> Dave Wang has discussed it in some detail - one of his pet
> subjects I believe. IIRC he was measuring round-trip times
> the "hard" way with probes.
Well, the deed is done (code below). Perhaps not as sharp as
bus-snooping, but at least this gives program-visible read latency:
Latency   System
(ns)      CPU@MHz     mem.ctl       RAM
144       P3@1000     laptop        SO-PC133?
148       2*P3@860    Serverworks   ??
178       P4@1800     i850          RDRAM
184       K7@1667     SiS735        PC133
185       P3@600      440BX         PC100
217       2*Cel@500   440BX         PC90
234       P2@350      440BX         PC100?
288       P2@333      440BX         PC66
I do need to find & test some more modern systems, but I'm
underwhelmed by the slowness of latency improvement.
compile: $ gcc -O2 lat10m.c
run: $ time ./a.out [multiply user time in seconds by 100 to give ns per read]
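/* lat10m.c - Measure latency of 10 million fresh memory reads
(C) Copyright 2005 Robert Redelmeier - GPL v2.0 licence granted */
int p[ 1<<21 ] ; /* 2M ints, 8 MB with 4-byte int */
int main (void) {
int i, j ;
/* build the chain: each entry points 5000 slots back, wrapped by the mask */
for ( i=0 ; i < 1<<21 ; i++ ) p[i] = 0x1FFFFF & (i-5000) ;
/* chase it: each load address depends on the result of the previous load */
for ( j=i=0 ; i < 9600000 ; i++ ) j = p[j] ;
return j ; } /* returning j keeps the optimizer from dropping the loop */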
-- Robert
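
For anyone who would rather not do the time-to-ns conversion by hand, here is a rough sketch of the same pointer chase split into helpers, with the chase loop timed on its own via clock() from <time.h>. This is not part of the original posting: the names build_chain, chase and ns_per_read are made up for illustration, and clock() resolution varies between systems, so treat the numbers as approximate.

```c
#include <assert.h>
#include <time.h>

/* Fill p[0..n-1] so that entry i points `stride` slots back, wrapped by
   the mask n-1. n must be a power of two, as in lat10m.c (1<<21). */
void build_chain(int *p, int n, int stride)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    for (int i = 0; i < n; i++)
        p[i] = (n - 1) & (i - stride);
}

/* Follow the chain for `reads` dependent loads, starting at index 0.
   The final index is returned so the loop cannot be optimized away. */
int chase(const int *p, long reads)
{
    int j = 0;
    for (long i = 0; i < reads; i++)
        j = p[j];
    return j;
}

/* Time only the chase (not the array setup) and return the average
   CPU time per read in nanoseconds. The final index goes to *sink. */
double ns_per_read(const int *p, long reads, int *sink)
{
    clock_t t0 = clock();
    *sink = chase(p, reads);
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)reads;
}
```

To mirror the original run, call build_chain(p, 1<<21, 5000) on a global int p[1<<21], then print the result of ns_per_read(p, 9600000, &sink). Timing the chase alone leaves the setup loop out of the figure, so the result may come out slightly different from the "user time times 100" estimate.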