DDR3 1600 MHz CL7 vs 2666 MHz CL12

Tom O Bedlam

Feb 13, 2015
Hello Tom's people,

Okay, I've gone down a real rabbit hole with latency versus frequency in RAM, and know the "it depends" answer and appreciate that... but

Could you guys please give me some advice or first-hand feedback on RAM at 1600 MHz CL7 (lower frequency but lower latency) versus RAM at 2666 MHz CL12 (higher frequency but higher latency)? The latter is significantly more expensive. Does the extra speed outweigh the extra latency, and is it worth the price premium?

My CPU is a 4790K that likes to overclock at a lower voltage. My mobo is an ASUS Z97 Sabertooth Mark 2. My usage is mostly Photoshop with large files (2 GB and over) and some gaming (Dark Souls 3, Divinity: Original Sin 2, etc.).
 
Solution
Latency (CL) is measured in memory clock (half of data rate) cycles. So 1600 MHz CL7 means 7/800MHz = 8.75 ns latency. 2666 MHz CL12 means 12/1333MHz = 9 ns.

So the 2666 MHz RAM only has 0.25 ns (~3%) higher latency, while having ~67% higher bandwidth. Performance-wise, the 2666 MHz RAM is the obvious choice. It's not a matter of speed vs latency, it's just a matter of whether you want to spend the extra money.
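For anyone who wants to redo that arithmetic for other kits, here is a minimal Python sketch of the same calculation. The function name `cas_latency_ns` is made up for illustration, and it assumes "2666" is the nominal label for a 2666.67 MT/s data rate (1333.33 MHz memory clock).

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """First-word latency in ns: CL cycles at the memory clock (half the data rate)."""
    memory_clock_mhz = data_rate_mts / 2
    # cycles / MHz gives microseconds, so multiply by 1000 for nanoseconds
    return cl / memory_clock_mhz * 1000


slow = cas_latency_ns(1600, 7)       # DDR3-1600 CL7
fast = cas_latency_ns(8000 / 3, 12)  # DDR3-2666 CL12 (assumed 2666.67 MT/s)

print(f"DDR3-1600 CL7:  {slow:.2f} ns")
print(f"DDR3-2666 CL12: {fast:.2f} ns")
print(f"Latency penalty of the faster kit: {(fast / slow - 1) * 100:.1f}%")
```

Plugging in any other data rate and CL pair gives a like-for-like latency figure in nanoseconds, which is easier to compare than raw CL numbers.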
Hi iamacow, thank you for posting the link. I like that it goes into detail about 1440p (I'm on 1440p with a GTX 980), but it does not cover the Photoshop side of my question. If I am not doing video editing or 3D rendering, but placing emphasis on working quickly in Photoshop with large files, is frequency more important than latency? And following that, is it worth a premium in real-world results? Thank you for answering the gaming side of the question!
 
Well, as I said, the 2666 MHz CL12 RAM does have slightly worse (higher) latency. But it's a quarter of a nanosecond difference. One quarter of one *billionth* of a second. 9 is only 3% higher than 8.75, and that's negligible.

However, 2666 MHz means 67% higher bandwidth than 1600 MHz. That's significant.

For DDR memory, transfer rate is double memory clock. So 1600 MHz RAM means 800 MHz memory clock, 2666 MHz RAM means 1333 MHz memory clock.
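A rough sketch of the bandwidth side of that comparison, in Python. It assumes a single standard 64-bit (8-byte) memory channel and theoretical peak transfer rates; real-world throughput will be lower, and the helper name `peak_bandwidth_gbs` is just for illustration.

```python
BUS_WIDTH_BYTES = 8  # assumption: one standard 64-bit memory channel


def memory_clock_mhz(data_rate_mts: float) -> float:
    """DDR transfers twice per clock, so the memory clock is half the data rate."""
    return data_rate_mts / 2


def peak_bandwidth_gbs(data_rate_mts: float) -> float:
    """Theoretical peak bandwidth of one channel in GB/s (10^9 bytes per second)."""
    return data_rate_mts * 1e6 * BUS_WIDTH_BYTES / 1e9


slow_bw = peak_bandwidth_gbs(1600)
fast_bw = peak_bandwidth_gbs(8000 / 3)  # "2666" taken as 2666.67 MT/s

print(f"DDR3-1600: {memory_clock_mhz(1600):.0f} MHz clock, {slow_bw:.1f} GB/s per channel")
print(f"DDR3-2666: {memory_clock_mhz(8000 / 3):.0f} MHz clock, {fast_bw:.1f} GB/s per channel")
print(f"Bandwidth gain: {(fast_bw / slow_bw - 1) * 100:.0f}%")
```

The percentage gain is the same whether you run one channel or two, since both kits scale by the same channel count.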
 


Looking at Tcl in isolation is the routine amateur response to this question. However, DDR3 SDRAM is both bursty and pipelined.

A DDR3 burst is 8 words in length. The time from a read command to the first word being strobed onto the IO bus is indeed about 9 nanoseconds (8.75 ns for DDR3-1600 at Tcl 7, 9 ns for DDR3-2666 at Tcl 12). However, each subsequent word is strobed every half clock, so the time from the read command to the last word of the burst being strobed is Tcl + 7*(Tck/2), with Tcl expressed in nanoseconds.

Tck for DDR3-1600 is 1.25 nanoseconds

Tck for DDR3-2666 is 0.75 nanoseconds

Time for a random read on an open row on DDR3-1600 with Tcl 7 is 8.75 + 7*0.625 = 13.125 nanoseconds

Time for a random read on an open row on DDR3-2666 with Tcl 12 is 9 + 7*0.375 = 11.625 nanoseconds
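The two figures above can be reproduced with a short Python sketch of the Tcl + 7*(Tck/2) formula. The function name `burst_read_ns` is hypothetical, and the sketch assumes the row is already open and that "2666" means 2666.67 MT/s.

```python
def burst_read_ns(data_rate_mts: float, cl: int, burst_length: int = 8) -> float:
    """Time from the read command to the last word of the burst, in ns.

    Assumes the row is already open, so only CAS latency and the burst count.
    """
    tck_ns = 2000 / data_rate_mts  # clock period; the clock runs at half the data rate
    first_word_ns = cl * tck_ns    # Tcl expressed in nanoseconds
    # The remaining (burst_length - 1) words arrive one per half clock
    return first_word_ns + (burst_length - 1) * tck_ns / 2


print(f"DDR3-1600 CL7:  {burst_read_ns(1600, 7):.3f} ns")
print(f"DDR3-2666 CL12: {burst_read_ns(8000 / 3, 12):.3f} ns")
```

Measured to the end of the burst rather than the first word, the higher-clocked kit comes out ahead despite its higher CL.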

Furthermore, a read command can be issued to a selected DRAM rank every Tccd cycles. For DDR4 this gets a bit tricky due to DDR4's adoption of bank groups from GDDR5. However, for DDR3 Tccd is always 4 cycles. This means that a read command can be issued to any bank on the selected rank such that bursts follow one another tightly with no IO bus inactivity.
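To illustrate why a Tccd of 4 cycles keeps the bus busy, here is a small Python sketch. It is a simplified model, not a DRAM simulator: it assumes every read hits an already open row, ignores refresh and bank conflicts, and uses made-up function names.

```python
TCCD_CYCLES = 4      # read-to-read command spacing for DDR3, as stated above
BURST_LENGTH = 8     # words per burst
WORDS_PER_CLOCK = 2  # double data rate: one word per half clock


def idle_clocks_between_bursts() -> float:
    """Clocks of IO bus inactivity between two back-to-back bursts."""
    return TCCD_CYCLES - BURST_LENGTH / WORDS_PER_CLOCK


def streamed_read_ns(data_rate_mts: float, cl: int, n_bursts: int) -> float:
    """Time from the first read command to the last word of the last burst, in ns."""
    tck_ns = 2000 / data_rate_mts
    last_command_ns = (n_bursts - 1) * TCCD_CYCLES * tck_ns
    return last_command_ns + cl * tck_ns + (BURST_LENGTH - 1) * tck_ns / 2


print("Idle clocks between bursts:", idle_clocks_between_bursts())
for rate, cl in ((1600, 7), (8000 / 3, 12)):
    print(f"{rate:.0f} MT/s CL{cl}: 16 back-to-back bursts in {streamed_read_ns(rate, cl, 16):.3f} ns")
```

Because a burst occupies exactly 4 clocks and a new read can be issued every 4 clocks, the CAS latency is paid once up front and every burst after that costs only its transfer time, which is where the faster clock pulls ahead.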
 
Thank you to all who answered and to Pinhedd for the in-depth explanation. TJ Hooker, choosing yours as the solution as the fog started parting for me there. Still a little bit cross-eyed from it all, but I'm going with the 2666 MHz 32 GB kit, as it's likely to be the last significant update that I make that will be tied to the LGA 1150 platform. Again, thanks!
 


Most modern microprocessor cores do not have direct access to the memory; all memory accesses are performed through a cache architecture.

x86 microprocessors use 64-byte cache blocks. Each memory channel is 64 bits wide and a burst is 8 words deep. 64 bits per word x 8 words = 512 bits = 64 bytes. Thus, a single read command is sufficient to load a cache block from DDR3 or DDR4 memory.
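The same arithmetic as a quick Python check, using only the figures quoted above (64-bit channel, 8-word burst, 64-byte cache block). The variable names are just for illustration.

```python
CHANNEL_WIDTH_BITS = 64  # width of one memory channel, i.e. one word
BURST_LENGTH_WORDS = 8   # words transferred per read command
CACHE_BLOCK_BYTES = 64   # x86 cache block size

bytes_per_burst = CHANNEL_WIDTH_BITS * BURST_LENGTH_WORDS // 8
reads_per_cache_block = CACHE_BLOCK_BYTES // bytes_per_burst

print(f"One burst delivers {bytes_per_burst} bytes")
print(f"Read commands needed per cache block: {reads_per_cache_block}")
```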

Generally speaking, temporal and spatial locality suggest that the entire cache block should be loaded before any portion of it is sent back in response to a memory instruction. I'm not 100% sure that this is the case, but I believe that it is.