News Patriot developing high-speed DDR5 RAM that guarantees speeds up to DDR5-7200 regardless of CPU IMC quality — unlocks higher overclocking potential...

Very informative. I really appreciate the background info. I had seen RCDs mentioned before, but I thought it was a nice touch to relate these CKDs to them.

Also, thanks for mentioning the relation between voltage and timings. It's not surprising to me, but I didn't know, as I hadn't been tracking that.

I hope this sort of thing becomes virtually standard on all but the cheapest DDR5 DIMMs.
 
I'd love to see if this has any real-world impact on overclocking. I'm not aware of any high-frequency XMP/EXPO kits featuring a clock driver outside of RDIMMs, where it's required.

The timings referred to in the article are just JEDEC 6400 timings, so I could see where it could help with frequency while keeping voltage at 1.1 V, since no IMC natively supports 6400. If the clock driver is effectively required to maintain those clocks to meet JEDEC specs, I could see them becoming fairly common once AMD/Intel move up in base CPU memory spec. Of course, at the same time, most DDR4 high-frequency memory used a 2133 baseline instead of 3200, so who knows.
 
Rather than highest speed, memory needs to become smart and automatically do tasks like filling an area of memory with zeros or a constant, or basic bitwise operations like a constant XOR, a mask, vector addition, or other simple operations. Even carryless products or carryless additions would be useful and would save a lot of computing time.
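To sketch the kind of interface I have in mind (everything here is hypothetical; no such device or driver exists):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical "smart memory" command set -- a sketch of the idea only;
 * none of these names exist in any real hardware or driver. */
typedef enum { SM_FILL, SM_XOR_CONST, SM_AND_MASK, SM_ADD_VEC } sm_op_t;

/* Ask the DIMM to apply `op` with `operand` across [addr, addr + len).
 * Stub: a real version would enqueue a command to the memory device. */
static int sm_submit(volatile void *addr, size_t len, sm_op_t op, uint64_t operand)
{
    (void)addr; (void)len; (void)op; (void)operand;
    return 0;
}

/* Usage: zero a buffer, then XOR a constant into it, without the CPU
 * ever streaming the data through its caches. */
void example(volatile uint64_t *buf, size_t n)
{
    sm_submit(buf, n * sizeof *buf, SM_FILL, 0);            /* fill with zeros */
    sm_submit(buf, n * sizeof *buf, SM_XOR_CONST, 0xA5A5);  /* constant XOR */
}
```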
 

Zhiye Liu,

I have enjoyed this article as well as others you have written. Quite simply, I would like to know where this new Patriot memory fits into the "Best RAM for Gaming: DDR4, DDR5 Kits for 2024" article you wrote.
Don't short-circuit and burn the place down trying to figure it out. 😉
 
Rather than highest speed, memory needs to become smart and automatically do tasks like filling an area of memory with zeros or a constant, or basic bitwise operations like a constant XOR, a mask, vector addition, or other simple operations. Even carryless products or carryless additions would be useful and would save a lot of computing time.
I think that wouldn't help much. First, those aren't things CPUs spend much time doing. Zeroing memory is the only one worth taking seriously, but even that is really tweaking at the margins.

Second, the operations can't break cache-coherency, which means work for the CPU, anyhow. You'd have to flush or invalidate all of the cache lines for the address range + bring in the new data from memory, since it would presumably happen when you're about to use it. Together, you've now spent a large fraction of the cost of just having the CPU do it.

Third, if the operation would be truly expensive, then it's going to be asynchronous and that would introduce synchronization overhead.

To provide a significant benefit, you have to look at more complex operations, like convolutions. That's what Samsung and SK Hynix have been working on.
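To illustrate the second point, here's roughly the coherency work the CPU would still have to do over the whole range before a hypothetical in-memory operation could safely run (a sketch, assuming x86 and 64-byte lines; real protocols are more involved):

```c
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2, x86) */
#include <stddef.h>
#include <stdint.h>

/* Flush every cache line covering [p, p + len) so an offloaded
 * operation sees current data and the CPU doesn't keep stale copies. */
static void flush_range(const void *p, size_t len)
{
    const uint8_t *b = (const uint8_t *)p;
    for (size_t off = 0; off < len; off += 64)  /* 64 B cache lines */
        _mm_clflush(b + off);
    _mm_mfence();  /* order the flushes before signalling the device */
}
```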
 
I'd love to see if this has any real-world impact on overclocking. I'm not aware of any high-frequency XMP/EXPO kits featuring a clock driver
Did you see this part?

"It isn't a novel idea, nor is Patriot the first to do it. TeamGroup did the same thing with its mainstream Elite and Elite Plus lineups."

Are those kits not XMP/EXPO?
 
First, those aren't things CPUs spend much time doing.
... because they are too expensive.
Second, the operations can't break cache-coherency, which means work for the CPU, anyhow.
Not a problem, because the programmer would know before using it, and would know when to use it.

A use case is to process an array speculatively assuming that only 5 bits are required, but then finding that 6 bits are actually required. The past calculations then become useless, so it is better to assume 8 or 16 bits, even though that wastes space, leaves most bits unused, and is slower to process.
But a smart memory could be commanded to convert the past data to 6 bits (which involves carryless arithmetic), so the CPU could continue processing the rest of the array while the RAM reformats the already-processed part. There is no cache conflict, because the CPU is occupied with a different area of memory.
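Here's the reformatting step I'm describing, as plain C (on a smart DIMM this would run in the memory itself; the function name is made up):

```c
#include <stdint.h>
#include <stddef.h>

/* Widen an array of packed 5-bit values into packed 6-bit values. */
static void repack_5_to_6(const uint8_t *src, uint8_t *dst, size_t count)
{
    uint32_t in_buf = 0, out_buf = 0;
    unsigned in_bits = 0, out_bits = 0;
    size_t si = 0, di = 0;

    for (size_t i = 0; i < count; i++) {
        while (in_bits < 5) {             /* refill the input bit buffer */
            in_buf |= (uint32_t)src[si++] << in_bits;
            in_bits += 8;
        }
        uint32_t v = in_buf & 0x1F;       /* take one 5-bit value */
        in_buf >>= 5;
        in_bits -= 5;

        out_buf |= v << out_bits;         /* store it as a 6-bit field */
        out_bits += 6;
        while (out_bits >= 8) {           /* drain whole output bytes */
            dst[di++] = (uint8_t)out_buf;
            out_buf >>= 8;
            out_bits -= 8;
        }
    }
    if (out_bits > 0)
        dst[di] = (uint8_t)out_buf;       /* flush the remaining bits */
}
```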
 
How are there so many makers of CKD chips, when none of the performance-oriented DIMMs are using them? Seems contradictory, but maybe it's just the next big thing and enough chipmakers figured that out?
I thought it was just part of the spec in more of an "if you need it" capacity, but according to Renesas it's a requirement once you hit 6400. So I'd assume everyone making CKDs is also making RCDs, and they just got a jump on manufacturing since DDR5 base specs have been moving rapidly. Granite Rapids is 6400, which leads me to believe ARL will be as well, which means volume production will already be necessary.
 
... because they are too expensive.
No, I meant they're too cheap & infrequent to tie up a significant amount of CPU time. For something to make sense to embed into DRAM, it has to be a common operation (in some context) and data-intensive (relative to the amount of computation involved). That's how you achieve a worthwhile savings by keeping the data in memory instead of shipping it back & forth between memory and the CPU.

I provided you three links to efforts SK Hynix has made to tightly couple compute with memory. There are similar examples from Samsung. If you're interested in the subject, I'd recommend reading those linked articles as a starting point.

As far as I'm aware, these are the main efforts currently under way to do compute-in-memory that are likely to bear fruit.
 
No, I meant they're too cheap & infrequent to tie up a significant amount of CPU time.
Carryless arithmetic has applications in sorting algorithms, data compression, hash tables, and remapping arrays.
Those are basic tasks that computers do 24/7, and present hardware doesn't support them (not even the ALU).

You fail to tell the difference between "is not used because it is not supported" and "is not used because it is not needed".
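To be concrete, a carryless product is the ordinary shift-and-add multiply with the carries dropped, i.e. addition replaced by XOR. In plain C:

```c
#include <stdint.h>

/* Carryless (GF(2) polynomial) multiplication: the shift-and-add
 * multiply algorithm with XOR in place of addition, so no carries
 * propagate between bit positions. Illustration only. */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
    uint64_t acc = 0;
    for (unsigned i = 0; i < 32; i++) {
        if ((b >> i) & 1)
            acc ^= (uint64_t)a << i;   /* XOR instead of add */
    }
    return acc;
}
```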
 
Carryless arithmetic has applications in sorting algorithms, data compression, hash tables, and remapping arrays.
The type of arithmetic you're describing is cheap for CPU cores to do. Data movement and latency are expensive. Make sure you're solving the right problem.

As technology evolves, the costs & bottlenecks in the system shift around. What seemed like a good idea 30 years ago might no longer make sense. Conversely, some things which seemed trivial in the past are now becoming increasingly expensive.

Again, you don't have to take my word for it. Look at what the memory makers themselves are working on.
 
Carryless arithmetic has applications in sorting algorithms, data compression, hash tables, and remapping arrays.
Those are basic tasks that computers do 24/7, and present hardware doesn't support them (not even the ALU).

You fail to tell the difference between "is not used because it is not supported" and "is not used because it is not needed".
If you move a workload from the CPU to the memory modules, it has to be a task that makes sense to run asynchronously – if your CPU sits waiting for the workload to complete, you gain nothing. Doing async stuff like this has overhead and complexity, and your examples make no sense for our current hardware and operating system designs.

From regular application code, even memory copying / zeroing (which sounds simple to implement, and is something all applications deal with) doesn't make sense to offload. It probably doesn't even make sense for your OS's zero-deallocated-pages scrubber – that sounds like a good fit for the async use case, but the individual pages are *tiny* and the implementations are highly optimized.
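For scale, the entire job for one page is essentially the following, and libc/kernel implementations already reduce it to a handful of wide stores (4096 is the typical x86 page size):

```c
#include <string.h>

#define PAGE_SIZE 4096  /* typical x86 page */

/* Zero one freed page. ~4 KiB is far too little work to justify
 * offloading it to the memory module and synchronizing afterwards. */
static void zero_page(void *page)
{
    memset(page, 0, PAGE_SIZE);
}
```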

Go read the articles @bit_user linked to :)
 
From regular application code, even memory copying / zeroing (which sounds simple to implement, and is something all applications deal with) doesn't make sense to offload. It probably doesn't even make sense for your OS's zero-deallocated-pages scrubber – that sounds like a good fit for the async use case, but the individual pages are *tiny* and the implementations are highly optimized.
There's an interesting footnote to the memory-zeroing case, which is that Apple has apparently built in some special-case handling for optimizing the transfer of zero-blocks across its SoC interconnects. However, that's presumably not only for writing runs of zeros, but also reading them?

Another whole aspect we haven't even touched on is memory encryption. Compute-in-memory would seem to be incompatible with it, unless you build encryption hardware into the actual memory chips and send a key with these operations, which sounds expensive and could have implications for the security of the system (i.e. if someone can sniff those keys being sent off-chip, they should then be able to decrypt the data the CPU is writing out).

On the other hand, if you stack the memory dies with the processing elements, as Nvidia seems to be working on, then it should be almost as hard to sniff encryption keys between the compute & memory dies as it is to sniff them within existing compute dies.
 
I would like to know where this new Patriot memory fits into the "Best RAM for Gaming: DDR4, DDR5 Kits for 2024" article.
In time, yes. Right now, they only have an "engineering preview". I'm not sure how long it takes to go from that to a launched product available for reviewers & purchase, but expect to wait several months.
 