Question: Recovery period of SSD write cache

Mawla
May 21, 2021
I've seen reviews of several SSDs and have some idea of how the write cache affects performance. What I have not found out so far is how quickly the cache recovers in a typical DRAM-less SSD, particularly when copying from a very fast source.

To give a hypothetical example, suppose model X fills up its cache after a continuous write of 50GB and the speed drops precipitously thereafter. If the user arranges the write operation to be done manually in batches of 50GB or less, how long does the rest period in between write sessions have to be? Seconds, milliseconds, etc.?
 
The question is largely moot.
Ultimately the 50GB of data has to be written to the underlying NAND chips.
As data is folded out of the cache, some of the cache becomes available again, so it never truly empties all at once.
Writing several 50GB batches will ultimately be limited by the native NAND write performance.
It also depends on how large the SSD is and how full it is. If there are sufficient free NAND blocks, a write is a simple write.
If the SSD has data on all of its NAND blocks, the write becomes a longer read/rewrite process.
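
To put rough numbers on that, here's a minimal back-of-envelope sketch in Python. The cache size and both speeds are hypothetical, not figures from any particular drive:

```python
# Hypothetical figures -- substitute numbers from a review of your drive.
CACHE_GB = 50          # SLC cache size from the example above
SLC_SPEED_GBPS = 3.0   # burst (in-cache) write speed, GB/s
TLC_SPEED_GBPS = 0.5   # native (post-cache) write speed, GB/s

burst_time = CACHE_GB / SLC_SPEED_GBPS   # time to absorb one 50GB batch
drain_time = CACHE_GB / TLC_SPEED_GBPS   # minimum idle before the next
                                         # batch can run at full speed

print(f"burst: {burst_time:.0f}s, full recovery: {drain_time:.0f}s")
# With these numbers: ~17s to write the batch, ~100s to drain it fully.
# Over many batches, the average rate converges on the TLC rate.
```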
 
It's not the same for every drive, or even for every workload. Some drives recover very quickly and others more slowly, and modern drives can detect the workload and change modes accordingly. The reason is the need to balance performance against endurance: if the drive recovers slowly, you benefit from fast reads out of SLC but are less prepared for the next burst of writes; if it recovers quickly, you risk higher write amplification (WA). There's also both static and dynamic SLC cache, which recover differently. Samsung's TurboWrite is a hybrid design (increasingly common these days) with a simple scheme: the static portion fills first, the dynamic portion last, and recovery is FIFO, so the static portion empties first and quickly while the dynamic portion drains more slowly. The size of the cache, which for the dynamic portion depends on free space, can also affect recovery in relative terms (e.g. a smaller cache under the same workload).
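
As a toy illustration of that fill/recover order, here's a short Python sketch. The region sizes and mechanics are assumptions made up for the example, not Samsung's actual parameters:

```python
from collections import deque

# Toy model of a hybrid SLC cache (sizes are made up): the static region
# fills first, dynamic last; idle recovery is FIFO, so the static region
# empties first and quickly, the dynamic region more slowly.
class HybridCache:
    def __init__(self, static_gb=6, dynamic_gb=36):
        self.free = {"static": static_gb, "dynamic": dynamic_gb}
        self.fifo = deque()  # chunks awaiting folding, oldest first

    def write(self, gb):
        for region in ("static", "dynamic"):     # fill order
            take = min(gb, self.free[region])
            if take:
                self.free[region] -= take
                self.fifo.append([region, take])
                gb -= take
        return gb  # overflow that must go straight to TLC speed

    def idle_fold(self, gb):
        while gb > 0 and self.fifo:              # FIFO: static drains first
            region, size = self.fifo[0]
            folded = min(size, gb)
            self.free[region] += folded
            gb -= folded
            if folded == size:
                self.fifo.popleft()
            else:
                self.fifo[0][1] -= folded

cache = HybridCache()
print(cache.write(30))   # 0 -> all 30GB absorbed (6 static + 24 dynamic)
cache.idle_fold(10)      # static's 6GB recovers first, then 4GB of dynamic
print(cache.free)        # {'static': 6, 'dynamic': 16}
```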

It's always ideal to empty the cache while the drive is idle and expected to remain that way, so that folding doesn't interfere with incoming host I/O. The drive must also consider power and temperature: the former because power states have enter/exit latency and can impact efficiency, the latter because temperature ties into throttling, which can involve power states indirectly. Sometimes, however, the cache has to be moved during operation, which is where the scheduler comes into play; that, too, depends on the workload and firmware optimization. An example would be Phison's I/O+ DirectStorage firmware, which aims for high sustained performance (reads, specifically).

If you are referring to reviews, or at least the ones that try to measure this: the idle period can be varied and might be quite long without showing any effect, and it's difficult to control for across a wide range of systems. In general, drives with smaller caches but good sustained TLC performance will recover to that mode readily, since sequential writes have low WA and it's a good balance for performance, while drives with large caches (typically slower drives, e.g. QLC) will try to recover some of the cache immediately to hide poor native performance, but may keep some data in SLC for reads (QLC reads more slowly). Being able to write to SLC first with a large cache helps endurance if writes are deferred and not always committed, since these drives fold straight from SLC (no intermediate state), which adds little WA (sequential and slow writes).
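
If you want to probe your own drive, a crude Python harness along these lines will do. The path and sizes are placeholders; fsync() pushes data past the OS page cache, but for rigorous numbers you'd want unbuffered/direct I/O (e.g. fio):

```python
import os, time

PATH = "D:/scratch.bin"   # placeholder -- a scratch file on the drive under test
CHUNK = 1 << 20           # 1 MiB write buffer
buf = os.urandom(CHUNK)

def timed_burst(gib):
    """Write `gib` GiB sequentially and return the average speed in GiB/s."""
    start = time.perf_counter()
    with open(PATH, "wb") as f:
        for _ in range(gib * 1024):   # gib * 1024 chunks of 1 MiB
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())
    return gib / (time.perf_counter() - start)

print(f"burst 1: {timed_burst(50):.2f} GiB/s")   # should fill the SLC cache
for idle in (1, 5, 30, 120):                     # vary the rest period
    time.sleep(idle)
    print(f"after {idle}s idle: {timed_burst(10):.2f} GiB/s")
```

Each follow-up burst partially refills the cache, so treat the numbers as relative indications of recovery, not absolute drive speeds.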
 