BTW this is a nice example of what happens with read performance - even with TRIM:
You can easily spot static from dynamic data here; the remapped/fragmented portions have a dip while static data has a horizontal line.
Yes, but no random read performance increase; I did manage to get ~1250MB/s with 5 Intel X25-V 40GB drives in RAID0; that means 250MB/s per drive, close to 100% performance scaling with RAID0.
So it could be that RAID0 random I/O performance gains are only apparent on BSD/Linux systems, not on the Intel RAID0 drivers.
I would be interested in seeing such benchmarks, though; often the benchmarks are either not done properly or show low performance. It's a shame; Windows just doesn't like RAID.
*
My post in a different forum regarding the same but using two 80 GB Intel drives. Many commentators note that losing TRIM is not such a bad thing with the Intel controllers (and the same could be true for other SSD controllers). Basically, the newer controllers are much better at recycling sectors even in RAID and even without TRIM.
I would point to the erase block fragmentation issue on Intel SSDs, which is still under investigation by Intel.
* Lastly, there are some comments in this thread that are controversial. I'm not an expert and I'm not saying they're wrong, just that I believe there is disagreement among the experts on these remarks and you should be aware of it:
That is true, but really in the Windows world a lot of assumptions just aren't true. Or they have been true for reasons other than people think.
I should note that the claims I make are perfectly valid for ME, as I've tested my battery of 5 Intel X25-V 40GB SSDs in combination with FreeBSD+ZFS, and of course also done benchmarks on this array. I also have decent experience benchmarking RAIDs, and generally feel I have a deeper understanding of how performance works and which numbers are important and which are not.
I do agree some statements are controversial. But the controversy would be about FreeBSD RAID0 performance being irrelevant to Windows users who want RAID0. That is certainly true. Just know that RAID0 *CAN* scale properly in random IOps; but that at least some Windows drivers do not actually deliver this performance potential.
My potentially flawed understanding is that in THEORY, smaller stripe sizes are better. This is because striping is where you get your speed increase from RAID -- you WANT your data to be spread across both drives so that when it's read, both drives kick in. With a large stripe size, small files can end up being located on only one drive instead of both. So if you work a lot with 10MB files and have a 128k stripe size, you are going to get a significant performance boost from RAID. But if you work with many small files, say 10k, 20k, 100k files, then with a 128k stripe size you may get a very weak performance boost, because these files are smaller than your stripe size.
You think well; I like that.
Let me help you:
In your example, with 128k stripe and say 8k files, you say that when reading one file, it is only stored on one HDD and thus only one HDD will be accessed when reading this file. TRUE!
BUT - that also means the other three HDDs in the RAID0 array are entirely free and can handle other requests. For this to work, you need a higher queue depth than just 1; I'll talk about queue depth later. But assume a lot of requests are waiting; now look at what will happen:
HDD1: seek to file 1, read file 1
HDD2: seek to file 567, read file 567
HDD3: seek to file 890, read file 890
HDD4: seek to file 234, read file 234
Now that's one seek + one 8k read for each HDD. Assuming the seek time is 11,1ms (5400rpm), the transfer time to read the 8KiB is 8KiB / 100MB/s = 0,08ms.
Total time: 11,1 + 0,08 = ~11,2 ms ( = 357 IOps, calculated with: (1000 / 11,2) * 4 = IOps)
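The arithmetic can be double-checked with a tiny Python snippet. All numbers are the illustrative figures from this example (11,1ms seek, 100MB/s transfer), not measurements:

```python
# Illustrative figures from the example above (not measurements):
seek_ms = 11.1          # average access time of a 5400rpm HDD
mb_per_s = 100.0        # sequential transfer rate
read_kb = 8             # one 8KiB file per request
disks = 4               # 4-disk RAID0, large stripe

transfer_ms = read_kb / (mb_per_s * 1000) * 1000   # ~0.08 ms per 8KiB read
time_per_io = seek_ms + transfer_ms                # ~11.18 ms per request
iops = 1000 / time_per_io * disks                  # all four disks in parallel

print(f"{time_per_io:.2f} ms per I/O -> {iops:.0f} IOps")
```

Seek time dominates so completely that the transfer term barely registers in the total.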
Now look at how a single drive would do this:
HDD1: seek to file1, read file 1, seek to file567, read file 567, seek to file 890, read file 890, seek to file 234, read file 234
That's 4 seeks and 4x8k reads. Seek times will account for (4*11,1) 44,4ms and the transfer time will be (4*0,08ms) 0,32ms.
Total time: 44,4ms + 0,32ms = ~44,7 ms ( = 89,4 IOps; close to 25% that of the 4-disk RAID0 array)
Now look how a 4-disk RAID0 array with 2KiB stripesize would do this:
HDD1: seek to file1, read portion A of file 1, seek to file 567, read portion A of file 567, and the other two as well
HDD2: seek to file1, read portion B of file 1, seek to file 567, read portion B of file 567, etc
HDD3: seek to file1, read portion C of file 1, seek to file 567, read portion C of file 567, etc
HDD4: seek to file1, read portion D of file 1, seek to file 567, read portion D of file 567, etc
Each portion would be 2KiB, so 4 * 2 = 8KiB, which is the size of all files in our example. So when reading each file, all HDDs are used. However, this also means they all have to seek to each file. Each HDD handles 4 seeks and reads 2k four times. That means 44,4ms for seeks, and (4*0,02ms) = 0,08ms for transfers.
Total time: 44,4ms + 0,08ms = ~44,5ms ( = 89,9 IOps)
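All three worked examples can be recomputed in one short sketch, using the same illustrative numbers (11,1ms seek, 100MB/s transfer, four 8KiB requests):

```python
SEEK_MS = 11.1                    # per-seek cost (5400rpm HDD, illustrative)
MS_PER_KB = 1000 / (100 * 1000)   # 0.01 ms to transfer 1KB at 100MB/s
REQUESTS = 4                      # four 8KiB files to read in total

def slowest_disk_ms(seeks, kb_read):
    """Time for one disk to finish its share of the workload."""
    return seeks * SEEK_MS + kb_read * MS_PER_KB

scenarios = {
    # per-disk workload: (seeks, KB read)
    "RAID0, large stripe": slowest_disk_ms(1, 8),   # 1 whole file per disk
    "single disk":         slowest_disk_ms(4, 32),  # all 4 files on one disk
    "RAID0, 2KiB stripe":  slowest_disk_ms(4, 8),   # 4 seeks, 4x2KiB per disk
}
for name, ms in scenarios.items():
    print(f"{name:>20}: {ms:6.2f} ms -> {REQUESTS / ms * 1000:5.1f} IOps")
```

The small-stripe array pays the same four seeks per disk as the single drive, which is why its IOps ends up at single-disk level despite four spindles working.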
As you can see, the performance of RAID0 with small stripesize is very bad in this example. That is because you force all the disks to seek to the same file, they all seek to file 1, read a tiny portion of it, then all seek to the next file. That is not efficient as seek times are WAY higher than transfer time; the time to ACTUALLY read data is very short compared to the time it takes to seek to the file.
So the idea of having a large stripesize is that each I/O done (generally not larger than 128KiB) will be handled by one disk alone, leaving the other disks free to handle other requests.
If no other requests are queued, you get near single-disk performance. Whether multiple requests are available depends on what applications you use. When reading large files from your filesystem, the filesystem will use read-ahead and read 8 blocks at a time, supplying multiple disks with work to do.
The tricky part is called blocking I/O or synchronous I/O. If the pattern is random (non-predictable), only one disk in the RAID can do work with a larger stripesize. So things like booting, application launch and some applications do not scale with RAID0 at all, as they do unpredictable I/O where only 1 request is outstanding at a time. So only one disk at a time is at work, with the others idling because no other requests are available.
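That queue-depth effect can be put in a toy model (my own simplification, not a benchmark): with a large stripe, each outstanding request occupies exactly one disk, so effective parallelism is the smaller of queue depth and disk count:

```python
IO_MS = 11.1 + 0.08   # one random 8KiB read (seek + transfer), illustrative

def raid0_random_iops(disks, queue_depth):
    # With a large stripe each request keeps one disk busy; disks
    # beyond the number of outstanding requests simply idle.
    busy_disks = min(queue_depth, disks)
    return 1000 / IO_MS * busy_disks

for qd in (1, 2, 4, 8):
    print(f"4-disk RAID0 at QD={qd}: {raid0_random_iops(4, qd):.0f} IOps")
```

At queue depth 1 the array delivers single-disk IOps; only at QD >= 4 are all four disks busy. That is exactly why blocking workloads like booting and application launch often see no RAID0 gain.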
You can compare this with multicore CPUs. Some applications only use one core, some use multiple. In either case, having two applications at work will also allow 2 CPUs to work even though the applications can only use one CPU. The same is true for RAID with a high stripesize.
That's THEORY. Practical considerations though are different. People think that with SSDs, the controllers react differently to different stripe sizes, and for each controller there may be an optimal stripe size (or range of stripe sizes) that the controller will handle best. For example, people have suggested that with the Intel controller, a stripe size of 16kb - 64kb is better. You should research your controller.
It's certainly true that RAID0 offers theoretical - potential - performance, but the individual RAID implementation (the actual driver/firmware that implements RAID) determines how much of this potential is actually achieved.
In addition, there's a lot of debate over whether using smaller stripe sizes risks excessive wear on your drive. The argument here is: "yes, smaller stripes would theoretically improve performance, but they also mean a lot more write activity, and this is going to burn out your drives much faster."
I disagree with this argument. HDDs are particularly sensitive to heat variations (contraction/expansion of the metal) and to vibrations/physical shock. The extra seeking of your HDDs would not directly translate into a lower lifespan; most likely, the HDD will die of something else first.