Samsung 950 Pro SSD Review

Could you include a test for heat? The older SSD I have lasts only a few minutes at full speed before it overheats and slows down, so a speed test over time should be done. You mentioned using these to handle video, but you cannot render video with the 951; ultimately it overheats and slows down to less than USB 1.0 speeds.
 
Building a new machine. I'm fearing stability issues with this unit; everything is so new. Will there be any real-world difference between this and an 850 Evo?
 


Depends on what you do. If you're mostly gaming, it's best to just get the best deal between those two. But if you work with data sets, the 950 Pro would be worth it.
 


Use the money for a better GPU or CPU, more RAM, or a bigger SSD.
 
Very cool, I was/am excited for this. However, the real-world benchmarks say it all. Best case, 12 seconds saved on a 6-minute job is not much, and many of those tests were showing half a second's difference in performance. Nowhere near good enough for me. Still a nice step, but I'll wait for cheaper M.2 drives.

I noticed the same thing; in the demanding applications I use (AAA games) and in normal web browsing/email/photo collection, there is NO IMPROVEMENT over an 850 Pro. I built my new Z170 system with my old 64GB SSD as a system drive while waiting for the 950 Pro. After seeing the real-world numbers I ordered the 256GB 850 Pro for less than half the cost and, apparently, all the real-world speed.

The 950 crushes synthetic benchmarks, but the real-world numbers are what matters.
 
Except for the Adobe Photoshop Heavy Workload, the real-world numbers are no different from using an 850 Pro SSD with twice the capacity for half the cost.

UPDATE: Apparently, most real-world software isn't optimized for SSDs this fast. I understand that when a program knows it's going to call data from a disk, there is a built-in pause of eight to ten milliseconds to allow the data to show up. If a superfast SSD on a PCIe 3.0 path gives that data to the program in 2 milliseconds... the program holds the allotted time anyway.

Anyway, that's the best explanation I've seen for why SSDs on faster buses crush synthetic benchmarks and raw testing, but give almost no improvement in real applications.
 


I don't even think Photoshop would use the disk that much. You're better off with lots of RAM.

The (sort of) exception would be Lightroom, particularly if you have a large photo library. That will be very taxing on an SSD, and having a very low latency drive will be beneficial.

I say "sort of" because Lightroom is a database program, and I already said you'd benefit from the 950 Pro if using datasets. Lightroom is really just an example of a dataset application. It's just prettier than your normal example.
 
What, precisely, are the "Latency Tests" measuring? The description in the "advanced workloads" link is very imprecise, and doesn't specifically describe the latency measurement.

Is "overall latency" the sum of the latency measured on each request made to the drive? That seems like a good guess given the description, but since the "degradation" test gets five minutes longer each run, the total latency would increase at least proportionally, if that were true. What is the "Total Latency" chart precisely measuring?

The text doesn't describe the I/O model of the tests, either. Are all the tests performed with a single thread, meaning a queue depth never greater than one? Is any type of overlapped I/O being used?

Is the source code for this test available for analysis and inspection? Can a better description be provided? Is it possible to download the test so that the results can be verified and compared independently?


 


Getting at that exact answer would require an inspection of the code, but comparing results across sites shows that people are getting roughly 10x faster results on NVMe PCIe drives than on SATA and PCIe AHCI drives.

http://www.anandtech.com/show/9702/samsung-950-pro-ssd-review-256gb-512gb

All I can say is that the generally understood definition of latency is the time between an input and a response, which you may have already known. Perhaps a more efficient way of verifying whether something is measuring that correctly would be to compare results across different test suites from various sites. It's highly unlikely they would all use the same methodology. If variation is low, then it's likely that the test suites are all accurate, though it would technically remain possible that they are all similarly flawed (perhaps from using the same underlying technology or something).
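To make that definition concrete, here's a minimal Python sketch of it, just timing the gap between issuing one read request and getting the data back (the file name is a placeholder, and in practice you'd have to defeat the OS page cache, e.g. with direct I/O, to measure the drive rather than RAM):

    # Minimal sketch: "latency" as the wall-clock time between one request
    # and its response. Path, offsets, and sizes are placeholders; os.pread
    # is POSIX, so this assumes a Unix-like system.
    import os
    import time

    def read_latency_ms(path, offset, size=4096):
        """Time a single read at the given offset, in milliseconds."""
        fd = os.open(path, os.O_RDONLY)
        try:
            start = time.perf_counter()
            os.pread(fd, size, offset)                       # one request...
            return (time.perf_counter() - start) * 1000.0    # ...one response
        finally:
            os.close(fd)

    if __name__ == "__main__":
        samples = [read_latency_ms("testfile.bin", i * 4096) for i in range(100)]
        print(f"mean {sum(samples) / len(samples):.3f} ms, max {max(samples):.3f} ms")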
 
Yes, I already know that. Problem is, I can't apply that definition to anything I know about SSDs, and nothing about the description of the test as provided. What is it, specifically, that takes around 200 seconds (not milliseconds, or microseconds!) in these tests? The description of the test doesn't explain how its latency result is measured or computed, so anyone who reads the article is in the dark about what the numbers actually mean.



A path to the exact answer doesn't necessarily go through looking at the code. A detailed description of a test is something I expect to come along with the test results. The author of this piece, having not provided that, shows us that they're mindlessly running tests with no understanding of what the results mean. They're therefore necessarily incapable of providing any insight into what the results actually mean.

If this test is a commercial product, then certainly the supplier can answer these fundamental questions without showing us the code. Some of the questions I ask are also verifiable by observing the test as it runs. The queue depth on the device, for example, is reported by the operating system and that quantity can be monitored during the test run. If it's never greater than one, we know the test is doing blocking I/O on a single thread. Higher numbers tell us that there are multiple outstanding requests.
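For what it's worth, that kind of observation is only a few lines of scripting on Linux; the sketch below reads the in-flight I/O count from sysfs (the device name is an assumption, and on Windows the PhysicalDisk "Current Disk Queue Length" counter in Performance Monitor serves the same purpose):

    # Rough sketch, Linux only: /sys/block/<dev>/stat reports the number of
    # I/Os currently in flight (field 9), which approximates the device queue
    # depth at that instant. Device name, interval, and sample count are
    # assumptions.
    import time

    def sample_inflight(dev="nvme0n1", interval=0.5, samples=20):
        path = f"/sys/block/{dev}/stat"
        peak = 0
        for _ in range(samples):
            with open(path) as f:
                fields = f.read().split()
            inflight = int(fields[8])        # field 9: I/Os currently in progress
            peak = max(peak, inflight)
            print(f"in-flight: {inflight}")
            time.sleep(interval)
        print(f"peak observed queue depth: {peak}")

    if __name__ == "__main__":
        sample_inflight()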

 
Yeah, I referred to the code because we don't know how those reports about queue depth and such are calculated as they are displayed in real time. But that aside, I'm sure the Tom's people are capable of explaining the methodology, and I'm sure you know that too; otherwise, why ask for it? If I recall, though, they have published a detailed article in the past about how they test these things. It was pretty dense, and it was properly its own piece. Perhaps they should link to that piece at the beginning of every relevant article. In any case, you definitely have a point about readers either needing to be sophisticated or else needing to do independent research in order to appreciate the significance of these tests.

EDIT: Here's what they have on methodology. Hope it helps.

http://www.tomshardware.com/reviews/how-we-test-storage,4058.html

Also, not sure if these resources will help, but here they are just in case (I haven't read them thoroughly, so I can't vouch for them, but they seem potentially relevant):

http://www.thessdreview.com/featured/ssd-throughput-latency-iopsexplained/

http://www.snia.org/sites/default/education/tutorials/2010/spring/solid/LeviNorman_Latency_The_Heartbeat_SSD.pdf

http://www.tomsitpro.com/articles/enterprise-ssd-testing,2-863.html
 
You don't need the source code to measure the queue depth. You just need to monitor the drive's IO through a software tool. There are several.

The latency test is a calculation of all tests combined. The software is from Futuremark. It's the Advanced Storage Test 2.0, which is run from a command line. If you read our How We Test article, there is a link to the technical information.

http://www.futuremark.com/downloads/pcmark8-technical-guide.pdf
 


What's your plain-English sense of latency and other things from dealing with all these drives? I've read your reviews and the testing methodology articles, but I'm curious about your general feel from first-hand experience. For instance, were there any drives that made you giggle with nerdy excitement, as opposed to just reading that the result numbers were this or that? The information available is largely objective, and I appreciate that, but the subjective aspect helps paint a fuller picture, so anything you personally have to add in that regard would be appreciated. Thanks!
 
To be honest, the latency test (which measures and combines all latency from the one-hour test) is a very good measurement. It almost always falls in line with what I experience while setting up MobileMark (the notebook battery life test). The process of setting that test up takes around half an hour, which is a fair amount of time for hands-on experience given the deadlines we go up against. I run the MobileMark test before I even chart the benchmark results, but I have a good idea of how they will look from the time I spend with the drive as an OS boot drive.

There are some products, like the one I'm writing about right now, that make me wonder whether a good 7,200 RPM hard disk drive would be faster. In some cases the numbers are not too far apart.

I think everyone has been conditioned to look at throughput and the amazement of 500 MB/s compared to the 100 MB/s performance we had before SSDs. Latency is what we all feel as speed, though. We're not too far away from publishing some advanced latency results that show in more detail where the products fall latency-wise. The reporting charts are holding us back right now. I'm looking for a more effective way of showing what everyone actually feels or perceives as fast. I almost released it in the 950 Pro review but held back.

You can see some of the results here: https://www.facebook.com/photo.php?fbid=10206357590477126&set=gm.1708517652714776&type=3&theater
 


That's exactly the kind of thing I was curious to read. Thanks!

The biggest perceived improvement I remember feeling was replacing a 5,400 RPM HDD with a 500 GB 840 EVO in what was an otherwise fast laptop. Doing that probably had a psychological effect similar to winning on a slot machine the first time you gamble. That is, things may never be that good again, but the initial win leaves you hoping for another because you remember how good winning felt. With storage, though, it's relieving a bottleneck instead of winning on a slot machine.

My hope this next time around is for a perceived performance improvement in storage similar to the HDD-to-SSD jump. Experiencing that seems like it could become a reality when post-NAND memory merges with these new NVMe controllers running over PCIe lanes tied directly to the CPU. Those technologies coming together on a boot drive have the potential for a real game-changing experience, because of the low-queue-depth and latency improvements, similar to what upgrading from an HDD to an SSD does on current machines.
 
I think our next big jump needs to come from the software side. An Intel document circulated a few years ago that talked about how software accesses storage and how the IO responds. A game developer made a test version that calls for the data in a highly threaded fashion as a proof of concept. I don't remember all of the details because it was somewhere around 2008 or 2009. I do recall they used the optimized IO streams to increase both level-loading performance and the gaming experience (higher frame rates, less latency accessing data).
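In rough sketch form (this is not their code, which I've never seen; the file names and thread count here are placeholders), "calling for the data in a highly threaded fashion" amounts to something like this:

    # Hedged sketch of threaded asset loading: a pool of workers issues many
    # reads at once, keeping the drive's queue full instead of reading files
    # one after another. File names are placeholders.
    from concurrent.futures import ThreadPoolExecutor

    def load_asset(path):
        with open(path, "rb") as f:
            return path, f.read()

    def load_level(asset_paths, workers=16):
        # Many outstanding requests are where an NVMe SSD pulls ahead of a
        # serial, one-request-at-a-time loader.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return dict(pool.map(load_asset, asset_paths))

    if __name__ == "__main__":
        assets = [f"level1/asset_{i:03}.bin" for i in range(256)]  # placeholder names
        print(f"loaded {len(load_level(assets))} assets")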

I think we will get our first taste of a flash-optimized experience with VR products. So much of that work started over from square one, and it wouldn't make sense for them to build on legacy interfaces. The IMFT 3D XPoint information we know about kind of backs that up as well. Both Intel and Micron stated the tech would be released for gaming applications first. Given the volume of data that needs to carry X, Y, and Z information, it only makes sense to put it in a database. That type of data is highly compressible and runs very fast on flash.
 
If you want to measure the latency of I/O requests, sure; you can use any number of tools. But we don't know what the "latency" graph in the article is showing us, so we wouldn't know what to measure to try to replicate the results.

What does that mean, specifically? The total runtime of all the tests? That can't be true, since the description of the tests given in the article says the tests will run for far longer than the longest (slowest) running drive in the graph.


Or, at least, it isn't clear to me how it could be so. The article links to an advanced workloads description that describes the "Precondition", "Degradation", "Steady State", and "Recovery" phases of the tests.

The detail is painfully inexact to anyone who thinks deeply about testing, or about the scientific method. (Really, if there were any stronger case for open-source software, it would be in benchmarking. If source were available for this test, I could just go look. If I wanted to compare the reviewed system to my system at home, I could rebuild the software and do so. If I wanted to reproduce the results and verify them (because, say, I was worried about advertiser-dollar bias) I could readily do so.) But here are my questions about the phases of the test as described:

Precondition: That's it, really? No format, reset, or recondition tool? No TRIM? Just throw the drive in there, with whatever scars it already has, and test it?

Degradation: This test writes for 10 minutes, then runs a performance test. It then writes for 15 minutes, does a perf test, and so on ... until the write time is 45 minutes. The total run time is 10+15+20+25+30+35+40+45 == 220 minutes, plus however long eight runs of the performance test take. According to the graph in the article, the runs take something on the order of 200 seconds for the slowest drive, so I can't figure out how to make the numbers match. Am I reading the graph incorrectly? The test description incorrectly?

Steady State: There's some bad editing here; "Run writes of random size ... on random offsets for final duration achieved in degradation phase". Does that mean 45 minutes? Why not just say 45 minutes? What does "random offsets" mean, specifically? Any random spot on the drive? Or somehow aligned? That is, do these writes cause read-write-read sector, cluster, or block split operations?

Recovery: A question about all of the phases is appropriate to raise here: what is the "performance test"? The previous section identifies ten different performance test workloads; are they all exercised? Just one? Something different? I'm left to guess at what "performance test" means.

Note that none of these descriptions mention latency. The only place in that whole description where "latency" is mentioned is in this sentence: "We use the overall throughput from each combined test run and the overall latency." Which is meaningless: what is "overall latency"? Latency is the time between a request and its result, vaguely. Maybe it's time from the request to the request being processed; maybe it's to the first byte coming back. Maybe it's to the acknowledgement of the request. Maybe it's the time between the request being sent and the request being received (network latency, for example).
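If I had to guess at the reading that best matches the chart, it would be the sum of per-request service times over a test pass, something like the sketch below (my assumption, not anything the article states); on that reading, a 190-second "total latency" is not 190 seconds of wall time:

    # Sketch of one possible meaning of "overall latency": accumulate the
    # service time of every request issued during a pass. The request list is
    # a placeholder for whatever I/O the (undescribed) performance test does.
    import time

    def run_pass(requests):
        """requests: zero-argument callables, each performing one I/O."""
        total_latency = 0.0
        wall_start = time.perf_counter()
        for issue in requests:
            t0 = time.perf_counter()
            issue()                                   # one request, one response
            total_latency += time.perf_counter() - t0
        wall_time = time.perf_counter() - wall_start
        # For serial I/O, total_latency <= wall_time; any think time between
        # requests widens the gap, and overlapped I/O could even invert it.
        return total_latency, wall_time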

If we're really measuring "overall latency", then I'm not sure what value the test has. This test seems to be designed to beat up the drive and cause lots of garbage and wear-leveling activity. The drive should catch up, but the questions are how quickly it catches up and how completely it restores its performance. An overall number doesn't express that with any kind of fidelity. Maybe I'm not like most users, but I'd rather use a drive that went to 50% performance for an hour before recovering 95% performance than a drive that went to 10% performance for even a few minutes before recovering 99% performance.

I'm sorry to be pedantic -- though I would have thought disk drive tests to be pedantic by definition -- but I just can't guess what "overall latency" means.



That document doesn't mention "latency" in the context of a disk performance test, either. It's mentioned a few times describing math around scores for web site rendering and graphics perf, though. Further, the description of the "Stability Test" in that document is different from the one given by Tom's in the link above.

 
Most likely you are reading the wrong section of the document. I have to finish a review but when I'm finished I'll walk you through it.




Precondition phase
1. Write the drive sequentially through up to the reported capacity with random data, write size of 256*512=131072 bytes.
2. Write it through a second time (to take care of overprovisioning).

Degradation phase
1. Run writes of random size between 8*512 and 2048*512 bytes on random offsets for 10 minutes.
2. Run performance test (one pass only). The result is stored in secondary results with name prefix degrade_result_X where X is a counter.
3. Repeat 1 and 2 for 8 times and on each pass increase the duration of random writes by 5 minutes

Steady state phase
1. Run writes of random size between 8*512 and 2048*512 bytes on random offsets for final duration achieved in degradation phase.
2. Run performance test (one pass only). The result is stored in secondary results with name prefix steady_result_X where X is a counter.
3. Repeat 1 and 2 for 5 times.

Recovery phase
1. Idle for 5 minutes.
2. Run performance test (one pass only). The result is stored in secondary result with name recovery_result_X where X is a counter.
3. Repeat 1 and 2 for 5 times.

Clean up
1. Write the drive sequentially through up to the reported capacity with zero data, write size of 256*512=131072 bytes.
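In other words, the schedule boils down to something like this sketch (Precondition and Clean up omitted; the write and performance-test routines are stand-ins for whatever the tool does internally, not its real API):

    # Sketch of the quoted Degradation / Steady state / Recovery schedule.
    import time

    def random_writes(minutes):
        """Stub: writes of random size (8*512..2048*512 bytes) at random offsets."""

    def performance_test():
        """Stub: one pass of the performance workload."""
        return {}

    def advanced_storage_test():
        results = {}
        # Degradation: write for 10, 15, ... 45 minutes, testing after each block.
        for i, minutes in enumerate(range(10, 50, 5), start=1):
            random_writes(minutes)
            results[f"degrade_result_{i}"] = performance_test()
        # Steady state: keep the final 45-minute duration for five more rounds.
        for i in range(1, 6):
            random_writes(45)
            results[f"steady_result_{i}"] = performance_test()
        # Recovery: idle for five minutes, then test, five times.
        for i in range(1, 6):
            time.sleep(5 * 60)
            results[f"recovery_result_{i}"] = performance_test()
        return results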
 
Thanks! I bet that would help everyone who reads the article.



That's the text I'm referring to in my post of 1344 on November 2. It's available from the "To learn how we test advanced workload performance, please click here" link in the original review.

The original review doesn't describe "latency" other than to say that "they're important when you're looking for snappy responsiveness". The graph isn't described at all in the original review; the Y axis is labelled "Total Latency Seconds - Lower is Better", and the X axis is a series of categories that seem to relate to the Degrade, Steady State, and Recovery phases in the description. However, the graph is a line graph, implying there's a continuous quantity between each phase of the test. If the timings were discrete for each phase, then a bar chart would be used, right? The area under a line in a graph can be integrated, but a bar indicates a discrete value that can't be. Since the X axis appears to be discrete test results, why is a line used instead of a bar?

Anyway, the "how we test" page doesn't mention anything about latency, even outside the text you quote. It just says "We use the overall throughput from each combined test run and the overall latency", which doesn't tell the reader anything about how latency is being measured or what it means.

In the "Latency Tests" section of the Samsung 950 pro review, the 950 Pro (512 GB) drive is scoring just under 200 seconds of latency for each phase of that latency test. (Or, scoring that time continuously under the graphed line? What does that mean?) Let's call the measurement 190 seconds -- what is it in the "Degrade 1" test that takes 190 seconds? Quoting your quote:

Degradation phase
1. Run writes of random size between 8*512 and 2048*512 bytes on random offsets for 10 minutes.
2. Run performance test (one pass only). The result is stored in secondary results with name prefix degrade_result_X where X is a counter.
3. Repeat 1 and 2 for 8 times and on each pass increase the duration of random writes by 5 minutes

Step 1 takes 600 seconds, so that can't be it because it's much longer than 190 seconds.

Step 2 might take 190 seconds, but we don't know, since there's no description of "performance test". (Why not "run the performance test", by the way? A particle is missing from that sentence, I think, but that editing mistake is consistent through the text you've quoted.) Grammar questions aside, what is "performance test"? Is the timing given for all of "performance test", or just the "total latency" that "performance test" encountered, even though the overall run of that iteration of "performance test" took longer than 190 seconds of wall time?

The review has this to say:

What you are seeing is garbage collection and wear-leveling utilizing the controller's resources. Some companies delay wear-leveling operations until the drive is idle. Others give priority to housekeeping activities. There are pros and cons to both approaches. Since the 950 Pro will invariably end up in enthusiast and workstation PCs, this is the method Samsung deemed best.

But that doesn't help the reader understand what's happening as "This" in "this is the method Samsung deemed best" has no clear antecedent. Did Samsung deem the delay of wear-leveling operations best, or did they deem giving priority to housekeeping activities to be best? Are "housekeeping activities" the same as "wear-leveling operations", anyway?

I'm sorry if these questions are awkward and repetitive, but I'm just trying to figure out what the article is trying to tell me. There's really no clear description of the tests, the graphs are awkwardly presented, and the results are superficially and confusingly described, so I'm just baffled by the article and can't figure out how the results are meant to be interpreted.
 
So today I got my Samsung 950 PRO 512GB.
Samsung software is MIA.
Used Active@Boot (latest version) to clone my 3-day-old installation of Windows 10 on the Samsung 850 PRO.
Booting from the 950 I get nothing but BSODs with random errors. Windows 10 repair won't fix anything.
Ugh, will try different cloning software tomorrow.

All this on a new ASUS Z170-AR.
 
Did you remember to go into the advanced BIOS settings and set the port correctly? I believe that on the ASUS Z170 boards the M.2 port both defaults to SATA *and* shares a path with some of the SATA ports, which would need to be disabled.




 


Windows doesn't seem to like M.2 drives initially. I kept getting error messages when I tried the free W10 update, and it failed after running for almost a day. I tried the download, a thumb drive, and a disc, as well as Windows repair, before finally, after the fourth or fifth try, getting exasperated and buying a new $100 copy of W10, reformatting my M.2 and loading it directly with all the other drives disconnected. Back when I first attempted to download the free update you couldn't even get through to MS for help, and the MS solution online was to carry your PC to an MS store. The only advantage was that with a clean install, I updated all my drivers to the latest W10 versions.

I had similar problems installing W7 and also had to buy a new copy of W8.1 when I first bought and installed the M.2 drive. The MS monopoly sucks, but hopefully this was the last time I have to deal with it. It was shocking that the free W10 upgrade on my laptop with standard SSDs went really easily and quickly.
 