CRamseyer :
You don't need the source code to measure the queue depth. You just need to monitor the drive's IO through a software tool. There are several.
If you want to measure the latency of I/O requests, sure; you can use any number of tools. But we don't know what the "latency" graph in the article is showing us, so we wouldn't know what to measure to try to replicate the results.
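To be concrete about what I mean: even a trivial script can measure request-to-completion latency; the hard part is knowing which definition the article used. Here's a minimal sketch, assuming Linux and a placeholder device path -- my own illustration, not whatever tool Tom's uses:

```python
import mmap
import os
import time

# A minimal sketch of measuring per-request read latency on Linux.
# "/dev/sdX" is a placeholder; O_DIRECT bypasses the page cache so we time
# the drive rather than RAM, and it requires an aligned buffer, which an
# anonymous mmap provides.  "Latency" here means the time from issuing the
# read to the read returning -- only one of several possible definitions.
DEV = "/dev/sdX"
BLOCK = 4096
COUNT = 1000

fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
buf = mmap.mmap(-1, BLOCK)          # page-aligned anonymous buffer
latencies = []
for i in range(COUNT):
    os.lseek(fd, i * BLOCK, os.SEEK_SET)
    start = time.perf_counter()
    os.readv(fd, [buf])             # blocks until the data is in the buffer
    latencies.append(time.perf_counter() - start)
os.close(fd)

latencies.sort()
print("median: %.3f ms" % (1000 * latencies[COUNT // 2]))
print("p99:    %.3f ms" % (1000 * latencies[int(COUNT * 0.99)]))
```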
CRamseyer :
The Latency test is a calculation of all tests combined.
What does that mean, specifically? The total runtime of all the tests? That can't be true, since the description of the tests in the article implies they run for far longer than the longest time shown for the slowest drive in the graph.
Or, at least, it isn't clear to me how it could be so. The article links to an advanced workloads description that covers the "Precondition", "Degradation", "Steady State", and "Recovery" phases of the tests.
The detail is painfully inexact to anyone who thinks carefully about testing, or about the scientific method. (Really, if there's a strong case for open-source software anywhere, it's in benchmarking. If the source for this test were available, I could just go look. If I wanted to compare the reviewed system to my system at home, I could rebuild the software and do so. And if I wanted to reproduce the results and verify them -- because, say, I was worried about advertiser-dollar bias -- I could readily do so.) But here are my questions about the phases of the test as described:
Precondition: That's it, really? No format, reset, or reconditioning tool? No TRIM? Just throw the drive in there, with whatever scars it already has, and test it?
Degradation: This test writes for 10 minutes, then runs a performance test. It then writes for 15 minutes, does a perf test, and so on until the write time reaches 45 minutes. The total run time is 10+15+20+25+30+35+40+45 == 220 minutes of writing, plus however long eight runs of the performance test take. According to the graph in the article, the runs take something on the order of 200 seconds for the slowest drive, so I can't figure out how to make the numbers match. Am I reading the graph incorrectly? The test description incorrectly? (I sketch my reading of this phase in code below, after the Recovery question.)
Steady State: There's some bad editing here: "Run writes of random size ... on random offsets for final duration achieved in degradation phase". Does that mean 45 minutes? If so, why not just say 45 minutes? And what does "random offsets" mean, specifically? Any random spot on the drive, or somehow aligned? That is, do these writes cause read-modify-write operations at the sector, cluster, or block level?
Recovery: A question that applies to all of the phases is worth raising here: what is the "performance test"? The previous section identifies ten different performance-test workloads; are they all exercised? Just one? Something different? I'm left to guess at what "performance test" means.
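To make the ambiguity concrete, here is the Degradation phase as I read the description. This is entirely my own reconstruction, not Tom's code (which is the whole point: I can't see it); the scratch file, the write sizes, and the empty perf_test() stub are placeholders for things the description leaves unspecified.

```python
import os
import random
import time

# My reconstruction of the "Degradation" phase as described -- emphatically
# not Tom's actual code, which I cannot see.  A 1 GiB scratch file stands in
# for the drive, and perf_test() is a stub for the unspecified "performance
# test".
TARGET = "scratch.bin"
TARGET_SIZE = 1 << 30                       # 1 GiB stand-in for the drive

def write_random(f):
    """'Writes of random size ... on random offsets' as I read it."""
    size = random.randint(4 * 1024, 1024 * 1024)
    offset = random.randrange(0, TARGET_SIZE - size)   # aligned? unspecified
    f.seek(offset)
    f.write(os.urandom(size))

def perf_test(f):
    """The article lists ten workloads but never says which ones run here."""
    return 0.0

with open(TARGET, "wb") as f:               # create the stand-in "drive"
    f.truncate(TARGET_SIZE)

results = []
with open(TARGET, "r+b") as f:
    for write_minutes in range(10, 50, 5):  # 10, 15, ..., 45: 220 min of writes
        deadline = time.monotonic() + write_minutes * 60
        while time.monotonic() < deadline:
            write_random(f)
        results.append(perf_test(f))        # eight perf runs, contents unknown
```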
Note that none of these phase descriptions mentions latency. The only place in that whole description where "latency" appears is in this sentence: "We use the overall throughput from each combined test run and the overall latency." Which is meaningless: what is "overall latency"? Latency is, vaguely, the time between a request and its result. Maybe it's the time from issuing the request to the request being processed; maybe it's to the first byte coming back. Maybe it's to the acknowledgement of the request. Maybe it's the time between the request being sent and the request being received (network latency, for example).
If we're really measuring "overall latency", then I'm not sure what value the test has. This test seems designed to beat up the drive and cause lots of garbage collection and wear-leveling activity. The drive should catch up, but the questions are how quickly it catches up and how completely it restores its performance. An overall number doesn't express that with any kind of fidelity. Maybe I'm not like most users, but I'd rather use a drive that dropped to 50% performance for an hour before recovering to 95%, than one that dropped to 10% performance for even a few minutes before recovering to 99%.
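Here's a toy illustration of why a single overall figure worries me (the numbers are entirely made up, not from any review): the "overall" number can actively favor the drive I'd avoid.

```python
# Made-up recovery profiles: fraction of fresh-drive performance, one sample
# per minute over a two-hour window after the torture phase.
drive_a = [0.50] * 60 + [0.95] * 60    # 50% for an hour, then recovers to 95%
drive_b = [0.10] * 5 + [0.99] * 115    # 10% for five minutes, then 99%

for name, trace in (("A", drive_a), ("B", drive_b)):
    overall = sum(trace) / len(trace)
    print(f"drive {name}: overall {overall:.2f}, worst minute {min(trace):.2f}")

# Drive A averages roughly 0.72 overall; drive B averages roughly 0.95.  The
# single "overall" number prefers drive B, even though drive A is the one I
# said I'd rather live with.
```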
I'm sorry to be pedantic -- though I would have thought disk drive tests were pedantic by definition -- but I just can't guess what "overall latency" means.
That document doesn't mention "latency" in the context of a disk performance test, either. It is mentioned a few times in the math around scores for web-site rendering and graphics performance, though. Further, the description of the "Stability Test" in that document is different from the one given by Tom's in the link above.