I'm no statistician, so judge the following for yourself.
Failure rates for most things follow a probability distribution that changes with time. For example: one in a million drives fails within the first minute, one in 100,000 within the first hour, one in 10,000 within the first month, and so on. Every individual drive is different, EVEN when used under the exact same conditions. The estimated "mean time between failures" (MTBF) is the average for the whole "population" of that model of drive out there. Any one drive could fail at any time, very early or very late, but on average the population fails at the MTBF. I don't know how manufacturers calculate it, but to test it empirically you'd have to run a whole lot of drives to failure to get a statistically valid sample.
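To make that "population average" idea concrete, here's a quick sketch. It assumes a simple exponential lifetime model (constant failure rate, which real drives don't exactly follow) and a made-up MTBF of 1,000 hours just for illustration:

```python
import random

random.seed(42)
MTBF = 1000.0  # hypothetical mean lifetime in hours, NOT a real drive spec

# Simulate a big "population" of identical drives. Each individual
# lifetime is wildly different, but the sample mean lands near the MTBF.
lifetimes = [random.expovariate(1 / MTBF) for _ in range(100_000)]
mean_life = sum(lifetimes) / len(lifetimes)

print(f"shortest: {min(lifetimes):.1f} h")
print(f"longest:  {max(lifetimes):.1f} h")
print(f"average:  {mean_life:.1f} h")  # close to 1000
```

Notice the spread: some simulated drives die almost immediately and some last many multiples of the MTBF, which is exactly why the MTBF tells you almost nothing about YOUR drive.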
By having two drives instead of one (assuming the same conditions otherwise and all that stuff), you effectively double your chance of a drive failure at any point in time, like buying 2 lottery tickets instead of one. Each drive in a 2-drive RAID 0 array (identical drives) still has the SAME individual MTBF, since both come from the same population of "identical" drives. But because the drives are effectively "in series", the array dies when EITHER drive dies, so the expected time to the array's first failure is actually shorter than a single drive's MTBF (half of it, if the failures are independent). It's like x-mas tree lights strung in series vs. parallel: one bad bulb takes out the whole string.
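The "doubled chance" claim can be checked with a little arithmetic. This sketch again assumes independent, exponentially distributed failures, which is an idealization:

```python
import math

def p_fail(t, mtbf):
    # Exponential model: probability a single drive fails by time t.
    return 1 - math.exp(-t / mtbf)

def p_array_fail(t, mtbf, n=2):
    # RAID 0 with n independent drives: the array fails if ANY drive fails,
    # i.e. it survives only if ALL drives survive.
    return 1 - (1 - p_fail(t, mtbf)) ** n

MTBF = 1_000_000  # hours, a hypothetical figure for illustration
t = 10_000        # an operating window much shorter than the MTBF

single = p_fail(t, MTBF)
striped = p_array_fail(t, MTBF)
print(f"single drive: {single:.5f}")
print(f"2-drive RAID 0: {striped:.5f}")
print(f"ratio: {striped / single:.3f}")  # close to 2 when t << MTBF
```

For windows much shorter than the MTBF the ratio is essentially 2, matching the lottery-ticket intuition; over very long windows both probabilities approach 1 and the ratio shrinks.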
I don't know if ANY data can be recovered from a busted RAID 0 array; since the data is striped across both drives, losing one leaves you with only fragments. I think it's possible to salvage a good part of the data from a busted single drive. This may be more important than MTBF. How critical is your data?
Any statisticians out there? Set us straight!
the more I learn, the less I'm sure I know... 😱