Annual Hard Drive Reliability Report: 8TB, HGST Disks Top Chart Racking Up 45 Years Without Failure (arstechnica.com)
Online backup solution provider Backblaze has released its much-renowned annual hard drive reliability and failure report. From a report on Ars Technica: The company uses self-built pods of 45 or 60 disks for its storage. Each pod is initially assembled with identical disks, but different pods use different sizes and models of disk, depending on age and availability. The standout finding: three 45-disk pods using 4TB Toshiba disks, and one 45-disk pod using 8TB HGST disks, went a full year without a single spindle failing. These are, respectively, more than 145 and 45 years of aggregate usage without a fault. The Toshiba result makes for a nice comparison against the drive's spec sheet. Toshiba rates that model as having a 1-million-hour mean time to failure (MTTF). Mean time to failure (or mean time between failures, MTBF -- the two measures are functionally identical for disks, with vendors using both) is an aggregate property: given a large number of disks, Toshiba says that you can expect to see one disk failure for every million hours of aggregated usage. Over 2016, those disks accumulated 1.2 million hours of usage without failing, healthily surpassing their specification. [...] For 2016 as a whole, Backblaze saw its lowest ever failure rate of 1.95 percent. Though a few models remain concerning -- 13.6 percent of one older model of Seagate 4TB disk failed in 2016 -- most are performing well. Seagate's 6TB and 8TB models, in contrast, outperform the average. Improvements to the storage pod design that reduce vibration are also likely to be at play.
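The aggregate-usage arithmetic in the summary can be roughly reproduced as follows. This is only a sketch: it assumes each pod held exactly 45 disks and that every disk ran for the full year, and the function names are made up for illustration. The 1-million-hour MTTF is the figure quoted above.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def aggregate_hours(drive_count, years=1.0):
    """Total powered-on hours summed across all drives."""
    return drive_count * years * HOURS_PER_YEAR


def expected_failures(drive_hours, mttf_hours):
    """Failures the spec sheet would predict for that many drive-hours."""
    return drive_hours / mttf_hours


# Assumption: three full 45-disk Toshiba pods, one full 45-disk HGST pod.
toshiba_hours = aggregate_hours(3 * 45)
hgst_hours = aggregate_hours(45)

print(f"Toshiba: {toshiba_hours:,.0f} drive-hours")
print(f"HGST:    {hgst_hours:,.0f} drive-hours")
print(f"Failures predicted by a 1M-hour MTTF: "
      f"{expected_failures(toshiba_hours, 1_000_000):.2f}")
```

Under these assumptions the Toshiba pods come to roughly 1.2 million drive-hours, so the spec sheet would predict about one failure where none was observed.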
HGST nearly always on top (Score:2, Insightful)
Every time Backblaze publishes a report the HGST drives always come out on top.
It's a little more expensive to fill your NAS with them but in my experience it's been worth it.
Maybe, but apparently not at Backblaze's scale. The higher failure rate of the Seagates is offset by the lower price.
That effect is so inconsequential as to be practically zero.
In other news - in 2062 they will have time travel (Score:1)
In other news, in 2062 they will have time travel, otherwise how could you possibly know that just-released 8TB drive would last 45 years?
Is aggregate usage even a meaningful metric?
It tells you the MTBF for right now, but it's not useful to predict MTTF unless you know the shape of the bathtub curve. It takes a few years to build that curve.
Once you see the error rate start to rise, it can be effective to fit to the expected curve shape, but not always. Crystal balls are unreliable.
The bathtub curve is real, and if you follow Backblaze's tips, they show that years 2-4 are usually exceptional in terms of reliability.
My recommendation is to buy the NAS/SAN/POD/whatever, spin it up for 3 months, put it into production, and then wait 42 months. After that, start planning, and when the next drive fails in the 42-48 month range, start the purchasing process (depending on the lead time needed), get it installed, wait 3 months to get early failures out of the way, then transfer the data.
Also, how do you account for bad batches/production runs? Or do they always show up during the initial 3-month period?
The bathtub curve is real
This Backblaze report, previous Backblaze reports, and the Google longitudinal disk reliability study [googleusercontent.com] have all found that the "bathtub curve" is a myth. HDDs do not have high early failure rates, nor does the failure rate suddenly rise after a set period of time.
Another myth that these studies have debunked is that HDDs do better if kept cool. Actually, failure rates are lower for disks kept at the higher end of the rated temperatures. This is one reason that Google runs "hot" datacenters today, with amb
Yes, aggregate usage is a meaningful metric, if you know what it defines. MTBF can be tricky; in many cases it is converted to annualized failure rate (AFR) [wikipedia.org] to obtain a meaningful metric.
However, it makes no sense to employ metrics based on an exponential distribution model (which does not have memory) to compare different sets of disks. In particular, the summary says 13.6 percent of one older model of Seagate 4TB disk failed in 2016... If such drives are older (and thus present a longer uptime) their age
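For reference, the MTBF-to-AFR conversion mentioned above can be sketched like this. It relies on exactly the memoryless exponential model the comment warns about, so it says nothing about how failure rates change with drive age. The function name is hypothetical, and the 1-million-hour input is the Toshiba rating quoted in the summary.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365


def mtbf_to_afr(mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Annualized failure rate under a constant-hazard (exponential) model.

    AFR = 1 - exp(-hours_per_year / MTBF): the probability that a drive
    running all year fails at least once.
    """
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


afr = mtbf_to_afr(1_000_000)
print(f"AFR for a 1M-hour MTBF: {afr:.2%}")
```

For large MTBF values this is nearly the same as the simpler ratio 8760 / MTBF, which is why the two are often used interchangeably.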
In other news, in 2062 they will have time travel, otherwise how could you possibly know that just-released 8TB drive would last 45 years?
Is aggregate usage even a meaningful metric?
I find it hard to believe. It isn't measuring 45 years worth of things like metal fatigue, material decay or degeneration, wear and tear etc.
What it's really saying is that early failures are at a very low rate; they've measured lots of disks for a few years and can show that these disks don't typically fail in the first few years of use. That's totally different from saying that one of these disks can last 45 years of continuous use. To represent it as that seems like something doomed to litigation.
Real article (Score:4, Informative)
Ars Technica is just borderline copy&pasting from the source. See the actual article at: https://www.backblaze.com/blog... [backblaze.com]
Shame on Ars Technica for not even bothering to link their source material.
Yes. Why the resistance to publishing the URL to the original article? Does Ars pay kickbacks?
The figure is meaningless. (Score:2)
45 years spread over a bunch of drives without a failure doesn't mean that we can expect any individual drive to last 45 years.
The number of times I've bought new drives because I needed increased capacity far exceeds the number of times I've had to replace a failed drive.
I had to replace a couple 2TB Seagate drives before I needed more capacity. Of course I replaced them with bigger drives but could have gone years with the existing capacity had the drives continued operation without difficulty. This model also had a very high failure rate in older Backblaze reports.
These numbers do, however, suggest that you can expect a very low failure rate of those drives within the first year (less than 2.2%). And realistically, you'll probably get far more than that under similar conditions.
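As a rough check on how much a zero-failure year can tell you, here is a small sketch. It assumes the "less than 2.2%" figure corresponds to 1 in 45 drives (the comment does not say so), and it treats each drive as an independent trial with the same one-year failure probability. The function name is made up for illustration.

```python
def zero_failure_upper_bound(n_drives, confidence=0.95):
    """One-sided upper bound on the per-drive failure probability
    when n_drives were observed for a year with zero failures.

    Solves (1 - p) ** n_drives = 1 - confidence for p.
    """
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_drives)


n = 45  # one 45-disk pod, as in the summary
naive = 1 / n                        # "fewer than one failure in 45"
bound = zero_failure_upper_bound(n)  # 95% one-sided bound
rule_of_three = 3 / n                # common quick approximation

print(f"naive 1/n:        {naive:.1%}")
print(f"95% upper bound:  {bound:.1%}")
print(f"rule of three:    {rule_of_three:.1%}")
```

With only 45 drives, the 95% bound is several times larger than 1/45, so a single clean year from one pod is weaker evidence than the headline suggests. Larger populations tighten the bound quickly.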
45 years (Score:2)
Aggregate years are not years.
"Nine women can't make a baby in one month."
"Nine women can't make a baby in one month."
Well... some people will tell you that babies are created at the time of conception...
Seagate still sucks. (Score:2)
In an always-on server environment these are great numbers, but for power-conserving desktop/home NAS situations I'd love to see CSS numbers. Again though, Seagate still sucks.