Hard Drive Lottery

show_me_the_evidence

You buy a hard drive and expect it to work for the duration of the product’s warranty period, or to get a replacement if it fails within that window. Simple enough, right? Well, if the warranty also replaced the data you stored on the drive, it would be a simple parts swap. But it isn’t that easy, because the data you store on a hard drive is held captive on those spinning magnetic platters. A comprehensive backup plan can remedy part of this, but most people don’t have one in place. In many situations, when the hard drive dies, so does your data.

That’s why selecting a reliable hard drive (or SSD) is especially important, more so than almost any other component in a computer (with maybe the very rare edge case of a faulty power supply zapping everything in your computer). To use an unconventional metaphor, your data is the soul of your computer and the components are its body. And when the part of the computer that stores all that data (the hard drive or SSD) dies, so does everything that makes your computer your computer.

Anthropomorphizing aside, choosing a reliable place to store your data is an underemphasized part of choosing a computer. If your DVD drive dies, you get a new one under warranty or buy a replacement. Same thing with the power supply, motherboard, CPU, etc. But if your hard drive dies and takes with it the past 10 years’ worth of data that was never backed up anywhere else? Or your music and video collection that has been carefully curated over many years? Well, that’s a much bigger problem.

We have to ask the question (posed by the altered Jerry Maguire quote at the top of this post): where is the empirical evidence to demonstrate which hard drive is the most reliable? Asking the hard drive manufacturer which drive is the most reliable one to buy runs straight into financial incentives and conflicts of interest that don’t work in your favor.

MTBF (mean time between failures) is a commonly used statistic for approximating reliability. But that metric is calculated from testing in a controlled environment and doesn’t take into account quality control issues or unforeseen usage variables that can have a big impact on the lifespan of any one drive in particular. When the safety of your data is at stake, you’d like more than an extrapolated statistical estimate from the manufacturer trying to sell you the product.
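To make that "extrapolated estimate" a bit more concrete, here is a minimal sketch of how an MTBF figure is often translated into a yearly failure probability. It assumes a constant failure rate (the exponential lifetime model), which is a simplification and not necessarily how any given manufacturer derives its numbers. The function name and the 1,000,000-hour figure are made up for illustration, not taken from a real spec sheet.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours of continuous operation


def mtbf_to_afr(mtbf_hours: float) -> float:
    """Chance that a drive fails within one year of continuous use.

    Assumes a constant failure rate (exponential lifetime model),
    which ignores early-life defects and wear-out.
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Hypothetical spec-sheet value, used only as an example
example_mtbf_hours = 1_000_000
print(f"{mtbf_to_afr(example_mtbf_hours):.2%}")  # prints 0.87%
```

Under that assumption, a big-sounding MTBF works out to a failure rate of under 1% per year, and it is exactly this kind of tidy number that real-world data can confirm or contradict.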

Regardless of the manufacturer’s claims (or even completely sound and honest reliability testing), it’s hard to beat real-world results. That’s why empirical evidence is so powerful (*cough* science-bias *cough*). And that’s why published data on large-scale deployments of hard drives and SSDs is so valuable (as opposed to anecdotal online product reviews like “I have never had any issue with the 8 Western Digital hard drives I have owned”). Google released a study several years ago on how environmental factors affect the failure of the hard drives used in their data centers (short overview here). But it didn’t provide the most valuable piece of info for consumers looking to find the most reliable hard drive: the make and model info.

Enter: Backblaze. They are a cloud storage and backup provider with a healthy appetite for buying large quantities of high-capacity hard drives. And although I wouldn’t recommend using a spinning hard drive for your operating system (see just about anything I have written on the subject of SSDs), nothing beats one in terms of cost-per-GB for storage. So, it was of great interest to technology enthusiasts when Backblaze started publishing the failure rates of the different hard drive models running in their data centers. It’s a data set large enough to be statistically meaningful, and one that provides some of the most reliable hard drive buying advice ever.
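For a sense of how failure rates like these can be computed from a fleet of drives, here is a rough sketch of the usual "annualized failure rate" arithmetic: failures divided by drive-years of service. I’m assuming this general approach rather than quoting Backblaze’s exact method, and the model names and counts below are invented placeholders, not figures from their reports.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Failures per drive-year of service, as a fraction (0.02 == 2%)."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years


# Hypothetical fleet: model name -> (failures, total days in service)
fleet = {
    "Model A": (12, 400_000),
    "Model B": (45, 380_000),
}

# List models from most to least reliable
ranked = sorted(fleet.items(), key=lambda item: annualized_failure_rate(*item[1]))
for model, (failures, drive_days) in ranked:
    afr = annualized_failure_rate(failures, drive_days)
    print(f"{model}: {afr:.2%} annualized failure rate")
```

Counting drive-days instead of drives keeps the comparison fair when some models have been in service much longer than others.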

You can read their previous report here. I read it and made a much different choice than I otherwise would have for a recent hard drive purchase. Typically, I choose a hard drive based on a well-known brand, largely for the benefit of warranty support. So do a lot of other intelligent consumers. But there is a good amount of unconscious brand equity that influences my buying decisions, and that isn’t always supported by the actual quality of the hard drives (in terms of reliability and low failure rates). Online customer reviews are usually the best available gauge of reliability, but they are biased towards failures (since people whose drives died have something to complain about).

In this case, I bought a nondescript HGST model (refurbished, no less) for video editing storage. I would never have predicted making that choice. Spoiler alert: the brand equity, marketing, packaging, and sales channel support of the more recognized brands of Western Digital (HGST is a sub-brand of WD) and Seagate don’t really matter when it comes to hard drive reliability and low failure rates. Surprise! As individual consumers, we can’t buy thousands of drives and test them for years before deciding on a single hard drive for home use. That’s why the reports that Backblaze has published are a great resource (and a nice plug for their backup service).

Returning to the importance of keeping your data safe, the first step is to have a reliable starting point. Backups (such as Backblaze, CrashPlan, etc.) are another important piece of the puzzle. (And if you really want to geek out on data integrity, there’s always ECC RAM, file system checksums, ReFS, and ZFS to worry about.) But for the average consumer, having a reliable starting place for storing your data and a good backup setup will go a long way towards protecting the soul of your digital life. So, the next time you’re looking to upgrade or replace a hard drive or SSD in your computer, make sure it’s reliable.

(And if you don’t have the time or patience to research it on your own, ask me for advice!)
