
Failure Rate of HDDs and SSDs Based on 207,478 Running Units in 1Q22

6TB Seagate ST6000DX000 HDDs defy time with zero failures despite average age of nearly 7 years

This report, published on May 4, 2022, was written by Andy Klein, principal storage cloud evangelist at Backblaze, Inc.

 

 

Backblaze Drive Stats for 1Q22

A long time ago, in a galaxy far, far away, Backblaze began collecting and storing statistics about the HDDs it uses to store customer data. As of the end of 1Q22, it was monitoring 211,732 HDDs and SSDs in its data centers. Of that number, there were 3,860 boot drives, leaving the company with 207,872 data drives under management.

This report will focus on those data drives. The firm will review the HDD failure rates for those drive models that were active as of the end of 1Q22, and it will also look at their lifetime failure statistics.

In between, Backblaze will dive into the failure rates of the active drive models over time.

Along the way, it will share its observations and insights on the data presented and, as always, it looks forward to readers doing the same in the comments section at the end of the report.

“The greatest teacher, failure is.” (1)
As of the end of 1Q22, the cloud storage company was monitoring 207,872 HDDs used to store data. For this evaluation, it removed 394 drives from consideration as they were either used for testing purposes or were drive models which did not have at least 60 active drives. This leaves us with 207,478 HDDs to analyze for this report. The chart below contains the results of the analysis for 1Q22.

Backblaze Drive Stats 1q22 F4

“Always pass on what you have learned.” (2)
In reviewing the 1Q22 table above and the data that lies underneath, here are a few observations and caveats:
“The Force is strong with this one.” (3) The 6TB Seagate (model: ST6000DX000) continues to defy time with zero failures during 1Q22 despite an average age of nearly 7 years (83.7 months). 98% of the drives (859) were installed within the same 2-week period back in 1Q15. The youngest 6TB drive in the entire cohort is a little over 4 years old. The 4TB Toshiba (model: MD04ABA400V) also had zero failures during 1Q22 and the average age (82.3 months) is nearly as old as the Seagate drives, but the Toshiba cohort has only 97 drives. Still, they’ve averaged just 1 drive failure per year over their Backblaze lifetime.
“Great, kid, don’t get cocky.” (4) There were a number of padawan drives (in average age) that also had zero drive failures in 1Q22. The two 16TB WDC drives (models: WUH721816ALEL0 and WUH721816ALEL4) lead the youth movement with average ages of 5.9 and 1.5 months, respectively. Between the 2 models, there are 3,899 operational drives and only 1 failure since they were installed 6 months ago. A good start, but surely not Jedi territory yet.
“I find your lack of faith disturbing.” (5) You might have noticed the AFR for 1Q22 of 24.31% for the 8TB HGST drives (model: HUH728080ALE604). The drives are young, with an average age of 2 months, and there are only 76 drives with a total of 4,504 drive days. If you find the AFR bothersome, we do in fact find your lack of faith disturbing, given the history of stellar performance in the other HGST drives we employ. Let’s see where we are in a couple of quarters.
“Try not. Do or do not. There is no try.” (6) The saga continues for the 14TB Seagate drives (model: ST14000NM0138). When we last saw this drive, the Seagate/Dell/Backblaze alliance continued to work diligently to understand why the failure rate was stubbornly high. Unusual it is for this model, and the team has employed multiple firmware tweaks over the past several months with varying degrees of success. Patience.
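
To see why 76 young drives can produce an AFR of 24.31%, it helps to work the arithmetic. The report does not spell out the formula, so this minimal sketch assumes the definition Backblaze has used in its Drive Stats reports: failures divided by drive years (drive days / 365), expressed as a percentage. The 4,504 drive days come from the text above; the failure count of 3 is an assumption, chosen because it reproduces the stated figure. The function name is illustrative only.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the AFR in percent: failures per drive year, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# 8TB HGST (HUH728080ALE604), 1Q22. Drive days are from the report;
# failures=3 is an assumption that matches the reported 24.31%.
hgst_8tb_afr = annualized_failure_rate(failures=3, drive_days=4504)
print(f"{hgst_8tb_afr:.2f}%")  # prints 24.31%

# With so few drive days, a single failure more or less shifts the result a lot.
one_fewer = annualized_failure_rate(failures=2, drive_days=4504)
print(f"{one_fewer:.2f}%")
```

Under these assumptions, each individual failure is worth roughly 8 percentage points of AFR for this cohort, which is why a couple more quarters of data are needed before drawing conclusions.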

“I like firsts. Good or bad, they’re always memorable.” (7)
We have been delivering quarterly and annual Drive Stats reports since 1Q15. Along the way, we have presented multiple views of the data to help provide insights into our operational environment and the HDDs in that environment. Today we’d like to offer a different way to compare the average age of many of the drive models we currently use against the annualized failure rate of each of those models: the Drive Stats Failure Square:

Backblaze Drive Stats 1q22 F2

“…many of the truths that we cling to depend on our viewpoint.” (8)
Each point on the Drive Stats Failure Square represents an HDD model in operation in our environment as of March 31, 2022, and lies at the intersection of the average age of that model and the AFR (annualized failure rate) of that model. We included only drive models with a lifetime total of one million drive days or with a confidence interval of 0.6 or less.

The resulting chart is divided into 4 equal quadrants, which we will categorize as follows:
Quadrant I: Retirees. Drives in this quadrant have performed well, but given their current high AFR level they are first in line to be replaced.
Quadrant II: Winners. Drives in this quadrant have proven themselves to be reliable over time. Given their age, we need to begin planning for their replacement, but there is no need to panic.
Quadrant III: Challengers. Drives in this quadrant have started off on the right foot and don’t present any current concerns for replacement. We will continue to monitor these drive models to ensure they stay on the path to the winners quadrant instead of sliding off to quadrant IV.
Quadrant IV: Muddlers. Drives in this quadrant should be replaced if possible, but they can continue to operate if their failure rates remain at their current rate. The redundancy and durability built into the Backblaze platform protects data from the higher failure rates of the drives in this quadrant. Still, these drives are a drain on data center and operational resources.
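
The four categories above can be read as a simple two-way split on average age and AFR. The sketch below is one interpretation, not Backblaze's method: it assumes Retirees are old with high AFR, Winners old with low AFR, Challengers young with low AFR, and Muddlers young with high AFR. The report does not give the split values, so `age_split_months` and `afr_split_percent` are hypothetical parameters, as are the sample drive models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveModel:
    name: str
    avg_age_months: float
    afr_percent: float


def failure_square_quadrant(
    model: DriveModel, age_split_months: float, afr_split_percent: float
) -> str:
    """Place a drive model in one of the four Failure Square quadrants."""
    old = model.avg_age_months >= age_split_months
    high_afr = model.afr_percent >= afr_split_percent
    if old and high_afr:
        return "I: Retirees"
    if old:
        return "II: Winners"
    if not high_afr:
        return "III: Challengers"
    return "IV: Muddlers"


# Hypothetical models and split values, for illustration only.
samples = [
    DriveModel("old-and-failing", 75.0, 2.4),
    DriveModel("old-and-reliable", 80.0, 0.5),
    DriveModel("young-and-reliable", 6.0, 0.3),
    DriveModel("young-and-failing", 14.0, 3.1),
]
labels = [failure_square_quadrant(m, 48.0, 1.5) for m in samples]
for model, label in zip(samples, labels):
    print(f"{model.name}: {label}")
```

Because the split values are passed in rather than hard-coded, the same function can be rerun as the fleet ages and the midpoints of the chart move.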

“Difficult to see; always in motion is the future.” (9)
Obviously, the Winners quadrant is the desired outcome for all of the drive models we employ. But every drive basically starts out in either quadrant III or IV and moves from there over time. The chart below shows how the drive models in quadrant II (Winners) got there.

Backblaze Drive Stats 1q22 F3

“Your focus determines your reality.” (10)
Each drive model is represented by a snake-like line (Snakes on a plane!?) which shows the AFR of the drive model as the average age of the fleet increased over time. Interestingly, each of the 6 models currently in quadrant II has a different backstory. For example, who could have predicted that the 6TB Seagate drive (model: ST6000DX000) would have ended up in the Winners quadrant given its less than auspicious start in 2015? And that drive was not alone; the 8TB Seagate drives (models: ST8000NM0055 and ST8000DM002) experienced the same behavior.

This chart can also give us a visual clue as to the direction of the AFR over time for a given drive model. For example, the 10TB Seagate drive seems to be heading toward the Retirees quadrant over the next quarter or so, and as such its replacement priority could be increased.

“In my experience, there’s no such thing as luck.” (11)
In the quarterly Drive Stats table at the start of this report, there is some element of randomness which can affect the results. For example, whether a drive is reported as a failure at 11:59 p.m. on March 31 or at 12:01 a.m. on April 1 can have a small effect on the results. The quarterly results are still useful in surfacing unexpected failure rate patterns, but the most accurate information regarding a given drive model is captured in the lifetime AFRs.

The chart below shows the lifetime AFRs of all the drive models in production as of March 31, 2022.

Backblaze Drive Stats 1q22 F4

“You have failed me for the last time…” (12)
The lifetime AFR for all the drives listed above is 1.39%. That was down from 1.40% at the end of 2021. One year ago (3/31/2021), the lifetime AFR was 1.49%.

When looking at the lifetime failure table above, any drive models with fewer than 500,000 drive days or a confidence interval greater than 1.0% do not have enough data to be considered an accurate portrayal of their performance in our environment. The 8TB HGST drives (model: HUH728080ALE604) and the 16TB Toshiba drives (model: MG08ACA16TA) are good examples of such drives. We list these drives for completeness as they are also listed in the quarterly table at the beginning of this review.

Given the criteria above regarding drive days and confidence intervals, the best performing drive in our environment for each manufacturer is:
• HGST: 12TB, model: HUH721212ALE600. AFR: 0.33%
• Seagate: 12TB, model: ST12000NM001G. AFR: 0.63%
• WDC: 14TB, model: WUH721414ALE6L4. AFR: 0.33%
• Toshiba: 16TB, model: MG08ACA16TEY. AFR: 0.70%
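
The selection behind this list can be sketched as a filter followed by a per-manufacturer minimum. This is an illustration under stated assumptions rather than the report's actual tooling: the thresholds come from the text above, the confidence interval is treated as a width in percentage points, and every row in the sample data (names, drive days, AFRs, intervals) is hypothetical.

```python
from typing import NamedTuple

MIN_DRIVE_DAYS = 500_000      # from the report
MAX_CI_WIDTH_PERCENT = 1.0    # from the report; treated here as an interval width


class LifetimeRow(NamedTuple):
    manufacturer: str
    model: str
    drive_days: int
    afr_percent: float
    ci_width_percent: float


def has_enough_data(row: LifetimeRow) -> bool:
    """True when a model meets both the drive-day and confidence criteria."""
    return (
        row.drive_days >= MIN_DRIVE_DAYS
        and row.ci_width_percent <= MAX_CI_WIDTH_PERCENT
    )


def best_per_manufacturer(rows: list[LifetimeRow]) -> dict[str, LifetimeRow]:
    """Return the lowest-AFR qualifying model for each manufacturer."""
    best: dict[str, LifetimeRow] = {}
    for row in rows:
        if not has_enough_data(row):
            continue
        current = best.get(row.manufacturer)
        if current is None or row.afr_percent < current.afr_percent:
            best[row.manufacturer] = row
    return best


# Hypothetical rows, for illustration only.
rows = [
    LifetimeRow("VendorA", "A-12TB", 9_000_000, 0.40, 0.2),
    LifetimeRow("VendorA", "A-8TB-new", 4_500, 0.00, 9.5),   # too little data
    LifetimeRow("VendorB", "B-14TB", 2_000_000, 1.10, 0.4),
    LifetimeRow("VendorB", "B-16TB", 1_200_000, 0.70, 0.6),
]
winners = best_per_manufacturer(rows)
for manufacturer, row in winners.items():
    print(f"{manufacturer}: {row.model} ({row.afr_percent:.2f}%)")
```

Note that the hypothetical "A-8TB-new" row has the lowest AFR of all but is excluded, which mirrors why young, small cohorts are listed in the table but not ranked.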

“I never ask that question until after I’ve done it!” (13)

Quotes Referenced
1 “The greatest teacher, failure is.” – Yoda, “The Last Jedi”
2 “Always pass on what you have learned.” – Yoda, “Return of the Jedi”
3 “The Force is strong with this one.” – Darth Vader, “A New Hope”
4 “Great, kid, don’t get cocky.” – Han Solo, “A New Hope”
5 “I find your lack of faith disturbing.” – Darth Vader, “A New Hope”
6 “Try not. Do or do not. There is no try.” – Yoda, “The Empire Strikes Back”
7 “I like firsts. Good or bad, they’re always memorable.” – Ahsoka Tano, “The Mandalorian”
8 “…many of the truths that we cling to depend on our viewpoint.” – Obi-Wan Kenobi, “Return of the Jedi”
9 “Difficult to see; always in motion is the future.” – Yoda, “The Empire Strikes Back”
10 “Your focus determines your reality.” – Qui-Gon Jinn, “The Phantom Menace”
11 “In my experience, there’s no such thing as luck.” – Obi-Wan Kenobi, “A New Hope”
12 “You have failed me for the last time…” – Darth Vader, “The Empire Strikes Back”
13 “I never ask that question until after I’ve done it!” – Han Solo, “The Force Awakens”
