
Are Disks the Dominant Contributor for Storage Failures?

A comprehensive study of storage subsystem failure characteristics by researchers at the University of Illinois at Urbana-Champaign and NetApp

To read this study by Weihang Jiang, Chongfeng Hu, Yuanyuan Zhou and Arkady Kanevsky, of the Department of Computer Science at the University of Illinois at Urbana-Champaign and NetApp, presented at FAST '08, click on:

Are Disks the Dominant Contributor for Storage Failures? 

 

Abstract:

Building reliable storage systems becomes increasingly challenging as
the complexity of modern storage systems continues to grow.
Understanding storage failure characteristics is crucially
important for designing and building a reliable storage system. While
several recent studies have been conducted on understanding storage
failures, almost all of them focus on the failure characteristics of
one component – disks – and do not study other storage component failures.

This paper analyzes the failure
characteristics of storage subsystems.
More specifically, we analyzed the storage logs
collected from about 39,000 storage systems commercially deployed
at various customer sites. The data set covers a period of 44 months
and includes about 1,800,000 disks hosted in about 155,000 storage
shelf enclosures. Our study reveals many interesting findings,
providing useful guidelines for designing reliable storage systems.

Some of our major findings include:

  • In addition to disk failures
    that contribute to 20-55% of storage subsystem failures, other components such
    as physical interconnects and protocol stacks also account for
    significant percentages of storage subsystem failures.
  • Each individual storage subsystem failure type and storage subsystem failure as a whole
    exhibit strong self-correlations.
    In addition, these failures exhibit ‘bursty’ patterns.
  • Storage subsystems configured with redundant interconnects experience
    30-40% lower failure rates than those with a single interconnect.
  • Spanning the disks of a RAID group across multiple shelves provides a more
    resilient solution for storage subsystems than keeping them within a single
    shelf.
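
The abstract does not say how self-correlation or burstiness was measured. As a rough illustration only, the sketch below shows two common ways such properties can be checked on a failure log: the lag-1 autocorrelation of per-period failure counts, and the fraction of failures that arrive shortly after the previous one. The function names, the time window and the sample data are all hypothetical and are not taken from the study.

```python
from statistics import mean


def lag_autocorrelation(counts, lag=1):
    """Sample autocorrelation of a series of failure counts at the given lag."""
    n = len(counts)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(counts) - 1")
    mu = mean(counts)
    variance_sum = sum((c - mu) ** 2 for c in counts)
    if variance_sum == 0:
        return 0.0  # a constant series carries no correlation signal
    covariance_sum = sum(
        (counts[i] - mu) * (counts[i + lag] - mu) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def fraction_within_window(timestamps, window):
    """Fraction of failures that occur within `window` of the previous failure."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gap <= window for gap in gaps) / len(gaps)


# Made-up illustrative data, NOT from the study.
weekly_failures = [0, 0, 5, 6, 4, 0, 0, 0, 7, 5, 0, 0]
failure_times_hours = [1, 2, 3, 500, 501, 1200]

if __name__ == "__main__":
    print("lag-1 autocorrelation:", lag_autocorrelation(weekly_failures))
    print("share within 10 hours:", fraction_within_window(failure_times_hours, 10))
```

A positive lag-1 autocorrelation means a busy period tends to be followed by another busy period, and a large share of short gaps between failures is one simple sign of bursty behavior.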

 
