Genomics England Scales Up Genomic Sequencing

With Quantum ActiveScale object storage, WekaIO parallel file system and Mellanox networking

Quantum Corp. announced that Genomics England has expanded its ActiveScale object storage solution as part of an integrated environment designed to store, protect, and provide access to hundreds of petabytes of genomic data.

The solution enabled the organization to scale up from sequencing 100,000 genomes to millions while improving data resilience, controlling costs, and avoiding IT complexity. The implementation supports Genomics England's commitment to sequence the genomes of intensive care patients with Covid-19 and of other people with the virus, and it is an example of Quantum's role in storing and managing unstructured data.

Expanding Genomic Work Beyond Limits of Existing NAS System
Genomics England was established in 2013 by the UK’s Department of Health & Social Care to support the 100,000 Genomes Project, an effort to sequence whole genomes from a vast number of patients with rare diseases and common cancers. In 2018, the project was expanded: The new goal was to sequence up to 5 million genomes over 5 years.

Unfortunately, the existing NAS system used for genomic data was not up to the task. The NAS, which held 21PB of data, had reached its node-scaling limit. Genomics England needed something more scalable than existing NAS solutions, an infrastructure that could grow to hundreds of petabytes. A new solution also had to facilitate simple, flexible access to data by more than 3,000 researchers around the world.

Selecting ActiveScale Object Storage as Part of a Single, Integrated Solution
Genomics England worked with Nephos Technologies Ltd, an independent UK-based data services organization, to design and implement a new storage solution. After evaluating several possibilities, the Nephos team designed a multi-faceted solution that incorporates a parallel file system from WekaIO, Inc., high-speed networking from Mellanox Technologies, Ltd., and ActiveScale object storage.

The solution creates a 2-tier architecture that combines flash storage with an object storage system, which serves as a long-term data lake repository. The 2 storage tiers, each of which can be scaled independently, present as a single hybrid storage environment. As a result, researchers have the flexibility to query data in a randomized fashion.

Taking on New Challenges During Covid-19
Within a few years of deploying the storage environment, Genomics England needed to expand again. The emergence of Covid-19 in early 2020 presented urgent challenges for the medical-scientific community, and Genomics England was in a prime position to help better understand who is susceptible to the virus. The organization committed to sequencing the genomes of up to 20,000 intensive care patients with Covid-19 plus up to 15,000 people with the virus who are experiencing only mild symptoms.

Around the same time that Genomics England was ramping up its participation in Covid-19 research, the ActiveScale platform was acquired by Quantum. Quantum's team facilitated a transition for Genomics England, which then expanded the object-storage environment from 40PB to more than 100PB.

The ActiveScale system's architecture is underpinned by its RAID replacement technology, with dynamic placement of erasure-coded data. That placement of data eliminates the need for system rebalancing, which can compromise performance and availability.

Protecting Vital Genomic Data
ActiveScale object storage protects data and provides the resiliency that Genomics England needs for its critical work. The organization takes advantage of the geo-distributed capability of ActiveScale, an added strength of Quantum's RAID replacement technology that spreads data and parity across multiple nodes in the storage grid. With ActiveScale object storage, the organization distributes data across 3 data centers for full data protection against a major disaster such as the loss of a site. Data can continue to be read and written at the remaining sites, and the system can withstand additional hardware failures, offering 19 nines of data durability.
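
The idea of spreading data and parity across 3 sites can be illustrated with a minimal sketch. It assumes a toy 2+1 XOR parity scheme and hypothetical site names (`site_a`, `site_b`, `site_c`); the article does not state the actual erasure-coding parameters used by ActiveScale, and production systems typically use more general codes with many more fragments.

```python
# Toy illustration only: a 2+1 XOR parity code across three hypothetical sites.
# This is NOT the ActiveScale algorithm; its parameters are not given in the article.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> dict:
    """Split data into two halves and add one parity fragment, one per site."""
    if len(data) % 2:
        data += b"\x00"  # pad so both halves have equal length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {
        "site_a": first,
        "site_b": second,
        "site_c": xor_bytes(first, second),  # parity fragment
    }


def decode(fragments: dict, original_length: int) -> bytes:
    """Rebuild the object from any two of the three fragments."""
    if len(fragments) < 2:
        raise ValueError("need at least two of the three fragments")
    first = fragments.get("site_a")
    second = fragments.get("site_b")
    parity = fragments.get("site_c")
    if first is None:
        first = xor_bytes(second, parity)   # recover lost half from parity
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:original_length]  # drop any padding


if __name__ == "__main__":
    payload = b"GATTACA"
    stored = encode(payload)
    for lost_site in stored:
        surviving = {s: f for s, f in stored.items() if s != lost_site}
        assert decode(surviving, len(payload)) == payload
    print("object recovered after the loss of any single site")
```

In this toy layout the storage overhead is 1.5x the original data, compared with 3x for keeping a full copy at each site, which is the general reason erasure coding is attractive at this scale.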

Gaining Scalability While Controlling Costs and Complexity
With ActiveScale object storage, Genomics England no longer faces the capacity limits of its previous NAS solution. The organization has been able to expand its object storage to support more genomic analysis and even take on additional Covid-19 work without a major overhaul. In the future, it can integrate ActiveScale object storage with Amazon S3-compliant public cloud environments for additional protection and scaling flexibility.

The storage environment is also helping to reduce costs. According to Nephos, the Genomics England team decreased storage costs by 75% per genome compared with the previous environment. The organization is expected to reduce costs by 96% by 2023.

Just as important, the team has experienced these benefits without adding complexity. The new integrated storage environment makes it simple for researchers from around the world to store and access the genomic data they need for their work.

Case study for Genomics England
