R&D: Disk-Based Archival Storage System Using EOS Erasure Coding Implementation for ALICE Experiment at CERN LHC
Discuss CDS system design based on JBOD products, performance limitations, and data protection strategy accommodating EOS EC implementation, and present CDS operations for ALICE experiment and long-term power consumption measurement.
This is a Press Release edited by StorageNewsletter.com on September 20, 2022 at 2:00 pm
Journal of Information Science Theory and Practice has published an article written by Ahn, Sang Un, Korea Institute of Science and Technology Information (KISTI), and Global Science experimental Data hub Center (GSDC), Betev, Latchezar, Bonfillou, Eric, European Organization for Nuclear Research (CERN), Han, Heejune, Kim, Jeongheon, Lee, Seung Hee, Korea Institute of Science and Technology Information (KISTI), Panzer-Steindel, Bernd, Peters, Andreas-Joachim, European Organization for Nuclear Research (CERN), and Yoon, Heejun, Korea Institute of Science and Technology Information (KISTI).
Abstract: “Korea Institute of Science and Technology Information (KISTI) is a Worldwide LHC Computing Grid (WLCG) Tier-1 center mandated to preserve raw data produced from A Large Ion Collider Experiment (ALICE) experiment using the world’s largest particle accelerator, the Large Hadron Collider (LHC) at European Organization for Nuclear Research (CERN). Physical medium used widely for long-term data preservation is tape, thanks to its reliability and least price per capacity compared to other media such as optical disk, hard disk, and solid-state disk. However, decreasing numbers of manufacturers for both tape drives and cartridges, and patent disputes among them escalated risk of market. As alternative to tape-based data preservation strategy, we proposed disk-only erasure-coded archival storage system, Custodial Disk Storage (CDS), powered by Exascale Open Storage (EOS), an open-source storage management software developed by CERN. CDS system consists of 18 high density Just-Bunch-Of-Disks (JBOD) enclosures attached to 9 servers through 12Gbps Serial Attached SCSI (SAS) Host Bus Adapter (HBA) interfaces via multiple paths for redundancy and multiplexing. For data protection, we introduced Reed-Solomon (RS) (16, 4) Erasure Coding (EC) layout, where the number of data and parity blocks are 12 and 4 respectively, which gives the annual data loss probability equivalent to 5×10⁻¹⁴. In this paper, we discuss CDS system design based on JBOD products, performance limitations, and data protection strategy accommodating EOS EC implementation. We present CDS operations for ALICE experiment and long-term power consumption measurement.”
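To make the RS (16, 4) layout in the abstract concrete, the short Python sketch below works out what 12 data blocks plus 4 parity blocks imply for usable capacity and fault tolerance. The function names are illustrative and do not come from EOS or from the paper. The second function is a deliberately simplified binomial model that assumes independent block failures with a hypothetical per-block loss probability `p_block`. It is not the authors' reliability model and is not expected to reproduce the annual data loss probability quoted in the abstract.

```python
from math import comb


def ec_layout(total_blocks: int, parity_blocks: int) -> dict:
    """Summarise an erasure-coding layout with `total_blocks` blocks per stripe."""
    data_blocks = total_blocks - parity_blocks
    return {
        "data_blocks": data_blocks,
        "parity_blocks": parity_blocks,
        # Fraction of raw capacity that holds user data.
        "usable_fraction": data_blocks / total_blocks,
        # Extra raw capacity needed per unit of user data.
        "overhead": parity_blocks / data_blocks,
        # A Reed-Solomon stripe survives the loss of up to `parity_blocks` blocks.
        "tolerated_losses": parity_blocks,
    }


def stripe_loss_probability(total_blocks: int, parity_blocks: int, p_block: float) -> float:
    """Probability that more than `parity_blocks` blocks of one stripe are lost.

    Assumption (not from the paper): each block is lost independently with
    probability `p_block` within the window of interest.
    """
    n = total_blocks
    return sum(
        comb(n, k) * p_block**k * (1.0 - p_block) ** (n - k)
        for k in range(parity_blocks + 1, n + 1)
    )


if __name__ == "__main__":
    layout = ec_layout(total_blocks=16, parity_blocks=4)
    print(layout)
    # Hypothetical per-block loss probability, chosen only for illustration.
    print(stripe_loss_probability(16, 4, p_block=1e-3))
```

With this layout, 75% of the raw disk capacity holds data, and a stripe stays readable as long as no more than 4 of its 16 blocks are lost at the same time.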
This work was supported by the National Research Foundation of Korea (NRF) through contract N-22-NM-CR02 and the Program of Data Computing Service for Large-scale Experimental Data (K-22-L02-C02).