CERN Storage Gets Ready for Run 3

Extensive consolidation of storage infrastructure culminated in 2020 when Tape Archive software entered production.

The output, or product, of CERN’s experimental programme is its data: vast amounts of data are produced by the detectors in the LHC experiments and elsewhere in the accelerator complex.

On 29 June, the CERN Tape Archive (CTA) officially entered production after 83PB of ATLAS data initially stored in CASTOR were migrated to CTA.

These data are curated by the IT department’s storage group for reconstruction and analysis by physicists across the world, using the Worldwide LHC Computing Grid (WLCG).

The custodial copy of all of CERN’s physics data – amounting to about 340PB – is stored on magnetic tapes at the CERN Data Centre, also called the WLCG ‘Tier-0’.

During Runs 1 and 2 of the LHC, the software system used to manage the archival storage of physics data was CASTOR, the CERN storage manager, which was conceived as a system to manage both disk and tape storage. Over the last 10 years, requirements have evolved and a new disk system – EOS – was developed for online storage and data analysis. As EOS does not itself provide offline storage and data archival, a new project, the CERN Tape Archive (CTA), was conceived as the tape back-end to EOS. CTA is an evolution of the CASTOR tape system that removes the need to maintain a second disk-management system.

As part of the storage consolidation, a new tape library was installed in the CERN Data Centre in August 2020.

In early 2020, the CTA team started commissioning tests with the ATLAS experiment as part of a reprocessing campaign on all of their experimental data from Run 2. During this exercise, raw data are recalled from Tier-0 tapes or from the WLCG Tier-1 data centres (which are national data centres that curate a proportional share of the LHC data) in order to be ‘reconstructed’ into meaningful physics data that can be analysed. CTA replaced CASTOR for the tape-recall part of the exercise and had the lowest error rate of all sites, demonstrating performance and reliability for large-volume transfers.

Production for CTA was delayed by the disruption caused by Covid-19 and the subsequent move to teleworking, compelling the CTA team and ATLAS data-management team to adjust to new ways of communicating and scheduling tests. Nonetheless, the integration work and final commissioning tests were marked by a spirit of cooperation, which eventually allowed the migration of ATLAS’s data to take place during the last two weeks of June. This involved transferring the metadata (as opposed to the physical data, which do not move) of 86 million files – the entirety of ATLAS’s physics output – from CASTOR to CTA. After the migration of ATLAS data was completed, CTA entered into operation on 29 June.
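
The difference between migrating metadata and moving the data themselves can be illustrated with a minimal Python sketch. It is not how CTA or CASTOR are implemented: the record fields, the function name and the file paths below are all hypothetical, and the real catalogues are databases with their own schemas. The sketch only shows the idea that each record says where a file sits on tape, so copying the records to a new catalogue leaves the tapes untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeFileEntry:
    """Hypothetical catalogue record describing where a file lives on tape."""
    path: str        # logical file name as seen by the experiment
    tape_label: str  # cartridge holding the file (illustrative field)
    position: int    # location of the file on that cartridge (illustrative field)
    size_bytes: int


def migrate_metadata(old_catalogue, new_catalogue):
    """Copy every record from the old catalogue into the new one.

    Only the records are copied; tape_label and position are carried
    over unchanged, so nothing is read from or written to tape.
    Returns the number of records copied in this call.
    """
    migrated = 0
    for path, entry in old_catalogue.items():
        if path in new_catalogue:
            continue  # already present, so a re-run copies nothing twice
        new_catalogue[path] = entry
        migrated += 1
    return migrated


# Two made-up records standing in for the old catalogue.
old = {
    "/atlas/raw/run1.dat": TapeFileEntry("/atlas/raw/run1.dat", "T00001", 12, 2_000_000_000),
    "/atlas/raw/run2.dat": TapeFileEntry("/atlas/raw/run2.dat", "T00001", 13, 1_500_000_000),
}
new = {}
count = migrate_metadata(old, new)
```

After the call, the new catalogue refers to the same cartridge and position as the old one for each file, which is why a migration of this kind can cover tens of millions of files without rewriting any tape.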

All other CERN experiments, from the large LHC detectors to the smallest experiments, will migrate to CTA from October onwards, with ALICE the first to follow. This will allow CASTOR disk servers to be recovered and repurposed.

Besides the introduction of CTA, CERN is also preparing for Run 3 by installing a new tape library in the data centre and by upgrading FTS, its File Transfer Service. FTS, which distributes the majority of LHC data across the WLCG infrastructure, benefited from several performance improvements. It is now also supporting CTA and is used by more than 25 experiments at CERN and in other data-intensive sciences.

