
Argonne Team Breaks Globus Data Movement Record With 2.9PB File Transfer

The transfer, run on the Summit HPC system at Oak Ridge, is the largest ever for Globus and involves three of the largest known cosmological simulations.

Globus, a research data management service and a trademark of the University of Chicago, announced the largest single file transfer in its history: a team led by Argonne National Laboratory scientists moved 2.9PB of data as part of a research project involving three of the largest cosmological simulations to date.


“Storage is in general a very large problem in our community – the Universe is just very big, so our work can often generate a lot of data,” explained Dr. Katrin Heitmann, Argonne physicist and computational scientist and an Oak Ridge Leadership Computing Facility (OLCF) early science user. “Using Globus to easily move the data around between different storage solutions and institutions for analysis is essential.”

The data in question was stored on Summit, the HPC system at OLCF and currently the world’s fastest supercomputer according to the Top500 list published June 18, 2019. Globus was used to move the files from disk to tape, a key use case for researchers.

“Due to its uniqueness, the data is very precious and the analysis will take time,” said Heitmann. “The first step after the simulations were finished was to make a backup copy of the data to HPSS, so we can move the data back and forth between disk and tape and thus carry out the analysis in steps. We use Globus for this work due to its speed, reliability, and ease of use.”

“With exascale imminent, AI on the rise, HPC systems proliferating, and research teams more distributed than ever, fast, secure, reliable data movement and management are now more important than ever,” said Ian Foster, Globus co-founder and director of Argonne’s data science and learning division. “We tend to take these functions for granted, and yet modern collaborative research would not be possible without them.”

“Globus has underpinned groundbreaking research for decades. We could not be prouder of our role in helping scientists do their world-changing work, and we’re happy to see projects like this one continue to push the boundaries of what Globus can achieve. Congratulations to Dr. Heitmann and team.”

When it comes to data transfer performance, “the most important part is reliability,” says Heitmann. “It is basically impossible for me as a user to check the very large amounts of data upon arrival after a transfer has finished. The analysis of the data often uses a subset of the data, so it would take quite a while until bad data would be discovered and at that point we might not have the data anymore at the source. So the reliability aspects of Globus are key.

“Of course, speed is also important. If the transfers were very slow, given the amount of data we transfer, we would have had a problem. So it’s good to be able to rely on Globus for fast data movement as well. We are also grateful to Oak Ridge for access to Summit and for their excellent setup of data transfer nodes enabling the use of Globus for HPSS transfers. This work would not have been possible otherwise.”

For details, read the Q&A blog with Dr. Heitmann.
