Designed to help meet the growing need for GPUs in the Ohio Supercomputer Center (OSC) user community, Ascend is a small system equipped with around a hundred Nvidia GPUs that weighs in at around 2 peak petaflops, complementing OSC’s more traditional HPC systems.
Ascend is a Dell PowerEdge system with 24 nodes. Each node has dual AMD Epyc Milan 7643 CPUs (for a total of 48); four Nvidia A100 (80GB) GPUs (for a total of 96); and 921GB of usable memory (for a total of ~22TB). The system uses Nvidia IB (200Gb/s) networking and is estimated at 1.95 peak petaflops; OSC hasn’t submitted Linpack benchmarks since the launch of its Owens system in 2016. Speaking of Owens, both it and the more recent Pitzer system (2018) are still in play at OSC: both are Dell-built and powered by Intel and Nvidia hardware, each consists of hundreds of nodes, and OSC says that together they total around 5.5 peak petaflops.
OSC’s more traditional HPC systems, Owens (left) and Pitzer (right)
OSC is presenting Ascend as a complementary, immediate solution for its user community. While Ascend might represent a ~36% increase in terms of peak petaflops, the center says that the new cluster (the center stops short of calling it an HPC system) triples the center’s capacity for AI, modeling and simulation.
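The headline figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article. The constant and function names are illustrative, and it assumes decimal units (1TB = 1000GB). Because the 5.5-petaflop figure for Owens and Pitzer is itself rounded, the computed increase lands near 35%, close to the ~36% cited.

```python
# Figures quoted in the article; names are illustrative, not OSC's.
NODES = 24
CPUS_PER_NODE = 2            # dual AMD Epyc Milan 7643
GPUS_PER_NODE = 4            # Nvidia A100 (80GB)
MEM_PER_NODE_GB = 921        # usable memory per node
ASCEND_PEAK_PF = 1.95        # estimated peak petaflops
EXISTING_PEAK_PF = 5.5       # Owens + Pitzer combined (rounded)


def ascend_totals():
    """Return system-wide totals derived from the per-node figures."""
    return {
        "cpus": NODES * CPUS_PER_NODE,
        "gpus": NODES * GPUS_PER_NODE,
        # Assumption: decimal terabytes (1 TB = 1000 GB).
        "memory_tb": NODES * MEM_PER_NODE_GB / 1000,
        # Ascend's peak as a percentage of the existing systems' peak.
        "peak_increase_pct": 100 * ASCEND_PEAK_PF / EXISTING_PEAK_PF,
    }


if __name__ == "__main__":
    for name, value in ascend_totals().items():
        print(f"{name}: {value:.2f}")
```

Peak petaflops is only one yardstick, which is why a roughly one-third increase in that number can sit alongside OSC’s claim of a tripling in AI capacity.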
Ascend in context, flanked by OSC’s other HPC resources
“OSC developed Ascend in response to discussions with our client community, stakeholders and vendors, who identified an immediate need for greater GPU resources to process research and simulations that rely on AI, big data and ML,” said David Hudak, OSC’s executive director. “We are pleased to be able to offer this major new resource to the HPC community and support client advancements in academic research and commercial technologies.”
One early access user, OSC assistant professor Yu Su, reported positive results, running one of the largest neural network models – BLOOM-176B – on OSU hardware for the first time and highlighting student experiences of 2x to 3x faster processing compared to the older hardware.
One graduate student, Bargeen Turzo, reported differences in protein prediction: “For some large proteins I was not even able to get a single prediction on Pitzer after running the calculation for multiple weeks. While on Ascend the same calculation finished in 12 hours.”
Ascend was installed last fall, with OSC testing the system from October through December with other early access users like Su.
“Part of the goal of the early-user period was to get a better understanding of how the user applications make use of the GPUs that we are supporting in the system,” explained Doug Johnson, associate director, OSC. “We will continue to improve the software and management of the system as we learn more from what we encounter supporting the early users and operating the system for a longer period of time.”
“OSC’s client services and scientific applications teams will be available to help our clients determine if their applications can make good use of the Ascend GPUs,” he added. “For some applications there is a large performance benefit for using the GPUs and Ascend will make it possible for our clients to tackle some problems that can’t be solved on our current systems.”
OSC also shared a peek at its future plans, noting that it intends to replace its Owens cluster (at seven years old, an old-timer by HPC standards) this year, running Owens and its replacement concurrently for a period before phasing out Owens.
Last month, OSC celebrated its 35-year anniversary.
“It’s our job to constantly be on the cutting edge of technology, evaluating and deploying it, and making it available here in Ohio,” Hudak said on that occasion. “Beyond simply providing the technology, though, we also make it uniquely flexible, affordable and easy to access thanks to 35 years of experience pushing Ohio’s capabilities forward.”