HPC Providers Develop UCX Network Communication Framework for Next Generation Programming Models

For high-performance and data-centric applications

UCX (Unified Communication – X) will provide platform abstractions supporting various communication technologies for high-performance compute and data platforms.

Mellanox Technologies, Ltd. announced a collaboration with the Department of Energy’s Oak Ridge National Laboratory (ORNL), IBM Corporation, the University of Tennessee, NVIDIA Corporation, and industry leaders, laboratories and academia to develop an open-source network communication framework for high-performance and data-centric applications.

Traditionally, three mainstream communication frameworks have supported various interconnect technologies and programming languages: MXM, developed by Mellanox Technologies; PAMI, developed by IBM; and UCCS, developed by ORNL, the University of Houston, and the University of Tennessee. UCX will unify the strengths and capabilities of these communication libraries and optimize them into a single communication framework that delivers essential building blocks for the development of a high-performance communication ecosystem.

UCX architecture

“As we drive towards next-generation, larger-scale systems, the UCX project enables the research needed for emergent exascale programming models that are agnostic to the underlying interconnect and acceleration technology,” said Dr. Arthur Bernard Maccabe, division director, computer science and mathematics division, Oak Ridge National Laboratory.

“Mellanox is very happy to participate in the co-design efforts of the UCX project. By providing our advancements in shared memory, MPI and underlying network transport technologies, we can continue to advance open standards-based networking and programming models,” said Gilad Shainer, VP marketing, Mellanox. “UCX will provide optimizations for lower software overhead in communication paths that will allow cross-platform, near native-level interconnect performance. The framework interface will expose semantics that target not only HPC programming models, but data-centric applications as well. It will also enable vendor-independent development of the library.”

“UCX is clearly a strategic open-source communication framework for future high-performance systems,” said Jim Sexton, IBM fellow and director, data centric systems, IBM. “We are eager to collaborate on UCX with our key OpenPOWER and university partners. In particular, IBM is contributing key innovations from our PAMI high-performance messaging software already in use in several Top10 supercomputing systems.”

“UCX is intended to make it faster and easier to add Tesla Accelerated Computing Platform technologies, including GPUDirect RDMA and the NVLink high-speed interconnect, to the HPC communications stack,” said Duncan Poole, director, platform alliances, NVIDIA. “We look forward to working with the UCX members to bring new levels of HPC solutions to HPC.”

“The path to Exascale, in addition to many other challenges, requires programming models where communications and computations unfold together, collaborating instead of competing for the underlying resources. In such an environment, providing holistic access to the hardware is a major component of any programming model or communication library. With UCX, we have the opportunity to provide not only a vehicle for production quality software, but also a low-level research infrastructure for more flexible and portable support for the Exascale-ready programming models,” adds George Bosilca, research director, innovative computing laboratory, University of Tennessee, Knoxville, TN.

“By serving as a high-performance, low-latency communication layer, UCX will enable us to provide applications developers with productive, extreme-scale programming languages and libraries, including Partitioned Global Address Space (PGAS) APIs, such as Fortran Coarrays and OpenSHMEM, as well as OpenMP across multiple memory domains and on heterogeneous nodes,” said Professor Barbara Chapman, director, CACDS, University of Houston.

UCX-OFA overview

The UCX collaboration will be guided by a High-Performance Computing Leadership Team that includes: Dr. Arthur Bernard Maccabe, Division Director, Computer Science and Mathematics Division, Oak Ridge National Laboratory; Donald Becker, Tesla System Architect, NVIDIA; Dr. George Bosilca, Research Director, Innovative Computing Laboratory, University of Tennessee; Richard Graham, Senior Solutions Architect, Mellanox Technologies; Dr. Sameer Kumar, Research Scientist, Deep Computing and HPC Systems, IBM India Research Lab; Stephen Poole, CTO, Open Software System Solutions; Gilad Shainer, VP Marketing, Mellanox Technologies; and Dr. Sameh Sharkawi, Team Lead, Parallel Environment MPI Middleware, IBM.

Information on UCX collaboration.

The UCX project at ORNL is funded by the United States Department of Defense and uses resources of the Extreme Scale Systems Center located at ORNL. This project is being developed using resources of the Oak Ridge Leadership Computing Facility at ORNL, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
