Improving Multi-Million Virtual Rank MPI Execution in MUPI

Abstract

MUPI is a parallel discrete event simulator designed for software-based experimentation through simulated execution of parallel programs, ranging from synthetic to unmodified, that use the Message Passing Interface (MPI) with millions of tasks. Here, we report work in progress on improving the efficiency of MUPI. Among the issues uncovered are scaling problems in implementing barriers and inter-task message ordering. Preliminary performance results indicate that hundreds of virtual MPI ranks can be supported per real processor core. We observe performance improvements of at least 2x, which enable benchmark MPI runs with over 16 million virtual ranks, synchronized in a discrete event fashion, on as few as 16,128 real cores of a Cray XT5.

[Pub 131]

http://pdcc.ntu.edu.sg/mascots2011/

Kalyan Perumalla

As a Federal Program Manager in Advanced Scientific Computing Research at the U.S. Dept. of Energy, Office of Science, Kalyan Perumalla manages a $100-million R&D portfolio covering AI, HPC, Quantum, SciDAC, and Basic Computer Science. In his 25-year R&D leadership experience, he previously led advanced R&D as Distinguished Research Staff Member at the Oak Ridge National Laboratory (ORNL) developing scalable software and applications on the world’s largest supercomputers for 17 years, including as a line manager and a founding group leader. He has held senior faculty and adjunct appointments at UTK, GT, and UNL, and was an IAS Fellow at Durham University.
