Simulating Billion-Task Parallel Programs

Abstract

In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. Via the direct execution technique combined with parallel discrete event simulation, we stretch the limits of the top-down approach by simulating message passing interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, a proof-of-concept scaling level is achieved to over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, representing one of the largest direct execution simulations to date, combined with a multiplexing ratio of 1024 simulated tasks per real task.

[Pub 143]

http://atc.udg.edu/SPECTS2014/

Kalyan Perumalla
Kalyan Perumalla

Kalyan Perumalla is Founder and President of Discrete Computing, Inc. He led advanced research and development at ORNL and holds senior faculty appointments at UTK, GT, and UNL.

Next
Previous

Related