Efficient Parallel Discrete Event Simulation on Cloud/Virtual Machine Platforms

Abstract

Cloud and virtual machine (VM) technologies present new challenges with respect to performance and monetary cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the traditional use of the highest-end computing configuration is no longer the most obvious choice. Moreover, the unique runtime dynamics and configuration choices of Cloud and VM platforms introduce new design considerations and runtime characteristics specific to PDES over Cloud/VMs. Here, an empirical study is presented to guide an understanding of the dynamics, trends, and trade-offs in executing PDES on Cloud/VM platforms. Performance and cost measures obtained from multiple PDES applications executed on the Amazon EC2 Cloud and on a high-end VM host machine reveal new, counterintuitive VM–PDES dynamics and guidelines. One of the critical aspects uncovered is the fundamental mismatch between hypervisor scheduler policies designed for general cloud workloads and the virtual time ordering needed for PDES workloads. This insight is supported by experimental data revealing the gross deterioration in PDES performance traceable to VM scheduling policy. To overcome this fundamental problem, the design and implementation of a new deadlock-free scheduler algorithm are presented, optimized specifically for PDES applications on VMs. The scalability of our scheduler has been tested up to 128 VMs multiplexed on 32 cores, showing significant improvement in the runtime relative to the default Cloud/VM scheduler. The observations, algorithmic design, and results are timely for emerging cloud/VM-based installations, highlighting the need for PDES-specific support in high-performance discrete event simulations on Cloud/VM platforms.

[Pub 149]

http://tomacs.acm.org/

Kalyan Perumalla
R&D Manager

As a Federal Program Manager in Advanced Scientific Computing Research at the U.S. Dept. of Energy, Office of Science, Kalyan Perumalla manages a $100-million R&D portfolio covering AI, HPC, Quantum, SciDAC, and Basic Computer Science. He previously led advanced research and development as Distinguished Research Staff Member at the Oak Ridge National Laboratory (ORNL), where he spent 17 years developing scalable software and applications on the world’s largest supercomputers. He also held senior faculty and adjunct appointments at UTK, GT, and UNL, and was a Fellow at the Institute of Advanced Studies in Durham University, UK.
