CloneX
CloneX (Cloned eXecution of simulations) is a decision-making system that incrementally evaluates millions of what-if scenarios and scales to thousands of GPUs.
Overview
CloneX is a novel system we designed for scalable simulation cloning, consisting of a conceptual simulation framework, algorithmic foundations, and a highly scalable runtime interface. It efficiently and dynamically creates whole logical copies of a dynamic tree of simulations across a large parallel system without full physical duplication of computation and memory.
CloneX efficiently simulates a tree of what-if scenarios that unfold during the course of a normal (base) simulation. Cloned execution is hard to realize on large distributed-memory computing platforms because the computational load varies dynamically across clones and because complex dependencies span the clone tree.
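The idea of a logical copy without full physical duplication can be illustrated with a small copy-on-write sketch. This is not CloneX's actual implementation or interface: the `CloneState` class, its methods, and the block-indexed state are hypothetical names chosen for illustration, and the sketch runs in a single process rather than across GPUs. Each clone keeps only the blocks it has written and falls back on shared, frozen ancestor layers for everything else.

```python
class CloneState:
    """Hypothetical state of one simulation clone: a sparse overlay on shared layers."""

    def __init__(self, parent=None):
        self.parent = parent  # frozen ancestor layer shared with other clones
        self.local = {}       # block id -> value written by this clone only

    def read(self, block):
        # Walk up the clone tree until some layer holds the block.
        node = self
        while node is not None:
            if block in node.local:
                return node.local[block]
            node = node.parent
        raise KeyError(block)

    def write(self, block, value):
        # Copy-on-write: the change stays private to this clone.
        self.local[block] = value

    def clone(self):
        # Freeze the current writes into a shared layer, then let both the
        # original and the new clone continue as empty overlays on top of it.
        # No block values are copied.
        frozen = CloneState(self.parent)
        frozen.local = self.local
        self.parent = frozen
        self.local = {}
        return CloneState(parent=frozen)


# Base simulation: four blocks, all at the same initial value.
base = CloneState()
for block in range(4):
    base.write(block, 20.0)

# Two nested what-if scenarios branching off the base run.
what_if_a = base.clone()
what_if_a.write(2, 95.0)
what_if_b = what_if_a.clone()
what_if_b.write(0, 5.0)

# Later changes in the base run do not leak into existing clones.
base.write(1, 30.0)
```

Here `what_if_b` physically stores a single block, yet reading any of the four blocks from it returns a consistent view of its scenario. A distributed runtime would also have to handle the load imbalance and cross-clone dependencies described above, which this sketch does not attempt.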
CloneX has been tested on thousands of GPUs of a supercomputing system and evaluated with multiple benchmarks (heat diffusion, forest fire, and disease propagation models), delivering a speedup of over two orders of magnitude compared to replicated runs.
CloneX represents a major leap in ensemble simulations: a significantly faster and more scalable way to execute many what-if scenarios of large simulations.
Organization
- Sponsors: US Department of Energy (DOE)
- Office: Advanced Scientific Computing Research (ASCR)
- Program: Early Career Research Program (ECRP)
- Office: ORNL Strategic Planning Office
- Program: Laboratory-Directed Research and Development (LDRD)
- Office: Advanced Scientific Computing Research (ASCR)
- Period: 2015-2017
Related Publications