Publication type: conference [total 59]

2010


On Deciding between Conservative and Optimistic Approaches on Massively Parallel Platforms
Christopher Carothers and Kalyan Perumalla
Winter Simulation Conference 2010
http://www.wintersim.org

Invited

Abstract: Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature to date. Nevertheless, few articles have focused on empirical studies of PDES performance on large supercomputer-based systems. This gap is bridged here, by undertaking a parameterized performance study on thousands of processor cores of a Blue Gene supercomputing system. In contrast to theoretical insights from analytical studies, our study is based on actual implementation in software, incurring the actual messaging and computational overheads for both conservative and optimistic synchronization approaches of PDES. Complex and counter-intuitive effects are uncovered and analyzed, with different event timestamp distributions and available levels of concurrency in the synthetic benchmark models. The results are intended to provide guidance to the PDES community in terms of how the synchronization protocols behave at high processor core counts using a state-of-the-art supercomputing system.





Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models
Kalyan Perumalla, Sudip Seal
24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010
http://www.pads-workshop.org/pads2010.html

Best Paper Finalist

Abstract: The spatial scale, runtime speed, and behavioral detail of epidemic outbreak simulations altogether require the use of large-scale parallel processing. Here, an optimistic parallel discrete event execution of a reaction-diffusion simulation model is presented. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the system are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundreds of millions in the largest case) are exercised.





Compiler-based Automation Approaches to Reverse Computation
Kalyan Perumalla, Christopher Carothers
24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010
http://www.pads-workshop.org/pads2010.html

Workshop on Reverse Computation

Abstract: Automation is useful to facilitate reverse code generation from normal code. Here, we describe our source-to-source compilation approaches to automatic reverse code generation, developed along three different translation tools/frameworks. At RPI, we are developing frameworks based on PIPS, which is a purely source-to-source translation and optimization tool for parallel computing, and on CLANG/LLVM, which is a full compiler complete with backend processing and optimizers. At ORNL, we are continuing development of the seminal reverse computation framework called the RCC (Reverse C Compiler) system. For all three systems, we present some implementation issues and challenges encountered in our development.




Efficient Simulation of Agent-Based Models on Multi-GPU and Multi-Core Clusters
Brandon G. Aaby, Kalyan S. Perumalla, Sudip K. Seal
3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010
http://www.simutools.org

Best Paper Finalist

Abstract: An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the trade-off. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.





µπ: A Scalable and Transparent System for Simulating MPI Programs
Kalyan Perumalla
3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010
http://www.simutools.org

Abstract: µπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of µπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by µπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, µπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, µsik. In the largest runs, µπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.





High-Performance Simulations for Capturing Feedback and Fidelity in Complex Networked Systems
Kalyan Perumalla
SIAM Conference on Parallel Processing for Scientific Computing (PP10) 2010
http://www.siam.org/meetings/pp10/

Abstract and Presentation in MS44 Computational Network Science

Abstract: In a variety of complex networked systems, simulation is a powerful method to capture critical feedback effects among inter-dependent processes. Network-based phenomena in areas such as cyberinfrastructure, transportation, epidemiology, and social networks all offer important analysis problems that need such feedback effects to be accurately captured. However, accurate modeling of feedback effects requires increased levels of model fidelity. Moreover, such high-fidelity, feedback-heavy models are especially characterized by very high computational needs. Against this backdrop, the need for high-fidelity simulations is illustrated, with examples of how they are driving new high-performance computing-based solutions in the aforementioned areas. Our parallel computing approaches are described in the context of very large-scale, high-fidelity simulations in regional-scale transportation network simulations, nation-scale epidemiological simulations, and Internet simulations with detailed models of millions of nodes.





Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal
National Evacuation Conference 2010
http://www.nationalevacuationconference.org

Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique, and its implementation on graphical processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at heretofore unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC.





2009


Cyber Security Experimentation: Gory Detail or None at All?
Kalyan Perumalla
SIAM Annual Meeting 2009
http://www.siam.org/meetings/an09/

Abstract and presentation

Abstract: Unique facets confronted by current cyber security analysis efforts are: tremendous pace of change of ground rules (axioms), apparently wide and deep phenomenological effects, and widely-varying interpretations of security objectives. Together, effective methods for cyber security analysis appear to be swung between two extremes: experimentation-based methods with full, gory detail, and abstraction-based methods with significant simplifications. In the case of methods in between, accuracy considerations make intermediate methods tend to swing rapidly back towards full gory detail, while scientific inquiry and efficiency considerations tend to swing them back towards abstractions. Based on our experience and past evidence, we argue that experimentation with gory detail is the most effective approach in the short- to medium-term, while the other extreme is the relevant one for the longer term. Feasibility will be shown of sustaining the scale and fidelity for the former extreme, namely, experiments with full gory detail.




Switching to High Gear: Opportunities for Grand-scale Real-time Parallel Simulations
Kalyan Perumalla
ACM International Symposium on Distributed Simulations and Real-time Applications 2009


Keynote

Abstract: The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher gear to exploit the new, immense levels of computational power. The potential for grand-scale real-time solutions is illustrated using preliminary results from prototypes in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) very large-scale epidemic modeling, (c) modeling the propagation of wireless network signals in very large, cluttered terrains, and, (d) country- or world-scale social behavioral modeling. We believe the stage is perfectly poised for the parallel/distributed simulation community to envision and formulate similar grand-scale, real-time simulation-based solutions in many application areas.



GPU-based Real-Time Execution of Vehicular Mobility Models in Large-Scale Road Network Scenarios
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath and Sudip Seal
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2009


Abstract: A methodology and its associated algorithms are presented for mapping a novel, field-based vehicular mobility model onto a graphical processing unit computational platform for simulating mobility in large-scale road networks. Of particular focus is the achievement of real-time execution, on desktop platforms, of vehicular mobility on road networks comprised of millions of nodes and links, and multi-million counts of simultaneously active vehicles. The methodology is realized in a system called GARFIELD, whose implementation details and performance study are described. The runtime characteristics of a prototype implementation are presented that show real-time performance in simulations of networks at the scale of a few states of the US road networks.





A Connectionist Modeling Approach to Rapid Analysis of Emergent Social Cognition Properties in Large-Populations
Kalyan S. Perumalla and Jack C. Schryver
Human Behavior-Computational Modeling and Interoperability Conference 2009


Abstract: Traditional modeling methodologies, such as those based on rule-based agent modeling, are exhibiting limitations in application to rich behavioral scenarios, especially when applied to large population aggregates. Here, we propose a new modeling methodology based on a well-known "connectionist approach," and articulate its pertinence in new applications of interest. This methodology is designed to address challenges such as speed of model development, model customization, model reuse across disparate geographic/cultural regions, and rapid and incremental updates to models over time.





Coping at the User-Level with Resource Limitations in the Cray Message Passing Toolkit MPI at Scale: How Not to Spend Your Summer Vacation
Richard Mills, Forrest Hoffman, Patrick Worley, Kalyan Perumalla, Art Mirin, Glenn Hammond and Barry Smith
Proceedings of Cray User Group Meeting 2009






Scalable Parallel Execution of an Event-based Radio Signal Propagation Model for Cluttered 3D Terrains
Sudip Seal and Kalyan Perumalla
Proceedings of International Conference on Parallel Processing 2009






2008


Efficient Execution on GPUs of Field-based Vehicular Mobility Models
Kalyan Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008






Data Parallel Execution Challenges and Runtime Performance of Agent Simulations on GPUs
Kalyan Perumalla and Brandon Aaby
Proceedings of Spring Computer Simulation Conference 2008


Best Paper Award





Parallel Vehicular Traffic Simulations using Reverse Computation-based Optimistic Execution
Srikanth Yoginath and Kalyan Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008






2007


Scaling Time Warp-based Discrete Event Execution to 10^4 Processors on a Blue Gene Supercomputer
Kalyan Perumalla
Proceedings of ACM Computing Frontiers 2007






An Analysis Approach to Large-Scale Vehicular Network Simulations
Kalyan Perumalla and Martin Beckerman
Proceedings of Summer Computer Simulation Conference 2007






2006


On Evaluation Needs of Real-life Sensor Network Deployments
Alfred Park, Kalyan S. Perumalla, Vladimir Protopopescu, Mallikarjun Shankar, Frank DeNap and Bryan Gorman
Proceedings of European Modeling and Simulation Symposium (EMSS) 2006






Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Kalyan Perumalla
Proceedings of Winter Simulation Conference (WSC) 2006






A Systems Approach to Scalable Transportation Network Modeling
Kalyan Perumalla
Proceedings of Winter Simulation Conference (WSC) 2006






Discrete-Event Execution Alternatives on GPGPUs
Kalyan S. Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006






On Accounting for the Interplay of Kinetic and Non-Kinetic Aspects of Population Mobility Models
Kalyan S. Perumalla
Proceedings of European Modeling and Simulation Symposium (EMSS) 2006






Parallel Execution of Region-Scale Evacuation Traffic Models
Kalyan S. Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006






Scalable Simulation of Electromagnetic Hybrid Codes
Kalyan S. Perumalla, Richard M. Fujimoto, Homa Karimabadi
Proceedings of International Conference on Computational Science 2006






2005


Virtual Simulator: An Infrastructure for Design and Performance-Prediction of Massively Parallel Codes
Kalyan Perumalla, Richard Fujimoto, Santosh Pande, Homa Karimabadi, Jonathan Driscoll, and Yuri Omelchenko
Proceedings of Eos Transactions, American Geophysical Union Fall Meeting 2005


Abstract and presentation

Abstract: Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for design and prediction of parallel codes. Performance prediction is useful to analyze, understand and experiment with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with minimal perturbation of performance) is useful to glean key metrics and understand application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code.





A New Simulation Technique for Study of Collision-less Shocks: Self Adaptive Simulations
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
Proceedings of 4th Annual International Astrophysics Conference (IGPP) 2005




A New Methodology for Multi-scale Simulation of Plasmas
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
Proceedings of 7th International Symposium for Space Simulations (ISSS) 2005




µsik - A Micro-kernel for Parallel/Distributed Simulation Systems
Kalyan S. Perumalla
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005






Performance Prediction of Large-scale Parallel Discrete Event Models of Physical Systems
Kalyan S. Perumalla, Richard M. Fujimoto, Prashant Thakare, Santosh Pande, Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll
Proceedings of Winter Simulation Conference (WSC) 2005






Optimistic Parallel Discrete Event Simulations of Physical Systems using Reverse Computation
Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005






2004


Conservative Synchronization of Large-scale Network Simulations
Alfred Park, Richard Fujimoto and Kalyan Perumalla
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2004






A New Approach to Modeling Physical Systems: Discrete Event Simulations of Grid-based Models
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, N. Omidi, Richard Fujimoto and Kalyan Perumalla
Proceedings of Workshop on State-Of-The-Art in Scientific Computing (PARA) 2004






High Fidelity Modeling of Computer Network Worms
Kalyan Perumalla and Srikanth Sundaragopalan
Proceedings of Annual Computer Security Applications Conference (ACSAC) 2004






2003


Scalable RTI-based Parallel Simulation of Networks
Kalyan Perumalla, Alfred Park, Richard Fujimoto and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003






Large-Scale Network Simulation - How Big? How Fast?
Richard Fujimoto, Kalyan Perumalla, Alfred Park, Hao Wu, Mostafa Ammar, and George Riley
Proceedings of IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunication Systems (MASCOTS) 2003






Power-aware State Dissemination in Mobile Distributed Virtual Environments
Weidong Shi, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003






2002


Using Reverse Circuit Execution for Efficient Parallel Simulation of Logic Circuits
Kalyan Perumalla, and Richard Fujimoto
Proceedings of The International Society for Optical Engineering (SPIE) Annual Meeting 2002






Experiences Applying Parallel and Interoperable Network Simulation Techniques in On-line Simulations of Military Networks
Kalyan Perumalla, Richard Fujimoto, Thom McLean and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002






Web Services for Extensible Modeling and Simulation
Kalyan S. Perumalla
Proceedings of Workshop on Extensible Modeling and Simulation Framework (XMSF) 2002






Updateable Simulations
Steve Ferenci, Richard Fujimoto, Mostafa Ammar, Kalyan Perumalla and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002






2001


Distributed Network Simulations using the Dynamic Simulation Backplane
George Riley, Mostafa Ammar, Richard Fujimoto, Donghua Xu and Kalyan Perumalla
Proceedings of the International Conference on Distributed Computing Systems (ICDCS), April 2001






Virtual Time Synchronization over Unreliable Network Transport
Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2001






2000


Design of High-performance RTI software
Richard Fujimoto, Thom McLean, Kalyan Perumalla and Ivan Tacic
Proceedings of Distributed Simulations and Real-time Applications (DS-RT) 2000






An Approach to Federating Parallel Simulators
Steve Ferenci, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2000






1999


The Effect of State Saving in Optimistic Simulation on a Cache-coherent Non-uniform Memory Access (CC-NUMA) Architecture
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference 1999






Efficient Optimistic Parallel Simulation using Reverse Computation
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 1999






PARINO: A Parallel Branch and Cut Code
Jeff Linderoth, Kalyan Perumalla and Martin Savelsbergh
Proceedings of INFORMS National Meeting 1999




1998


Efficient Large-scale Process-oriented Parallel Simulations
Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference (WSC) 1998






1997


Time Parallel Generation of Self-similar ATM Traffic
Ioannis Nikolaidis, Anthony Cooper, Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference (WSC) 1997






A Virtual PNNI Network Testbed
Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt
Proceedings of the Winter Simulation Conference (WSC) 1997






PARINO, A Parallel Integer Optimizer
Martin Savelsbergh, Kalyan Perumalla, Jeff Linderoth, and Umakishore Ramachandran
Proceedings of International Symposium on Mathematical Programming 1997






1996


An Efficiency Prediction Method for ATM Multiplexer
Kalyan Perumalla, Anthony Cooper, Richard Fujimoto
Proceedings of Broadband Communications 1996




1994


Parallelizing Sequential Algorithms for the Generalized Assignment Problem
Ivan Yanasak, Gautam Shah, Kalyan Perumalla, et al.
DIMACS Challenge of Parallel Computing 1994




Parallel Algorithms for Maximum Sub-sequence and Sub-array
Kalyan Perumalla and Narsingh Deo
Proceedings of International Conference on Combinatorics, Graph Theory and Computing 1994




1993


Integrating Aggregate and Vehicle Level Simulations
Clark Karr, Robert Francescini and Kalyan Perumalla
Proceedings of the 3rd Conference on Computer Generated Forces and Behavioral Representation 1993




A Distributed Algorithm for Ear Decomposition
Sridhar Hannenhalli, Kalyan Perumalla and Narayan Chandrasekharan
Proceedings of International Conference on Computing and Information (ICCI) 1993




A Debugging Environment for PVM
Uday Vemulapati and Kalyan Perumalla
Distributed Computing for Aeroscience Applications 1993




1992


Integrating Battlefield Simulations of Different Granularity
Clark Karr, Robert Francescini and Kalyan Perumalla
Proceedings of the Southeastern Simulation Conference 1992






Copyright © Perumalla 2009-2010