2010


On Deciding between Conservative and Optimistic Approaches on Massively Parallel Platforms
Christopher Carothers and Kalyan Perumalla
Winter Simulation Conference 2010
http://www.wintersim.org

Invited

Abstract: Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature to date. Nevertheless, few articles have focused on empirical studies of PDES performance on large supercomputer-based systems. This gap is bridged here, by undertaking a parameterized performance study on thousands of processor cores of a Blue Gene supercomputing system. In contrast to theoretical insights from analytical studies, our study is based on actual implementation in software, incurring the actual messaging and computational overheads for both conservative and optimistic synchronization approaches of PDES. Complex and counter-intuitive effects are uncovered and analyzed, with different event timestamp distributions and available levels of concurrency in the synthetic benchmark models. The results are intended to provide guidance to the PDES community in terms of how the synchronization protocols behave at high processor core counts using state-of-the-art supercomputing systems.





Supercomputing Applications of the Other Kind: Real-time Parallel Discrete Event Simulations of Large-scale, Smart Infrastructures
Kalyan Perumalla
IBM T J Watson Research Center, Yorktown Heights, New York 2010


Abstract: Ultra-scale supercomputing hardware is a reality, reaching peta-scale recently and now moving to exa-scale. A rich class of applications has however remained largely untapped to reap supercomputing benefits, namely, parallel discrete event simulations (PDES), partly due to technical challenges, and partly awaiting compelling applications. The recent emergence of new visions of a "smarter" and more agile societal operation has clearly opened new large-scale applications that are directly formulated as grand-scale PDES scenarios executed in an on-line, real-time fashion. Our presentation will focus on highlighting the potential for grand-scale real-time solutions, illustrated using our preliminary efforts in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) country- or world-scale epidemic modeling, (c) cyber infrastructure and security operations at the scale of multiple autonomous systems, and, (d) country- or world-scale social behavioral modeling. We believe the context is ripe to envision and formulate similar grand-scale solutions in many additional application areas. The technical vision to progress to such an ambitious goal, however, is highly challenging. We outline some of the salient technical issues, challenges, and potential solution directions in meeting the scale and speed demanded by this "other kind" of supercomputing applications.



Bcyclic: A Parallel Block Tri-diagonal Matrix Cyclic Solver
Steven Hirshman, Kalyan Perumalla, Vickie Lynch and Raul Sanchez
Journal of Computational Physics 2010
http://www.sciencedirect.com/science/journal/00219991

Accepted April 30, 2010

Abstract: A block tri-diagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.





Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models
Kalyan Perumalla, Sudip Seal
24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010
http://www.pads-workshop.org/pads2010.html

Best Paper Finalist

Abstract: The spatial scale, runtime speed, and behavioral detail of epidemic outbreak simulations together require the use of large-scale parallel processing. Here, an optimistic parallel discrete event execution of a reaction-diffusion simulation model is presented. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the system are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundreds of millions in the largest case) are exercised.
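
Illustrative sketch (not taken from the paper): the general idea of combining reverse computation with incremental state saving can be shown with a toy event handler. The class `Cell`, its fields, and the event layout below are hypothetical names chosen only for illustration. Updates that can be undone arithmetically are reversed by computation, while an overwritten value, which cannot be recomputed, is pushed onto a small save log.

```python
class Cell:
    """Hypothetical model state for one region (illustration only)."""
    def __init__(self):
        self.infected = 0        # updated constructively (+=), so reversible by -=
        self.last_source = None  # overwritten destructively, so it must be saved


def forward(cell, event, saved):
    """Apply an event; log only what reverse computation cannot recover."""
    cell.infected += event["new_cases"]   # reversible by arithmetic alone
    saved.append(cell.last_source)        # incremental state saving of the old value
    cell.last_source = event["source"]


def reverse(cell, event, saved):
    """Undo forward() exactly, in the opposite order of its statements."""
    cell.last_source = saved.pop()        # restore the overwritten value
    cell.infected -= event["new_cases"]   # reverse computation


if __name__ == "__main__":
    cell, log = Cell(), []
    events = [{"new_cases": 5, "source": "A"}, {"new_cases": 3, "source": "B"}]
    for e in events:
        forward(cell, e, log)
    for e in reversed(events):            # a rollback undoes events newest-first
        reverse(cell, e, log)
    print(cell.infected, cell.last_source)
```

In an optimistic simulator a rollback would invoke the reverse handlers in the opposite order of the forward ones, as the demo loop does; only the overwritten field costs memory per event.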





Compiler-based Automation Approaches to Reverse Computation
Kalyan Perumalla, Christopher Carothers
24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010
http://www.pads-workshop.org/pads2010.html

Workshop on Reverse Computation

Abstract: Automation is useful to facilitate reverse code generation from normal code. Here, we describe our source-to-source compilation approaches to automatic reverse code generation, developed along three different translation tools/frameworks. At RPI, we are developing frameworks based on PIPS, which is a purely source-to-source translation and optimization tool for parallel computing, and on CLANG/LLVM, which is a full compiler complete with backend processing and optimizers. At ORNL we are continuing development of the seminal reverse computation framework called RCC (Reverse C Compiler) system. For all three systems, we present some implementation issues and challenges encountered in our development.




Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal
Journal of Transportation Safety and Security 2010
http://stc.utk.edu/tss

Accepted

Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique, and its implementation on graphical processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at hitherto unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC.





Efficient Simulation of Agent-Based Models on Multi-GPU and Multi-Core Clusters
Brandon G. Aaby, Kalyan S. Perumalla, Sudip K. Seal
3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010
http://www.simutools.org

Best Paper Finalist

Abstract: An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the tradeoff. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs.





µπ: A Scalable and Transparent System for Simulating MPI Programs
Kalyan Perumalla
3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010
http://www.simutools.org

Abstract: µπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of µπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by µπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, µπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, µsik. In the largest runs, µπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time.





Reversible Parallel Discrete Event Formulation of a TLM-based Radio Signal Propagation Model
Sudip Seal and Kalyan Perumalla
ACM Transactions on Modeling and Computer Simulation (TOMACS) 2010
http://linklings.net/tomacs

Under review

Abstract: Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. For scenarios with large or richly-featured geographical volumes, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of the sequential model for radio signal propagation recently developed by Nutaro et al. Starting with that model, we (a) provide a vector-based reformulation that has significantly lower computational overhead for event handling, (b) develop a parallel decomposition approach that is amenable to reversibility with minimal computational overheads, (c) present a framework for transparently mapping the conservative time-stepped model into an optimistic parallel discrete event execution, (d) present a new reversible method, along with its analysis and implementation, for inverting the vector-based event model to be executed in an optimistic parallel style of execution, and (e) present performance results from implementation on Cray XT platforms. We demonstrate scalability, with the largest runs tested on up to 127,500 cores of a Cray XT5, enabling simulation of larger scenarios and with faster execution than reported before on the radio propagation model. This also represents the first successful demonstration of the ability to efficiently map a conservative time-stepped model to an optimistic discrete-event execution.





High-Performance Simulations for Capturing Feedback and Fidelity in Complex Networked Systems
Kalyan Perumalla
SIAM Conference on Parallel Processing for Scientific Computing (PP10) 2010
http://www.siam.org/meetings/pp10/

Abstract and Presentation in MS44 Computational Network Science

Abstract: In a variety of complex networked systems, simulation is a powerful method to capture critical feedback effects among inter-dependent processes. Network-based phenomena in areas such as cyberinfrastructure, transportation, epidemiology, and social networks, all offer important analysis problems that need such feedback effects to be accurately captured. However, accurate modeling of feedback effects requires increased levels of model fidelity. Moreover, such high-fidelity, feedback-heavy models are especially characterized by very high computational needs. In this backdrop, the need for high-fidelity simulations is illustrated, with examples of how they are driving new high-performance computing-based solutions in the aforementioned areas. Our parallel computing approaches are described in the context of very large-scale, high-fidelity simulations in regional-scale transportation network simulations, nation-scale epidemiological simulations, and Internet simulations with detailed models of millions of nodes.





Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal
National Evacuation Conference 2010
http://www.nationalevacuationconference.org

Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique, and its implementation on graphical processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at hitherto unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC.





2009


Scalable Parallel Execution of an Event-based Radio Signal Propagation Model for Cluttered 3D Terrains
Sudip Seal and Kalyan Perumalla
Oak Ridge National Laboratory 2009
http://www.osti.gov/bridge

Technical Report ORNL/TM-2009/165

Abstract: Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. While classical approaches such as finite difference methods are well-known, new event-based models of radio signal propagation have been recently shown to deliver such estimates faster (via serial execution) than other methods. For scenarios with large or richly-featured geographical volumes however, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of a recently-developed event-based radio signal propagation model. We demonstrate its scalability to thousands of processors, with parallel speedups over 1000×. The speed and scale achieved by our parallel execution enable larger scenarios and faster execution than has ever been reported before.





Perfect Reversal of Rejection Sampling Methods for First-Passage-Time and Similar Probability Distributions
Kalyan Perumalla and Aleksandar Donev
Oak Ridge National Laboratory 2009
http://www.osti.gov/bridge

Technical Report ORNL/TM-2009/182

Abstract: We present a perfectly reversible method for bi-directional generation of samples from computationally complex probability distributions. While the previously best-known procedures consume memory proportional to the length of execution between changes of execution direction, here we present a scheme to completely eliminate the memory overhead. Our solution affords two important features, namely determinism and repeatability, across arbitrarily spaced changes of direction (and arbitrary number of samples) along the sample stream. We illustrate the perfect reversal method with first passage time distributions that appear in physical system models, and present its implementation and verification in FORTRAN.





Computational Spectrum of Agent Model Simulation
Kalyan S. Perumalla
Modeling, Simulation and Optimization 2009


ISBN 978-953-7619-36-7

Abstract: The study of human social behavioral systems is finding renewed interest in military, homeland security and other applications. Simulation is the most generally applied approach to studying complex scenarios in such systems. Here, we outline some of the important considerations that underlie the computational aspects of simulation-based study of human social systems. The fundamental imprecision underlying questions and answers in social science makes it necessary to carefully distinguish among different simulation problem classes and to identify the most pertinent set of computational dimensions associated with those classes. We identify a few such classes and present their computational implications. The focus is then shifted to the most challenging combinations in the computational spectrum, namely, large-scale entity counts at moderate to high levels of fidelity. Recent developments in furthering the state-of-the-art in these challenging cases are outlined. A case study of large-scale agent simulation is provided in simulating large numbers (millions) of social entities at real-time speeds on inexpensive hardware. Recent computational results are identified that highlight the potential of modern high-end computing platforms to push the envelope with respect to speed, scale and fidelity of social system simulations. Finally, the problem of shielding the modeler or domain expert from the complex computational aspects is discussed and a few potential solution approaches are identified.



Cyber Security Experimentation: Gory Detail or None at All?
Kalyan Perumalla
SIAM Annual Meeting 2009
http://www.siam.org/meetings/an09/

Abstract and presentation

Abstract: Unique facets confronted by current cyber security analysis efforts are: tremendous pace of change of ground rules (axioms), apparently wide and deep phenomenological effects, and widely-varying interpretations of security objectives. Together, effective methods for cyber security analysis appear to be swung between two extremes: experimentation-based methods with full, gory detail, and abstraction-based methods with significant simplifications. In the case of methods in between, accuracy considerations make intermediate methods tend to swing rapidly back towards full gory detail, while scientific inquiry and efficiency considerations tend to swing them back towards abstractions. Based on our experience and past evidence, we argue that experimentation with gory detail is the most effective approach in the short- to medium-term, while the other extreme is the relevant one for the longer term. Feasibility will be shown of sustaining the scale and fidelity for the former extreme, namely, experiments with full gory detail.




Introduction to Simulations on GPUs
Kalyan Perumalla
ACM International Symposium on Distributed Simulations and Real-time Applications 2009
http://www.cs.unibo.it/ds-rt2009/

Abstract: Graphical processing units (GPUs) are now established as efficient, alternative computing platforms for certain niche applications. Computationally intensive simulations are among applications that can utilize GPUs as computing co-processors. This tutorial introduces the concepts and algorithms for executing simulations on GPUs. Algorithmic aspects for multi-pass execution of time-stepped simulations, and refinements for discrete event execution are described. Examples from applications such as agent-based simulations will be used to illustrate implementations, with source code extracts. Also briefly introduced is advanced material such as use of clusters of multiple GPUs, using a combination of the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA). Implementation challenges, such as memory hierarchies and latency hiding needs, will be described. The tutorial is structured to minimize duplication of existing GPU literature, but to be self-contained and customized for simulation applications.





Switching to High Gear: Opportunities for Grand-scale Real-time Parallel Simulations
Kalyan Perumalla
ACM International Symposium on Distributed Simulations and Real-time Applications 2009
http://www.cs.unibo.it/ds-rt2009/

Keynote Talk

Abstract: The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher gear to exploit the new, immense levels of computational power. The potential for grand-scale real-time solutions is illustrated using preliminary results from prototypes in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) very large-scale epidemic modeling, (c) modeling the propagation of wireless network signals in very large, cluttered terrains, and, (d) country- or world-scale social behavioral modeling. We believe the stage is perfectly poised for the parallel/distributed simulation community to envision and formulate similar grand-scale, real-time simulation-based solutions in many application areas.








Reversible Discrete Event Formulation and Optimistic Parallel Execution of Vehicular Traffic Models
Kalyan Perumalla and Srikanth Yoginath
International Journal of Simulation and Process Modeling 2009


Vol. 5 No. 2

Abstract: Vehicular traffic simulations are useful in applications such as emergency planning and traffic management. High speed of traffic simulations translates to speed of response and level of resilience in those applications. Discrete event formulation of traffic flow at the level of individual vehicles affords both the flexibility of simulating complex scenarios of vehicular flow behavior as well as rapid simulation time advances. However, efficient parallel/distributed execution of the models becomes challenging due to synchronization overheads. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Our approach resolves the challenges that arise in parallel execution of microscopic, vehicular-level models of traffic. We apply a reverse computation-based optimistic execution approach to address the parallel synchronization problem. This is achieved by formulating a reversible version of a discrete event model of vehicular traffic, and by utilizing this reversible model in an optimistic execution setting. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation, (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation, and (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed close to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance. The benefits of optimistic execution are demonstrated, including a speedup of nearly 20 on 32 processors observed on a vehicular network of over 65,000 intersections and over 13 million vehicles.





GPU-based Real-Time Execution of Vehicular Mobility Models in Large-Scale Road Network Scenarios
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath and Sudip Seal
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2009


Abstract: A methodology and its associated algorithms are presented for mapping a novel, field-based vehicular mobility model onto graphical processing unit computational platforms for simulating mobility in large-scale road networks. Of particular focus is the achievement of real-time execution, on desktop platforms, of vehicular mobility on road networks comprised of millions of nodes and links, and multi-million counts of simultaneously active vehicles. The methodology is realized in a system called GARFIELD, whose implementation details and performance study are described. The runtime characteristics of a prototype implementation are presented that show real-time performance in simulations of networks at the scale of a few states of the US road networks.





A Connectionist Modeling Approach to Rapid Analysis of Emergent Social Cognition Properties in Large-Populations
Kalyan S. Perumalla and Jack C. Schryver
Human Behavior-Computational Modeling and Interoperability Conference 2009


Abstract: Traditional modeling methodologies, such as those based on rule-based agent modeling, are exhibiting limitations in application to rich behavioral scenarios, especially when applied to large population aggregates. Here, we propose a new modeling methodology based on a well-known "connectionist approach," and articulate its pertinence in new applications of interest. This methodology is designed to address challenges such as speed of model development, model customization, model reuse across disparate geographic/cultural regions, and rapid and incremental updates to models over time.





Coping at the User-Level with Resource Limitations in the Cray Message Passing Toolkit MPI at Scale: How Not to Spend Your Summer Vacation
Richard Mills, Forrest Hoffman, Patrick Worley, Kalyan Perumalla, Art Mirin, Glenn Hammond and Barry Smith
Proceedings of Cray User Group Meeting 2009






Scalable Parallel Execution of an Event-based Radio Signal Propagation Model for Cluttered 3D Terrains
Sudip Seal and Kalyan Perumalla
Proceedings of International Conference on Parallel Processing 2009






2008


Efficient Execution on GPUs of Field-based Vehicular Mobility Models
Kalyan Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008






High Performance Computing-based Experimentation for Cyber Infrastructure and Security
Kalyan Perumalla
Lawrence Livermore National Laboratory 2008




Feasibility, Efficiency and Limits of Compiler-based Automation for Reversibility of Codes
Kalyan Perumalla
Lawrence Livermore National Laboratory 2008


Seminar



Data Parallel Execution Challenges and Runtime Performance of Agent Simulations on GPUs
Kalyan Perumalla and Brandon Aaby
Proceedings of Spring Computer Simulation Conference 2008


Best Paper Award





On the Reversibility of Newton-Raphson Root-Finding Method
Kalyan Perumalla, John Wright and Phani Kuruganti
Oak Ridge National Laboratory 2008


Technical Report ORNL-2007/152



Parallel Vehicular Traffic Simulations using Reverse Computation-based Optimistic Execution
Srikanth Yoginath and Kalyan Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008






2007


A New Methodology for Multi-scale Simulation of Plasmas
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
Lecture Book on Advanced Methods for Space Engineering 2007




A New Methodology for Multi-scale Simulation of Plasmas
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
ISS-7 Lecture Notes on Advanced Methods for Space Simulations 2007




Scaling Time Warp-based Discrete Event Execution to 10^4 Processors on a Blue Gene Supercomputer
Kalyan Perumalla
Proceedings of ACM Computing Frontiers 2007






Handling Time Management under the HLA
Kalyan Perumalla
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2007





Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Kalyan Perumalla
Principles of Advanced and Distributed Simulation (PADS) 2007





Application-Level Asynchronous Speculative Execution
Kalyan Perumalla
IBM T. J. Watson Research Center, Yorktown Heights, New York 2007




An Analysis Approach to Large-Scale Vehicular Network Simulations
Kalyan Perumalla and Martin Beckerman
Proceedings of Summer Computer Simulation Conference 2007






Efficient Parallel Execution of Event-Driven Electromagnetic Hybrid Models
Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi
International Journal for Multi-scale Computational Engineering 2007


Vol. 5, No. 1



Model Execution
Kalyan S. Perumalla
CRC Handbook of Dynamic System Modeling 2007


ISBN 1584885653



2006


Integrated Analysis of Environment-Driven Operational Effects in Sensor Networks
Alfred Park and Kalyan Perumalla
Oak Ridge National Laboratory 2006


Technical Report ORNL-2006/537





On Evaluation Needs of Real-life Sensor Network Deployments
Alfred Park, Kalyan S. Perumalla, Vladimir Protopopescu, Mallikarjun Shankar, Frank DeNap and Bryan Gorman
Proceedings of European Modeling and Simulation Symposium (EMSS) 2006






Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Kalyan Perumalla
Proceedings of Winter Simulation Conference (WSC) 2006






A Systems Approach to Scalable Transportation Network Modeling
Kalyan Perumalla
Proceedings of Winter Simulation Conference (WSC) 2006






Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Kalyan Perumalla
Winter Simulation Conference (WSC) 2006






Handling Time Management under the HLA
Kalyan Perumalla
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2006





Discrete-Event Execution Alternatives on GPGPUs
Kalyan S. Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006






On Accounting for the Interplay of Kinetic and Non-Kinetic Aspects of Population Mobility Models
Kalyan S. Perumalla
Proceedings of European Modeling and Simulation Symposium (EMSS) 2006






Parallel Execution of Region-Scale Evacuation Traffic Models
Kalyan S. Perumalla
Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006






Scalable Simulation of Electromagnetic Hybrid Codes
Kalyan S. Perumalla, Richard M. Fujimoto, Homa Karimabadi
Proceedings of International Conference on Computational Science 2006






Network Simulation
Richard Fujimoto, Kalyan Perumalla and George Riley
Morgan & Claypool Publishers 2006


ISBN 1598291106



Optimistic Simulation of Physical Systems using Reverse Computation
Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko
Simulation: Transactions of the Society for Modeling and Simulation International 2006


Vol. 82, No. 1





2005


Virtual Simulator: An Infrastructure for Design and Performance-Prediction of Massively Parallel Codes
Kalyan Perumalla, Richard Fujimoto, Santosh Pande, Homa Karimabadi, Jonathan Driscoll, and Yuri Omelchenko
Proceedings of Eos Transactions, American Geophysical Union Fall Meeting 2005


Abstract and presentation

Abstract: Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for design and performance prediction of parallel codes. Performance prediction is useful to analyze, understand and experiment with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with minimal perturbation of performance) is useful to glean key metrics and understand application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code.





Parallel Discrete Event Simulations of Grid-based Models: Asynchronous Electromagnetic Hybrid Code
Homa Karimabadi, Jonathan Driscoll, Jagrut Dave, Yuri Omelchenko, Kalyan Perumalla, Richard Fujimoto and N. Omidi
Springer Lecture Notes in Computer Science 2005






A New Simulation Technique for Study of Collision-less Shocks: Self Adaptive Simulations
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
Proceedings of 4th Annual International Astrophysics Conference (IGPP) 2005




A New Methodology for Multi-scale Simulation of Plasmas
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
Proceedings of 7th International Symposium for Space Simulations (ISSS) 2005




Parallel and Distributed Systems and the High-Level Architecture
Kalyan Perumalla
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2005




Distributed Simulation Systems & the High Level Architecture (HLA)
Kalyan Perumalla
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2005






Computational Tools for Efficient Large-scale Discrete-Event Models
Kalyan Perumalla
Oak Ridge National Laboratory 2005




Computational Methods for Efficient Large-scale System Models
Kalyan Perumalla
Indiana University Purdue University 2005




µsik - A Micro-kernel for Parallel/Distributed Simulation Systems
Kalyan S. Perumalla
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005






Performance Prediction of Large-scale Parallel Discrete Event Models of Physical Systems
Kalyan S. Perumalla, Richard M. Fujimoto, Prashant Thakare, Santosh Pande, Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll
Proceedings of Winter Simulation Conference (WSC) 2005






Optimistic Parallel Discrete Event Simulations of Physical Systems using Reverse Computation
Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005






2004


Conservative Synchronization of Large-scale Network Simulations
Alfred Park, Richard Fujimoto and Kalyan Perumalla
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2004






A Federated Approach to Distributed Network Simulation
George Riley, Mostafa Ammar, Richard Fujimoto, Alfred Park, Kalyan Perumalla and Donghua Xu
ACM Transactions on Modeling and Computer Simulation (TOMACS) 2004


Vol. 14, No. 2

Abstract: We describe an approach and our experiences in applying federated simulation techniques to create large-scale parallel simulations of computer networks. Using the federated approach, the topology and the protocol stack of the simulated network are partitioned into a number of submodels, and a simulation process is instantiated for each one. Runtime infrastructure software provides services for interprocess communication and synchronization (time management). We first describe issues that arise in homogeneous federations where a sequential simulator is federated with itself to realize a parallel implementation. We then describe additional issues that must be addressed in heterogeneous federations composed of different network simulation packages, and describe a dynamic simulation backplane mechanism that facilitates interoperability among different network simulators. Specifically, the dynamic simulation backplane provides a means of addressing key issues that arise in federating different network simulators: differing packet representations, incomplete implementations of network protocol models, and differing levels of detail among the simulation processes. We discuss two different methods for using the backplane for interactions between heterogeneous simulators: the cross-protocol stack method and the split-protocol stack method. Finally, results from an experimental study are presented for both the homogeneous and heterogeneous cases that provide evidence of the scalability of our federated approach on two moderately sized computing clusters. Two different homogeneous implementations are described: Parallel/Distributed ns (pdns) and the Georgia Tech Network Simulator (GTNetS). Results of a heterogeneous implementation federating ns with GloMoSim are described. This research demonstrates that federated simulations are a viable approach to realizing efficient parallel network simulation tools.





A New Approach to Modeling Physical Systems: Discrete Event Simulations of Grid-based Models
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, N. Omidi, Richard Fujimoto and Kalyan Perumalla
Proceedings of Workshop on State-Of-The-Art in Scientific Computing (PARA) 2004






Distributed Simulation Systems & the High Level Architecture (HLA)
Kalyan Perumalla
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2004




High Fidelity Modeling of Computer Network Worms
Kalyan Perumalla and Srikanth Sundaragopalan
Proceedings of Annual Computer Security Applications Conference (ACSAC) 2004






2003


Generating Perfect Reversals of Simple Linear Codes
Kalyan Perumalla
Center for Experimental Research in Computing Systems, Georgia Institute of Technology 2003


Technical Report GIT-CERCS-TR-03-04



Techniques for Improving Accuracy and Usability in Large-scale Network Emulation
Kalyan Perumalla
College of Computing, Georgia Institute of Technology 2003


Technical Report GIT-CC-03-04



Achieving Interoperability and Scalability in Simulation of Networks
Kalyan Perumalla
University of Louisville, Louisville, Kentucky 2003




Scalable RTI-based Parallel Simulation of Networks
Kalyan Perumalla, Alfred Park, Richard Fujimoto and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003






Large-Scale Network Simulation - How Big? How Fast?
Richard Fujimoto, Kalyan Perumalla, Alfred Park, Hao Wu, Mostafa Ammar, and George Riley
Proceedings of IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) 2003






Power-aware State Dissemination in Mobile Distributed Virtual Environments
Weidong Shi, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003






2002


Using Reverse Circuit Execution for Efficient Parallel Simulation of Logic Circuits
Kalyan Perumalla and Richard Fujimoto
Proceedings of The International Society for Optical Engineering (SPIE) Annual Meeting 2002






Experiences Applying Parallel and Interoperable Network Simulation Techniques in On-line Simulations of Military Networks
Kalyan Perumalla, Richard Fujimoto, Thom McLean and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002






Web Services for Extensible Modeling and Simulation
Kalyan S. Perumalla
Proceedings of Workshop on Extensible Modeling and Simulation Framework (XMSF) 2002






Updateable Simulations
Steve Ferenci, Richard Fujimoto, Mostafa Ammar, Kalyan Perumalla and George Riley
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002






2001


Distributed Network Simulations using the Dynamic Simulation Backplane
George Riley, Mostafa Ammar, Richard Fujimoto, Donghua Xu and Kalyan Perumalla
Proceedings of the International Conference on Distributed Computing Systems (ICDCS), April 2001






Interactive Parallel Simulations with the JANE Framework
Kalyan Perumalla and Richard Fujimoto
Future Generation Computer Systems 2001


Vol. 17, No. 5, Elsevier Science





Virtual Time Synchronization over Unreliable Network Transport
Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2001






2000


Using Reverse Computation towards Efficient Parallel/Distributed Computation
Kalyan Perumalla
Bell Labs, Lucent Technologies, Murray Hill, New Jersey 2000




Using Reverse Computation towards Efficient Parallel/Distributed Computation
Kalyan Perumalla
IBM T J Watson Research Center, Yorktown Heights, New York 2000




Parallel Simulation Backplanes for Mixed Signal Circuit Design
Richard Fujimoto, Kalyan Perumalla, Liang Xiao, Giorgio Casinovi, Madhavan Swaminathan, Siddharth Dalmia and J. Mao
Yamacraw Research Report, Georgia Institute of Technology 2000


Technical Report IAB-10-2000



Design of High-performance RTI software
Richard Fujimoto, Thom McLean, Kalyan Perumalla and Ivan Tacic
Proceedings of Distributed Simulations and Real-time Applications (DS-RT) 2000






An Approach to Federating Parallel Simulators
Steve Ferenci, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2000






1999


Efficient Optimistic Parallel Simulations using Reverse Computation
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
ACM Transactions on Modeling and Computer Simulation (TOMACS) 1999


Vol. 9, No. 3





The Effect of State Saving in Optimistic Simulation on a Cache-coherent Non-uniform Memory Access (CC-NUMA) Architecture
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference 1999






Efficient Optimistic Parallel Simulation using Reverse Computation
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 1999






PARINO: A Parallel Branch and Cut Code
Jeff Linderoth, Kalyan Perumalla and Martin Savelsbergh
Proceedings of INFORMS National Meeting 1999




Source Code Transformations for Efficient Reversibility
Kalyan Perumalla and Richard Fujimoto
College of Computing, Georgia Institute of Technology 1999


Technical Report GIT-CC-99-21



The High Level Architecture for Simulation
Richard Fujimoto, Katherine Morse, Richard Weatherly, and Kalyan Perumalla
ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation 1999




1998


Towards Reusable Modeling and Parallel Simulation of Telecommunication Networks
Kalyan Perumalla
Bell Labs, Lucent Technologies, Murray Hill, New Jersey 1998




Efficient Large-scale Process-oriented Parallel Simulations
Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference (WSC) 1998






TeD - A Language for Modeling Telecommunication Networks
Kalyan Perumalla, Andrew Ogielski, Richard Fujimoto
ACM Performance Evaluation Review 1998


Vol. 25, No. 4





TeD Models for ATM Internetworks
Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt
ACM Performance Evaluation Review 1998


Vol. 25, No. 4





Parallel Simulation Techniques for Large-scale Networks
Sandeep Bhatt, Richard Fujimoto, Andrew Ogielski and Kalyan Perumalla
IEEE Communications 1998


Vol. 36, No. 8





1997


Time Parallel Generation of Self-similar ATM Traffic
Ioannis Nikolaidis, Anthony Cooper, Kalyan Perumalla and Richard Fujimoto
Proceedings of the Winter Simulation Conference (WSC) 1997






Reusable Modeling and Parallel Simulation of Networks using the TeD Language
Kalyan Perumalla
WINLAB, Rutgers University, Piscataway, New Jersey 1997




PARINO: An Extensible Framework for Solving Mixed Integer Programs in Parallel
Kalyan Perumalla, Martin Savelsbergh and Umakishore Ramachandran
College of Computing, Georgia Institute of Technology 1997


Technical Report GIT-CC-97-07





A Virtual PNNI Network Testbed
Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt
Proceedings of the Winter Simulation Conference (WSC) 1997






PARINO, A Parallel Integer Optimizer
Martin Savelsbergh, Kalyan Perumalla, Jeff Linderoth, and Umakishore Ramachandran
Proceedings of International Symposium on Mathematical Programming 1997






1996


GTW++ -- An Object Oriented Interface in C++ to the Georgia Tech Time Warp System
Kalyan Perumalla and Richard Fujimoto
College of Computing, Georgia Institute of Technology 1996


Technical Report GIT-CC-96-09



A C++ Instance of TeD
Kalyan Perumalla and Richard Fujimoto
College of Computing, Georgia Institute of Technology 1996


Technical Report GIT-CC-96-33



An Efficiency Prediction Method for ATM Multiplexer
Kalyan Perumalla, Anthony Cooper, Richard Fujimoto
Proceedings of Broadband Communications 1996




MetaTeD - A Meta Language for Modeling Telecommunication Networks
Kalyan Perumalla, Richard Fujimoto and Andrew Ogielski
College of Computing, Georgia Institute of Technology 1996


Technical Report GIT-CC-96-32



1995


A Performance Prediction Method for ATM Multiplexers
C. Anthony Cooper and Kalyan Perumalla
Bell Communications Research (Bellcore) 1995


Technical Memorandum TM-25152



Parallel Algorithms for Maximum Sub-sequence and Sub-array
Kalyan Perumalla and Narsingh Deo
Parallel Processing Letters 1995


Vol. 5, No. 3





1994


Parallelizing Sequential Algorithms for the Generalized Assignment Problem
Ivan Yanasak, Gautam Shah, Kalyan Perumalla, et al.
DIMACS Challenge of Parallel Computing 1994




Parallel Algorithms for Maximum Sub-sequence and Sub-array
Kalyan Perumalla and Narsingh Deo
Proceedings of International Conference on Combinatorics, Graph Theory and Computing 1994




1993


Integrating Aggregate and Vehicle Level Simulations
Clark Karr, Robert Franceschini and Kalyan Perumalla
Proceedings of the 3rd Conference on Computer Generated Forces and Behavioral Representation 1993




A Distributed Algorithm for Ear Decomposition
Sridhar Hannenhalli, Kalyan Perumalla and Narayan Chandrasekharan
Proceedings of International Conference on Computing and Information (ICCI) 1993




A Debugging Environment for PVM
Uday Vemulapati and Kalyan Perumalla
Distributed Computing for Aeroscience Applications 1993




1992


Integrating Battlefield Simulations of Different Granularity
Clark Karr, Robert Franceschini and Kalyan Perumalla
Proceedings of the Southeastern Simulation Conference 1992






Copyright © Perumalla 2009-2010