Publication type: all [total 112]
2010
On Deciding between Conservative and Optimistic Approaches on Massively Parallel Platforms Christopher Carothers and Kalyan Perumalla Winter Simulation Conference 2010 http://www.wintersim.org Invited Abstract: Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature to date. Nevertheless, few articles have focused on empirical studies of PDES performance on large supercomputer-based systems. This gap is bridged here, by undertaking a parameterized performance study on thousands of processor cores of a Blue Gene supercomputing system. In contrast to theoretical insights from analytical studies, our study is based on actual implementation in software, incurring the actual messaging and computational overheads for both conservative and optimistic synchronization approaches of PDES. Complex and counter-intuitive effects are uncovered and analyzed, with different event timestamp distributions and available levels of concurrency in the synthetic benchmark models. The results are intended to provide guidance to the PDES community in terms of how the synchronization protocols behave at high processor core counts using state-of-the-art supercomputing systems. Supercomputing Applications of the Other Kind: Real-time Parallel Discrete Event Simulations of Large-scale, Smart Infrastructures Kalyan Perumalla IBM T J Watson Research Center, Yorktown Heights, New York 2010 Abstract: Ultra-scale supercomputing hardware is a reality, reaching peta-scale recently and now moving to exa-scale. A rich class of applications has however remained largely untapped to reap supercomputing benefits, namely, parallel discrete event simulations (PDES), partly due to technical challenges, and partly awaiting compelling applications. 
The recent emergence of new visions of a "smarter" and more agile societal operation has clearly opened new large-scale applications that are directly formulated as grand-scale PDES scenarios executed in an on-line, real-time fashion. Our presentation will focus on highlighting the potential for grand-scale real-time solutions, illustrated using our preliminary efforts in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) country- or world-scale epidemic modeling, (c) cyber infrastructure and security operations at the scale of multiple autonomous systems, and, (d) country- or world-scale social behavioral modeling. We believe the context is ripe to envision and formulate similar grand-scale solutions in many additional application areas. The technical vision to progress to such an ambitious goal, however, is highly challenging. We outline some of the salient technical issues, challenges, and potential solution directions in meeting the scale and speed demanded by this "other kind" of supercomputing applications. Bcyclic: A Parallel Block Tri-diagonal Matrix Cyclic Solver Steven Hirshman, Kalyan Perumalla, Vickie Lynch and Raul Sanchez Journal of Computational Physics 2010 http://www.sciencedirect.com/science/journal/00219991 Accepted April 30, 2010 Abstract: A block tri-diagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. 
Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited. Reversible Parallel Discrete-Event Execution of Large-scale Epidemic Outbreak Models Kalyan Perumalla, Sudip Seal 24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010 http://www.pads-workshop.org/pads2010.html Best Paper Finalist Abstract: The spatial scale, runtime speed, and behavioral detail of epidemic outbreak simulations altogether require the use of large-scale parallel processing. Here, an optimistic parallel discrete event execution of a reaction-diffusion simulation model is presented. Rollback support is achieved with the development of a novel reversible model that combines reverse computation with a small amount of incremental state saving. Parallel speedup and other runtime performance metrics of the system are tested on a small (8,192-core) Blue Gene/P system, while scalability is demonstrated on 65,536 cores of a large Cray XT5 system. Scenarios representing large population sizes (up to several hundreds of millions in the largest case) are exercised. Compiler-based Automation Approaches to Reverse Computation Kalyan Perumalla, Christopher Carothers 24th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2010) 2010 http://www.pads-workshop.org/pads2010.html Workshop on Reverse Computation Abstract: Automation is useful to facilitate reverse code generation from normal code. Here, we describe our source-to-source compilation approaches to automatic reverse code generation, developed along three different translation tools/frameworks. 
At RPI, we are developing frameworks based on PIPS, which is a purely source-to-source translation and optimization tool for parallel computing, and on CLANG/LLVM, which is a full compiler complete with backend processing and optimizers. At ORNL we are continuing development of the seminal reverse computation framework called RCC (Reverse C Compiler) system. For all three systems, we present some implementation issues and challenges encountered in our development. Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal Journal of Transportation Safety and Security 2010 http://stc.utk.edu/tss Accepted Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique, and its implementation on graphical processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at heretofore unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC. 
Efficient Simulation of Agent-Based Models on Multi-GPU and Multi-Core Clusters Brandon G. Aaby, Kalyan S. Perumalla, Sudip K. Seal 3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010 http://www.simutools.org Best Paper Finalist Abstract: An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory hierarchies of extant platforms and present a novel analytical model of the tradeoff. We describe our implementation and report preliminary performance results on two distinct parallel platforms suitable for ABMS: CUDA threads on multiple, networked graphical processing units (GPUs), and pthreads on multi-core processors. Message Passing Interface (MPI) is used for inter-GPU as well as inter-socket communication on a cluster of multiple GPUs and multi-core processors. Results indicate the benefits of our latency-hiding scheme, delivering as much as over 100-fold improvement in runtime for certain benchmark ABMS application scenarios with several million agents. This speed improvement is obtained on our system that is already two to three orders of magnitude faster on one GPU than an equivalent CPU-based execution in a popular simulator in Java. Thus, the overall execution of our current work is over four orders of magnitude faster when executed on multiple GPUs. µπ: A Scalable and Transparent System for Simulating MPI Programs Kalyan Perumalla 3rd International ICST Conference on Simulation Tools and Techniques (SimuTools) 2010 http://www.simutools.org Abstract: µπ is a scalable, transparent system for experimenting with the execution of parallel programs on simulated computing platforms. 
The level of simulated detail can be varied for application behavior as well as for machine characteristics. Unique features of µπ are repeatability of execution, scalability to millions of simulated (virtual) MPI ranks, scalability to hundreds of thousands of host (real) MPI ranks, portability of the system to a variety of host supercomputing platforms, and the ability to experiment with scientific applications whose source-code is available. The set of source-code interfaces supported by µπ is being expanded to support a wider set of applications, and MPI-based scientific computing benchmarks are being ported. In proof-of-concept experiments, µπ has been successfully exercised to spawn and sustain very large-scale executions of an MPI test program given in source code form. Low slowdowns are observed, due to its use of purely discrete event style of execution, and due to the scalability and efficiency of the underlying parallel discrete event simulation engine, µsik. In the largest runs, µπ has been executed on up to 216,000 cores of a Cray XT5 supercomputer, successfully simulating over 27 million virtual MPI ranks, each virtual rank containing its own thread context, and all ranks fully synchronized by virtual time. Reversible Parallel Discrete Event Formulation of a TLM-based Radio Signal Propagation Model Sudip Seal and Kalyan Perumalla ACM Transactions on Modeling and Computer Simulations (TOMACS) 2010 http://linklings.net/tomacs Under review Abstract: Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. For scenarios with large or richly-featured geographical volumes, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of the sequential model for radio signal propagation recently developed by Nutaro et al. 
Starting with that model, we (a) provide a vector-based reformulation that has significantly lower computational overhead for event handling, (b) develop a parallel decomposition approach that is amenable to reversibility with minimal computational overheads, (c) present a framework for transparently mapping the conservative time-stepped model into an optimistic parallel discrete event execution, (d) present a new reversible method, along with its analysis and implementation, for inverting the vector-based event model to be executed in an optimistic parallel style of execution, and (e) present performance results from implementation on Cray XT platforms. We demonstrate scalability, with the largest runs tested on up to 127,500 cores of a Cray XT5, enabling simulation of larger scenarios and with faster execution than reported before on the radio propagation model. This also represents the first successful demonstration of the ability to efficiently map a conservative time-stepped model to an optimistic discrete-event execution. High-Performance Simulations for Capturing Feedback and Fidelity in Complex Networked Systems Kalyan Perumalla SIAM Conference on Parallel Processing for Scientific Computing (PP10) 2010 http://www.siam.org/meetings/pp10/ Abstract and Presentation in MS44 Computational Network Science Abstract: In a variety of complex networked systems, simulation is a powerful method to capture critical feedback effects among inter-dependent processes. Network-based phenomena in areas such as cyberinfrastructure, transportation, epidemiology, and social networks, all offer important analysis problems that need such feedback effects to be accurately captured. However, accurate modeling of feedback effects requires increased levels of model fidelity. Moreover, such high-fidelity, feedback-heavy models are especially characterized by very high computational needs. 
Against this backdrop, the need for high-fidelity simulations is illustrated, with examples of how they are driving new high-performance computing-based solutions in the aforementioned areas. Our parallel computing approaches are described in the context of very large-scale, high-fidelity simulations in regional-scale transportation network simulations, nation-scale epidemiological simulations, and Internet simulations with detailed models of millions of nodes. Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal National Evacuation Conference 2010 http://www.nationalevacuationconference.org Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique, and its implementation on graphical processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at heretofore unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC. 
2009
Scalable Parallel Execution of an Event-based Radio Signal Propagation Model for Cluttered 3D Terrains Sudip Seal and Kalyan Perumalla Oak Ridge National Laboratory 2009 http://www.osti.gov/bridge Technical Report ORNL/TM-2009/165 Abstract: Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. While classical approaches such as finite difference methods are well-known, new event-based models of radio signal propagation have been recently shown to deliver such estimates faster (via serial execution) than other methods. For scenarios with large or richly-featured geographical volumes however, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of a recently-developed event-based radio signal propagation model. We demonstrate its scalability to thousands of processors, with parallel speedups over 1000×. The speed and scale achieved by our parallel execution enable larger scenarios and faster execution than has ever been reported before. Perfect Reversal of Rejection Sampling Methods for First-Passage-Time and Similar Probability Distributions Kalyan Perumalla and Aleksandar Donev Oak Ridge National Laboratory 2009 http://www.osti.gov/bridge Technical Report ORNL/TM-2009/182 Abstract: We present a perfectly reversible method for bi-directional generation of samples from computationally complex probability distributions. While the previously best-known procedures consume memory proportional to the length of execution between changes of execution direction, here we present a scheme to completely eliminate the memory overhead. Our solution affords two important features, namely determinism and repeatability, across arbitrarily spaced changes of direction (and arbitrary number of samples) along the sample stream. 
We illustrate the perfect reversal method with first passage time distributions that appear in physical system models, and present its implementation and verification in FORTRAN. Computational Spectrum of Agent Model Simulation Kalyan S. Perumalla Modeling, Simulation and Optimization 2009 ISBN 978-953-7619-36-7 Abstract: The study of human social behavioral systems is finding renewed interest in military, homeland security and other applications. Simulation is the most generally applied approach to studying complex scenarios in such systems. Here, we outline some of the important considerations that underlie the computational aspects of simulation-based study of human social systems. The fundamental imprecision underlying questions and answers in social science makes it necessary to carefully distinguish among different simulation problem classes and to identify the most pertinent set of computational dimensions associated with those classes. We identify a few such classes and present their computational implications. The focus is then shifted to the most challenging combinations in the computational spectrum, namely, large-scale entity counts at moderate to high levels of fidelity. Recent developments in furthering the state-of-the-art in these challenging cases are outlined. A case study of large-scale agent simulation is provided in simulating large numbers (millions) of social entities at real-time speeds on inexpensive hardware. Recent computational results are identified that highlight the potential of modern high-end computing platforms to push the envelope with respect to speed, scale and fidelity of social system simulations. Finally, the problem of shielding the modeler or domain expert from the complex computational aspects is discussed and a few potential solution approaches are identified. Cyber Security Experimentation: Gory Detail or None at All? 
Kalyan Perumalla SIAM Annual Meeting 2009 http://www.siam.org/meetings/an09/ Abstract and presentation Abstract: Unique facets confronted by current cyber security analysis efforts are: tremendous pace of change of ground rules (axioms), apparently wide and deep phenomenological effects, and widely-varying interpretations of security objectives. Together, effective methods for cyber security analysis appear to be swung between two extremes: experimentation-based methods with full, gory detail, and abstraction-based methods with significant simplifications. In the case of methods in between, accuracy considerations make intermediate methods tend to swing rapidly back towards full glory, while scientific inquiry and efficiency considerations tend to swing them back towards abstractions. Based on our experience and past evidence, we argue that experimentation with gory detail is the most effective approach in the short- to medium-term, while the other extreme is the relevant one for the longer term. Feasibility will be shown of sustaining the scale and fidelity for the former extreme, namely, experiments with full gory detail. Introduction to Simulations on GPUs Kalyan Perumalla ACM International Symposium on Distributed Simulations and Real-time Applications 2009 http://www.cs.unibo.it/ds-rt2009/ Abstract: Graphical processing units (GPUs) are now established as efficient, alternative computing platforms for certain niche applications. Computationally intensive simulations are among applications that can utilize GPUs as computing co-processors. This tutorial introduces the concepts and algorithms for executing simulations on GPUs. Algorithmic aspects for multi-pass execution of time-stepped simulations, and refinements for discrete event execution are described. Examples from applications such as agent-based simulations will be used to illustrate implementations, with source code extracts. 
Also briefly introduced is advanced material such as use of clusters of multiple GPUs, using a combination of the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA). Implementation challenges, such as memory hierarchies and latency hiding needs, will be described. The tutorial is structured to minimize duplication of existing GPU literature, but to be self-contained and customized for simulation applications. Switching to High Gear: Opportunities for Grand-scale Real-time Parallel Simulations Kalyan Perumalla ACM International Symposium on Distributed Simulations and Real-time Applications 2009 http://www.cs.unibo.it/ds-rt2009/ Keynote Talk Abstract: The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher gear to exploit the new, immense levels of computational power. The potential for grand-scale real-time solutions is illustrated using preliminary results from prototypes in four example application areas: (a) state- or regional-scale vehicular mobility modeling, (b) very large-scale epidemic modeling, (c) modeling the propagation of wireless network signals in very large, cluttered terrains, and, (d) country- or world-scale social behavioral modeling. We believe the stage is perfectly poised for the parallel/distributed simulation community to envision and formulate similar grand-scale, real-time simulation-based solutions in many application areas. 
Reversible Discrete Event Formulation and Optimistic Parallel Execution of Vehicular Traffic Models Kalyan Perumalla and Srikanth Yoginath International Journal of Simulation and Process Modeling 2009 Vol. 5 No. 2 Abstract: Vehicular traffic simulations are useful in applications such as emergency planning and traffic management. High speed of traffic simulations translates to speed of response and level of resilience in those applications. Discrete event formulation of traffic flow at the level of individual vehicles affords both the flexibility of simulating complex scenarios of vehicular flow behavior as well as rapid simulation time advances. 
However, efficient parallel/distributed execution of the models becomes challenging due to synchronization overheads. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Our approach resolves the challenges that arise in parallel execution of microscopic, vehicular-level models of traffic. We apply a reverse computation-based optimistic execution approach to address the parallel synchronization problem. This is achieved by formulating a reversible version of a discrete event model of vehicular traffic, and by utilizing this reversible model in an optimistic execution setting. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed close to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system is presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance. The benefits of optimistic execution are demonstrated, including a speed up of nearly 20 on 32 processors observed on a vehicular network of over 65,000 intersections and over 13 million vehicles. GPU-based Real-Time Execution of Vehicular Mobility Models in Large-Scale Road Network Scenarios Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath and Sudip Seal Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2009 Abstract: A methodology and its associated algorithms are presented for mapping a novel, field-based vehicular mobility model onto graphical processing unit computational platform for simulating mobility in large-scale road networks. 
Of particular focus is the achievement of real-time execution, on desktop platforms, of vehicular mobility on road networks comprised of millions of nodes and links, and multi-million counts of simultaneously active vehicles. The methodology is realized in a system called GARFIELD, whose implementation details and performance study are described. The runtime characteristics of a prototype implementation are presented that show real-time performance in simulations of networks at the scale of a few states of the US road networks. A Connectionist Modeling Approach to Rapid Analysis of Emergent Social Cognition Properties in Large-Populations Kalyan S. Perumalla and Jack C. Schryver Human Behavior-Computational Modeling and Interoperability Conference 2009 Abstract: Traditional modeling methodologies, such as those based on rule-based agent modeling, are exhibiting limitations in application to rich behavioral scenarios, especially when applied to large population aggregates. Here, we propose a new modeling methodology based on a well-known "connectionist approach," and articulate its pertinence in new applications of interest. This methodology is designed to address challenges such as speed of model development, model customization, model reuse across disparate geographic/cultural regions, and rapid and incremental updates to models over time. 
Coping at the User-Level with Resource Limitations in the Cray Message Passing Toolkit MPI at Scale: How Not to Spend Your Summer Vacation Richard Mills, Forrest Hoffman, Patrick Worley, Kalyan Perumalla, Art Mirin, Glenn Hammond and Barry Smith Proceedings of Cray User Group Meeting 2009
Scalable Parallel Execution of an Event-based Radio Signal Propagation Model for Cluttered 3D Terrains Sudip Seal and Kalyan Perumalla Proceedings of International Conference on Parallel Processing 2009
2008
Efficient Execution on GPUs of Field-based Vehicular Mobility Models Kalyan Perumalla Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008
High Performance Computing-based Experimentation for Cyber Infrastructure and Security Kalyan Perumalla Lawrence Livermore National Laboratory 2008
Feasibility, Efficiency and Limits of Compiler-based Automation for Reversibility of Codes Kalyan Perumalla Lawrence Livermore National Laboratory 2008 Seminar
Data Parallel Execution Challenges and Runtime Performance of Agent Simulations on GPUs Kalyan Perumalla and Brandon Aaby Proceedings of Spring Computer Simulation Conference 2008 Best Paper Award
On the Reversibility of Newton-Raphson Root-Finding Method Kalyan Perumalla, John Wright and Phani Kuruganti Oak Ridge National Laboratory 2008 Technical Report ORNL-2007/152
Parallel Vehicular Traffic Simulations using Reverse Computation-based Optimistic Execution Srikanth Yoginath and Kalyan Perumalla Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2008
2007
A New Methodology for Multi-scale Simulation of Plasmas Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla Lecture Book on Advanced Methods for Space Engineering 2007
A New Methodology for Multi-scale Simulation of Plasmas Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla ISS-7 Lecture Notes on Advanced Methods for Space 
Simulations 2007
Scaling Time Warp-based Discrete Event Execution to 10^4 Processors on a Blue Gene Supercomputer
Handling Time Management under the HLA Kalyan Perumalla Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2007
Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Application-Level Asynchronous Speculative Execution Kalyan Perumalla IBM T.J. Watson Research Center, Yorktown Heights, New York 2007
An Analysis Approach to Large-Scale Vehicular Network Simulations
Efficient Parallel Execution of Event-Driven Electromagnetic Hybrid Models Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi International Journal for Multi-scale Computational Engineering 2007 Vol. 5, No. 1
2006
Integrated Analysis of Environment-Driven Operational Effects in Sensor Networks Alfred Park and Kalyan Perumalla Oak Ridge National Laboratory 2006 Technical Report ORNL-2006/537
On Evaluation Needs of Real-life Sensor Network Deployments Alfred Park, Kalyan S. Perumalla, Vladimir Protopopescu, Mallikarjun Shankar, Frank DeNap and Bryan Gorman Proceedings of European Modeling and Simulation Symposium (EMSS) 2006
Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
A Systems Approach to Scalable Transportation Network Modeling
Parallel and Distributed Simulation: Traditional Techniques and Recent Advances
Handling Time Management under the HLA Kalyan Perumalla Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2006
Discrete-Event Execution Alternatives on GPGPUs Kalyan S. Perumalla Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006
On Accounting for the Interplay of Kinetic and Non-Kinetic Aspects of Population Mobility Models
Parallel Execution of Region-Scale Evacuation Traffic Models Kalyan S. 
Perumalla Proceedings of International Workshop on Principles of Advanced and Distributed Simulation 2006 Scalable Simulation of Electromagnetic Hybrid Codes Kalyan S. Perumalla, Richard M. Fujimoto, Homa Karimabadi Proceedings of International Conference on Computational Science 2006 Network Simulation Richard Fujimoto, Kalyan Perumalla and George Riley Morgan & Claypool Publishers 2006 ISBN 1598291106 Optimistic Simulation of Physical Systems using Reverse Computation Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko Simulation: Transactions of the Society for Modeling and Simulation International 2006 Vol. 82, No. 1 2005Virtual Simulator: An Infrastructure for Design and Performance-Prediction of Massively Parallel Codes Kalyan Perumalla, Richard Fujimoto, Santosh Pande, Homa Karimabadi, Jonathan Driscoll, and Yuri Omelchenko Proceedings of Eos Transactions, American Geophysical Union Fall Meeting 2005 Abstract and presentation Abstract: Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for design and prediction of parallel codes. Performance prediction is useful to analyze, understand and experiment with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with least perturbance to performance) is useful to glean key metrics and understand application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. 
We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general-purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code. Parallel Discrete Event Simulations of Grid-based Models: Asynchronous Electromagnetic Hybrid Code Homa Karimabadi, Jonathan Driscoll, Jagrut Dave, Yuri Omelchenko, Kalyan Perumalla, Richard Fujimoto and N. 
Omidi Springer Lecture Notes in Computer Science 2005 A New Simulation Technique for Study of Collision-less Shocks: Self Adaptive Simulations Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla Proceedings of 4th Annual International Astrophysics Conference (IGPP) 2005 A New Methodology for Multi-scale Simulation of Plasmas Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla Proceedings of 7th International Symposium for Space Simulations (ISSS) 2005 Parallel and Distributed Systems and the High-Level Architecture Kalyan Perumalla Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2005 Distributed Simulation Systems & the High Level Architecture (HLA) Kalyan Perumalla Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2005 Computational Tools for Efficient Large-scale Discrete-Event Models Kalyan Perumalla Oak Ridge National Laboratory 2005 Computational Methods for Efficient Large-scale System Models Kalyan Perumalla Indiana University Purdue University 2005 µsik - A Micro-kernel for Parallel/Distributed Simulation Systems Kalyan S. Perumalla Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005 Performance Prediction of Large-scale Parallel Discrete Event Models of Physical Systems Kalyan S. Perumalla, Richard M. 
Fujimoto, Prashant Thakare, Santosh Pande, Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll Proceedings of Winter Simulation Conference (WSC) 2005 Optimistic Parallel Discrete Event Simulations of Physical Systems using Reverse Computation Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2005 2004Conservative Synchronization of Large-scale Network Simulations Alfred Park, Richard Fujimoto and Kalyan Perumalla Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2004 A Federated Approach to Distributed Network Simulation George Riley, Mostafa Ammar, Richard Fujimoto, Alfred Park, Kalyan Perumalla and Donghua Xu ACM Transactions on Modeling and Computer Simulation (TOMACS) 2004 Vol. 14, No. 2 Abstract: We describe an approach and our experiences in applying federated simulation techniques to create large-scale parallel simulations of computer networks. Using the federated approach, the topology and the protocol stack of the simulated network is partitioned into a number of submodels, and a simulation process is instantiated for each one. Runtime infrastructure software provides services for interprocess communication and synchronization (time management). We first describe issues that arise in homogeneous federations where a sequential simulator is federated with itself to realize a parallel implementation. We then describe additional issues that must be addressed in heterogeneous federations composed of different network simulation packages, and describe a dynamic simulation backplane mechanism that facilitates interoperability among different network simulators. 
Specifically, the dynamic simulation backplane provides a means of addressing key issues that arise in federating different network simulators: differing packet representations, incomplete implementations of network protocol models, and differing levels of detail among the simulation processes. We discuss two different methods for using the backplane for interactions between heterogeneous simulators: the cross-protocol stack method and the split-protocol stack method. Finally, results from an experimental study are presented for both the homogeneous and heterogeneous cases that provide evidence of the scalability of our federated approach on two moderately sized computing clusters. Two different homogeneous implementations are described: Parallel/Distributed ns (pdns) and the Georgia Tech Network Simulator (GTNetS). Results of a heterogeneous implementation federating ns with GloMoSim are described. This research demonstrates that federated simulations are a viable approach to realizing efficient parallel network simulation tools. A New Approach to Modeling Physical Systems: Discrete Event Simulations of Grid-based Models Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, N. 
Omidi, Richard Fujimoto and Kalyan Perumalla Proceedings of Workshop on State-Of-The-Art in Scientific Computing (PARA) 2004 Distributed Simulation Systems & the High Level Architecture (HLA) Kalyan Perumalla Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2004 High Fidelity Modeling of Computer Network Worms Kalyan Perumalla and Srikanth Sundaragopalan Proceedings of Annual Computer Security Applications Conference (ACSAC) 2004 2003Generating Perfect Reversals of Simple Linear Codes Kalyan Perumalla Center for Experimental Research in Computing Systems, Georgia Institute of Technology 2003 Technical Report GIT-CERCS-TR-03-04 Techniques for Improving Accuracy and Usability in Large-scale Network Emulation Kalyan Perumalla College of Computing, Georgia Institute of Technology 2003 Technical Report GIT-CC-03-04 Achieving Interoperability and Scalability in Simulation of Networks Kalyan Perumalla University of Louisville, Louisville, Kentucky 2003 Scalable RTI-based Parallel Simulation of Networks Kalyan Perumalla, Alfred Park, Richard Fujimoto and George Riley Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003 Large-Scale Network Simulation - How Big? How Fast? 
Richard Fujimoto, Kalyan Perumalla, Alfred Park, Hao Wu, Mostafa Ammar, and George Riley Proceedings of IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunication Systems (MASC 2003 Power-aware State Dissemination in Mobile Distributed Virtual Environments Weidong Shi, Kalyan Perumalla and Richard Fujimoto Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2003 2002Using Reverse Circuit Execution for Efficient Parallel Simulation of Logic Circuits Kalyan Perumalla, and Richard Fujimoto Proceedings of The International Society for Optical Engineering (SPIE) Annual Meeting 2002 Experiences Applying Parallel and Interoperable Network Simulation Techniques in On-line Simulations of Military Networks Kalyan Perumalla, Richard Fujimoto, Thom McLean and George Riley Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002 Web Services for Extensible Modeling and Simulation Kalyan S. Perumalla Proceedings of Workshop on Extensible Modeling and Simulation Framework (XMSF) 2002 Updateable Simulations Steve Ferenci, Richard Fujimoto, Mostafa Ammar, Kalyan Perumalla and George Riley Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2002 2001Distributed Network Simulations using the Dynamic Simulation Backplane George Riley, Mostafa Ammar, Richard Fujimoto, Donghua Xu and Kalyan Perumalla Proceedings of the International Conference on Distributed Computing Systems, April 2001 (ICDCS) 2001 Interactive Parallel Simulations with the JANE Framework Kalyan Perumalla and Richard Fujimoto Future Generation Computer Systems 2001 Vol. 17, No. 
5, Elsevier Science Virtual Time Synchronization over Unreliable Network Transport Kalyan Perumalla and Richard Fujimoto Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2001 2000Using Reverse Computation towards Efficient Parallel/Distributed Computation Kalyan Perumalla Bell Labs, Lucent Technologies, Murray Hill, New Jersey 2000 Using Reverse Computation towards Efficient Parallel/Distributed Computation Kalyan Perumalla IBM T J Watson Research Center, Yorktown Heights, New York 2000 Parallel Simulation Backplanes for Mixed Signal Circuit Design Richard Fujimoto, Kalyan Perumalla and Liang Xiao, Giorgio Casinovi, Madhavan Swaminathan, Siddharth Dalmia, J. Mao Yamacraw Research Report, Georgia Institute of Technology 2000 Technical Report IAB-10-2000 Design of High-performance RTI software Richard Fujimoto, Thom McLean, Kalyan Perumalla and Ivan Tacic Proceedings of Distributed Simulations and Real-time Applications (DS-RT) 2000 An Approach to Federating Parallel Simulators Steve Ferenci, Kalyan Perumalla and Richard Fujimoto Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 2000 1999Efficient Optimistic Parallel Simulations using Reverse Computation Christopher Carothers, Kalyan Perumalla and Richard Fujimoto ACM Transactions on Modeling and Computer Simulation (TOMACS) 1999 Vol. 9, No. 
3 The Effect of State Saving in Optimistic Simulation on a Cache-coherent Non-uniform Memory Access (CC-NUMA) Architecture Christopher Carothers, Kalyan Perumalla and Richard Fujimoto Proceedings of the Winter Simulation Conference 1999 Efficient Optimistic Parallel Simulation using Reverse Computation Christopher Carothers, Kalyan Perumalla and Richard Fujimoto Proceedings of ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation (PADS) 1999 PARINO: A Parallel Branch and Cut Code Jeff Linderoth, Kalyan Perumalla and Martin Savelsbergh Proceedings of INFORMS National Meeting 1999 Source Code Transformations for Efficient Reversibility Kalyan Perumalla and Richard Fujimoto College of Computing, Georgia Institute of Technology 1999 Technical Report GIT-CC-99-21 The High Level Architecture for Simulation Richard Fujimoto, Katherine Morse, Richard Weatherly, and Kalyan Perumalla ACM/IEEE/SCS Workshop on Parallel and Distributed Simulation 1999 1998Towards Reusable Modeling and Parallel Simulation of Telecommunication Networks Kalyan Perumalla Bell Labs, Lucent Technologies, Murray Hill, New Jersey 1998 Efficient Large-scale Process-oriented Parallel Simulations TeD - A Language for Modeling Telecommunication Networks Kalyan Perumalla, Andrew Ogielski, Richard Fujimoto ACM Performance Evaluation Review 1998 Vol. 25, No. 4 TeD Models for ATM Internetworks Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt ACM Performance Evaluation Review 1998 Vol. 25, No. 4 Parallel Simulation Techniques for Large-scale Networks Sandeep Bhatt, Richard Fujimoto, Andrew Ogielski and Kalyan Perumalla IEEE Communications 1998 Vol. 36, No. 
8 1997Time Parallel Generation of Self-similar ATM Traffic Ioannis Nikolaidis, Anthony Cooper, Kalyan Perumalla and Richard Fujimoto Proceedings of the Winter Simulation Conference (WSC) 1997 Reusable Modeling and Parallel Simulation of Networks using the TeD Language Kalyan Perumalla WINLAB, Rutgers University, Piscataway, New Jersey 1997 PARINO: An Extensible Framework for Solving Mixed Integer Programs in Parallel Kalyan Perumalla, Martin Savelsbergh and Umakishore Ramachandran College of Computing, Georgia Institute of Technology 1997 Technical Report GIT-CC-97-07 A Virtual PNNI Network Testbed Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt Proceedings of the Winter Simulation Conference (WSC) 1997 PARINO, A Parallel Integer Optimizer Martin Savelsbergh, Kalyan Perumalla, Jeff Linderoth, and Umakishore Ramachandran Proceedings of International Symposium on Mathematical Programming 1997 1996GTW++ -- An Object Oriented Interface in C++ to the Georgia Tech Time Warp System Kalyan Perumalla and Richard Fujimoto College of Computing, Georgia Institute of Technology 1996 Technical Report GIT-CC-96-09 A C++ Instance of TeD Kalyan Perumalla, and Richard Fujimoto College of Computing, Georgia Institute of Technology 1996 Technical Report GIT-CC-96-33 An Efficiency Prediction Method for ATM Multiplexer Kalyan Perumalla, Anthony Cooper, Richard Fujimoto Proceedings of Broadband Communications 1996 MetaTeD - A Meta Language for Modeling Telecommunication Networks Kalyan Perumalla, Richard Fujimoto and Andrew Ogielski College of Computing, Georgia Institute of Technology 1996 Technical Report GIT-CC-96-32 1995A Performance Prediction Method for ATM Multiplexers C. 
Anthony Cooper and Kalyan Perumalla Bell Communications Research (Bellcore) 1995 Technical Memorandum TM-25152 Parallel Algorithms for Maximum Sub-sequence and Sub-array 1994Parallelizing Sequential Algorithms for the Generalized Assignment Problem Ivan Yanasak, Gautam Shah, Kalyan Perumalla, et al DIMACS Challenge of Parallel Computing 1994 Parallel Algorithms for Maximum Sub-sequence and Sub-array Kalyan Perumalla and Narsingh Deo Proceedings of International Conference on Combinatorics, Graph Theory and Computing 1994 1993Integrating Aggregate and Vehicle Level Simulations Clark Karr, Robert Francescini and Kalyan Perumalla Proceedings of the 3rd Conference on Computer Generated Forces and Behavioral Representation 1993 A Distributed Algorithm for Ear Decomposition Sridhar Hannenhalli, Kalyan Perumalla and Narayan Chandrasekharan Proceedings of International Conference on Computing and Information (ICCI) 1993 A Debugging Environment for PVM Uday Vemulapati and Kalyan Perumalla Distributed Computing for Aeroscience Applications 1993 1992Integrating Battlefield Simulations of Different Granularity Clark Karr, Robert Francescini and Kalyan Perumalla Proceedings of the Southeastern Simulation Conference 1992