Publication type: journal [total 15]


2010


Bcyclic: A Parallel Block Tri-diagonal Matrix Cyclic Solver
Steven Hirshman, Kalyan Perumalla, Vickie Lynch and Raul Sanchez
Journal of Computational Physics 2010
http://www.sciencedirect.com/science/journal/00219991

Accepted April 30, 2010

Abstract: A block tri-diagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as is its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.





Towards Highly Interactive, GPU-based Evaluation of Evacuation Transport Scenarios at State-Scale
Kalyan Perumalla, Brandon Aaby, Srikanth Yoginath, Sudip Seal
Journal of Transportation Safety and Security 2010
http://stc.utk.edu/tss

Accepted

Abstract: In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real-time simulators exist that can scale to evacuation traffic scenarios at the level of an entire state such as Louisiana (approx. 1 million links) or Florida (2.5 million links). New modeling techniques are needed to overcome the severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a modeling and execution methodology is explored which holds potential to provide a tradeoff among the level of behavioral detail, the scale of the transportation network, and real-time execution capabilities. A novel, field-based modeling technique and its implementation on graphics processing units (GPUs) are presented, as a step forward in enabling large-network transportation modeling and simulation. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at hitherto unimaginable scales of multi-million road segments. Illustrative experiments on a few state-scale networks are described based on our implementation of this approach in a software system called GARFIELD-EVAC.





Reversible Parallel Discrete Event Formulation of a TLM-based Radio Signal Propagation Model
Sudip Seal and Kalyan Perumalla
ACM Transactions on Modeling and Computer Simulation (TOMACS) 2010
http://linklings.net/tomacs

Under review

Abstract: Radio signal strength estimation is essential in many applications, including the design of military radio communications and industrial wireless installations. For scenarios with large or richly-featured geographical volumes, parallel processing is required to meet the memory and computation time demands. Here, we present a scalable and efficient parallel execution of the sequential model for radio signal propagation recently developed by Nutaro et al. Starting with that model, we (a) provide a vector-based reformulation that has significantly lower computational overhead for event handling, (b) develop a parallel decomposition approach that is amenable to reversibility with minimal computational overheads, (c) present a framework for transparently mapping the conservative time-stepped model into an optimistic parallel discrete event execution, (d) present a new reversible method, along with its analysis and implementation, for inverting the vector-based event model to be executed in an optimistic parallel style of execution, and (e) present performance results from implementation on Cray XT platforms. We demonstrate scalability, with the largest runs tested on up to 127,500 cores of a Cray XT5, enabling simulation of larger scenarios and with faster execution than reported before on the radio propagation model. This also represents the first successful demonstration of the ability to efficiently map a conservative time-stepped model to an optimistic discrete-event execution.





2009


Reversible Discrete Event Formulation and Optimistic Parallel Execution of Vehicular Traffic Models
Kalyan Perumalla and Srikanth Yoginath
International Journal of Simulation and Process Modeling 2009


Vol. 5 No. 2

Abstract: Vehicular traffic simulations are useful in applications such as emergency planning and traffic management. High speed of traffic simulations translates to speed of response and level of resilience in those applications. Discrete event formulation of traffic flow at the level of individual vehicles affords both the flexibility of simulating complex scenarios of vehicular flow behavior and rapid simulation time advances. However, efficient parallel/distributed execution of the models becomes challenging due to synchronization overheads. Here, a parallel traffic simulation approach is presented that is aimed at reducing the time for simulating emergency vehicular traffic scenarios. Our approach resolves the challenges that arise in parallel execution of microscopic, vehicular-level models of traffic. We apply a reverse computation-based optimistic execution approach to address the parallel synchronization problem. This is achieved by formulating a reversible version of a discrete event model of vehicular traffic, and by utilizing this reversible model in an optimistic execution setting. Three unique aspects of this effort are: (1) exploration of optimistic simulation applied to vehicular traffic simulation, (2) addressing reverse computation challenges specific to optimistic vehicular traffic simulation, and (3) achieving absolute (as opposed to self-relative) speedup with a sequential speed close to that of a fast, de facto standard sequential simulator for emergency traffic. The design and development of the parallel simulation system are presented, along with a performance study that demonstrates excellent sequential performance as well as parallel performance. The benefits of optimistic execution are demonstrated, including a speedup of nearly 20 on 32 processors observed on a vehicular network of over 65,000 intersections and over 13 million vehicles.





2007


A New Methodology for Multi-scale Simulation of Plasmas
Homa Karimabadi, Yuri Omelchenko, Jonathan Driscoll, Richard Fujimoto and Kalyan Perumalla
ISS-7 Lecture Notes on Advanced Methods for Space Simulations 2007




Efficient Parallel Execution of Event-Driven Electromagnetic Hybrid Models
Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi
International Journal for Multi-scale Computational Engineering 2007


Vol. 5, No. 1



2006


Optimistic Simulation of Physical Systems using Reverse Computation
Yarong Tang, Kalyan Perumalla, Richard Fujimoto, Homa Karimabadi, Jonathan Driscoll and Yuri Omelchenko
Simulation: Transactions of the Society for Modeling and Simulation International 2006


Vol. 82, No. 1





2005


Parallel Discrete Event Simulations of Grid-based Models: Asynchronous Electromagnetic Hybrid Code
Homa Karimabadi, Jonathan Driscoll, Jagrut Dave, Yuri Omelchenko, Kalyan Perumalla, Richard Fujimoto and N. Omidi
Springer Lecture Notes in Computer Science 2005






2004


A Federated Approach to Distributed Network Simulation
George Riley, Mostafa Ammar, Richard Fujimoto, Alfred Park, Kalyan Perumalla and Donghua Xu
ACM Transactions on Modeling and Computer Simulation (TOMACS) 2004


Vol. 14, No. 2

Abstract: We describe an approach and our experiences in applying federated simulation techniques to create large-scale parallel simulations of computer networks. Using the federated approach, the topology and the protocol stack of the simulated network are partitioned into a number of submodels, and a simulation process is instantiated for each one. Runtime infrastructure software provides services for interprocess communication and synchronization (time management). We first describe issues that arise in homogeneous federations where a sequential simulator is federated with itself to realize a parallel implementation. We then describe additional issues that must be addressed in heterogeneous federations composed of different network simulation packages, and describe a dynamic simulation backplane mechanism that facilitates interoperability among different network simulators. Specifically, the dynamic simulation backplane provides a means of addressing key issues that arise in federating different network simulators: differing packet representations, incomplete implementations of network protocol models, and differing levels of detail among the simulation processes. We discuss two different methods for using the backplane for interactions between heterogeneous simulators: the cross-protocol stack method and the split-protocol stack method. Finally, results from an experimental study are presented for both the homogeneous and heterogeneous cases that provide evidence of the scalability of our federated approach on two moderately sized computing clusters. Two different homogeneous implementations are described: Parallel/Distributed ns (pdns) and the Georgia Tech Network Simulator (GTNetS). Results of a heterogeneous implementation federating ns with GloMoSim are described. This research demonstrates that federated simulations are a viable approach to realizing efficient parallel network simulation tools.





2001


Interactive Parallel Simulations with the JANE Framework
Kalyan Perumalla and Richard Fujimoto
Future Generation Computer Systems 2001


Vol. 17, No. 5, Elsevier Science





1999


Efficient Optimistic Parallel Simulations using Reverse Computation
Christopher Carothers, Kalyan Perumalla and Richard Fujimoto
ACM Transactions on Modeling and Computer Simulation (TOMACS) 1999


Vol. 9, No. 3





1998


TeD - A Language for Modeling Telecommunication Networks
Kalyan Perumalla, Andrew Ogielski, Richard Fujimoto
ACM Performance Evaluation Review 1998


Vol. 25, No. 4





TeD Models for ATM Internetworks
Kalyan Perumalla, Matthew Andrews and Sandeep Bhatt
ACM Performance Evaluation Review 1998


Vol. 25, No. 4





Parallel Simulation Techniques for Large-scale Networks
Sandeep Bhatt, Richard Fujimoto, Andrew Ogielski and Kalyan Perumalla
IEEE Communications 1998


Vol. 36, No. 8





1995


Parallel Algorithms for Maximum Sub-sequence and Sub-array
Kalyan Perumalla and Narsingh Deo
Parallel Processing Letters 1995


Vol. 5, No. 3







Copyright © Perumalla 2009-2010