This is a report of the runtime performance profiling efforts and results for the ECP ExaSGD project (Exascale Computing Project – Stochastic Grid Dynamics at Exascale).
This paper presents a new methodology and algorithms to automatically identify cybersecurity-related claims expressed in natural language form in ICS device documents. The verification pipeline includes automated vendor identification, device document curation, and feature claim identification via sentiment analysis. Our novel matching engine represents the first automated information system available in the cybersecurity domain to directly aid ICS compliance reporting.
We introduce ClaimsBERT, a feature-claims classifier for cybersecurity literature analytics. ClaimsBERT outperforms all other evaluated frameworks aimed at improving the cybersecurity of industrial control systems (ICS).
This report captures a computer science-oriented view of research in PDES and important PDES applications. Needs are outlined in core areas of PDES research as well as cross-cutting directions that positively impact scientific advancement. A selection of priority research directions in advanced computing for PDES is identified.
We extend the classic Naming Game to multiple hearers per speaker in each conversation even while allowing simultaneous speaking and hearing. We simulate the impact on the rate of convergence by varying the number of hearers and investigate the impact of different network types on the global convergence. Multiple network types and agent population sizes are used in the simulation experiments.
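For illustration, the following is a minimal sketch of one multi-hearer conversation round, assuming a simple generalization of the classic success rule (agreeing parties collapse to the spoken word); the paper's exact update rule, network types, and population sizes are not reproduced here.

```python
# Minimal multi-hearer Naming Game round (assumed success rule:
# agreeing hearers and the speaker collapse to the spoken word).
import random

def play_round(vocab, network, num_hearers=3, rng=random):
    """One conversation: a random speaker utters a word to several neighbors."""
    speaker = rng.choice(list(network))
    hearers = rng.sample(network[speaker],
                         min(num_hearers, len(network[speaker])))
    if not vocab[speaker]:
        vocab[speaker].add(f"w{rng.randrange(10**6)}")  # invent a new word
    word = rng.choice(sorted(vocab[speaker]))
    successes = [h for h in hearers if word in vocab[h]]
    for h in successes:
        vocab[h] = {word}               # agreement: collapse to the word
    for h in set(hearers) - set(successes):
        vocab[h].add(word)              # failure: hearer learns the word
    if successes:
        vocab[speaker] = {word}         # speaker collapses on any success

# Toy run on a ring of 20 agents.
network = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
vocab = {i: set() for i in network}
for _ in range(20_000):
    play_round(vocab, network)
print(len({w for v in vocab.values() for w in v}), "distinct words remain")
```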
A characterization of software development metadata is presented in terms of the distributions that best capture the trends in the datasets. This characterization feeds the machine learning components of ZeroIn, which exploit connectivity among the sets of repositories, commits, and developers.
Using multiple classifiers, we verify the feasibility of using metadata from synthetic datasets modeled by a characterization of a few large software repositories and developer profiles. Results show that the metadata-based learning approach appears promising for early flagging of potentially buggy commits in software repositories.
Two key cyber elements affecting the grid’s timing capability are Cyber Resilience and Cyber Trust. The timing services of DarkNet are systematically subjected to a range of cyber phenomena that stress four key performance factors, namely: Accuracy, Manageability, Telemetry, and Visibility. This analysis is designed to provide insights into four important categories of undesirable cyber phenomena: Loss of View (LoV), Loss of Control (LoC), Manipulation of View (MoV), and Manipulation of Control (MoC).
We present preliminary results from characterizing the distribution of 452 million commits in a metadata listing from GitHub repositories. Based on multiple distributions, we find the best and second-best fits across different ranges in the data. The characterization is aimed at synthetic repository generation suitable for use in simulation and machine learning.
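As an illustration of the fitting step, the sketch below ranks a few candidate distributions by log-likelihood over synthetic commit counts; the candidate set and the placeholder data are assumptions, not the GitHub metadata itself.

```python
# Sketch: fit candidate distributions to commit counts and rank fits by
# log-likelihood to obtain best and second-best fits. Synthetic data
# stands in for the real GitHub metadata.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
commits = rng.lognormal(mean=3.0, sigma=1.2, size=10_000)  # placeholder

candidates = {"lognorm": stats.lognorm, "gamma": stats.gamma,
              "expon": stats.expon}
scores = {}
for name, dist in candidates.items():
    params = dist.fit(commits)                       # maximum-likelihood fit
    scores[name] = np.sum(dist.logpdf(commits, *params))

ranked = sorted(scores, key=scores.get, reverse=True)
print("best fit:", ranked[0], " second-best fit:", ranked[1])
```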
We introduce CyBERT, a cybersecurity feature claims classifier based on bidirectional encoder representations from transformers and a key component in our semi-automated cybersecurity vetting for industrial control systems (ICS)… The results showed that CyBERT outperforms these models on validation accuracy and F1 score, validating CyBERT’s robustness and accuracy as a cybersecurity feature claims classifier.
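For readers unfamiliar with the underlying mechanics, the sketch below shows how a BERT-based sequence classifier is typically applied to a candidate claim sentence via the Hugging Face transformers API; the generic bert-base-uncased checkpoint and the two-label scheme are stand-ins, not the actual CyBERT model.

```python
# Sketch of BERT-based sentence classification, as a stand-in for a
# feature-claims classifier like CyBERT. The checkpoint below is the
# generic "bert-base-uncased" model, not the CyBERT weights, so its
# untrained classification head gives arbitrary labels until fine-tuned.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # claim vs. non-claim (assumed labels)
)

sentence = "The device supports TLS 1.2 for all remote connections."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print("predicted label:", logits.argmax(dim=-1).item())
```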
A novel algorithm has been designed, developed, and implemented on modern GPU accelerators, and benchmarked on networks with billions of edges, including Facebook and Twitter networks. Rates of generation exceed 50 billion edges per second.
Mixed Integer Programming (MIP) is a powerful abstraction in combinatorial optimization… Here, we recount the conventional processor-based strategies and focus on configurations where the most promising intersection lies between parallel MIP solver approaches and the specific strengths of accelerated parallel platforms.
A mesoscopic modeling approach is described that strikes a middle ground between macroscopic models based on coupled differential equations and microscopic models built on fine-grained behaviors at the individual entity level. Execution of our implementation scaled to 8192 GPUs of supercomputing platforms demonstrates the ability to rapidly evaluate what-if scenarios several orders of magnitude faster than the conventional methods.
A simplified circuit is used as a case study to uncover and highlight key considerations in the use of traditional numerical simulation methods, and to compare their results with those obtained from alternative methods that are discrete event-based from the outset. Results show the regimes in which the traditional numerical methods and the alternative discrete event methods are applicable, and highlight the need for discrete event approaches that precisely and efficiently resolve the switching dynamics produced by power electronics systems important in emerging grid scenarios, such as large-scale renewable energy.
Cyber-physical systems span a wide spectrum, from long-lived legacy systems to more modern installations. Trust is an issue that arises across the spectrum, albeit with different variants of goals and constraints. On one end of the spectrum, legacy systems are characterized by function-based designs in which trust is an implicitly built-in concept…
A solution is needed for vetting the vendor-supplied feature claims and their adherence to cybersecurity requirements and standards. We are presently engaged in an effort to develop such a system. This paper demonstrates one vital aspect of this effort by proposing an end-to-end framework to accumulate a large repository of ICS device information for this vetting system, curate the dataset, and conduct extensive processing. This framework is designed to use web scraping, data analytics, and Natural Language Processing (NLP) techniques to identify vendor websites, automate the collection of website-accessible documents, and automatically derive metadata from them for identification of product documents relevant to the repository…
Our algorithm cuPPA generates scale-free networks using the preferential-attachment model, custom-designed to exploit multiple GPUs. We generate extremely large scale-free networks of 4 trillion edges in less than 8 minutes using 1,008 NVIDIA Volta GPUs of the Summit supercomputer. This represents the first-ever graph network generation at this scale of parallel execution with over a thousand GPUs. Moreover, our algorithm is uniquely suitable for generating networks in a streaming mode without the need to explicitly store (write to disk) the entire network, and is suitable for targeting even larger scales with quadrillions of edges.
A more comprehensive and practical treatment of memory remains to be undertaken to visit and answer memory-related unknowns. Some of the answers could have a profound impact on classical memory technologies. Reversible execution restores symmetry between memory and computation, correctly reinstating the cost of memory state restoration in the aggregate memory cost.
We study the impact of edge additions on community structure using Lancichinetti-Fortunato-Radicchi (LFR) benchmark networks. We show that, for a fixed network size, the impact of edge additions is greater on networks with initially weak community structure than on networks with strongly clustered structures. We also find that the perceived impact depends on the community detection algorithm used to uncover the communities.
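The shape of such an experiment can be sketched with networkx: generate an LFR benchmark graph, add random edges, and compare the modularity of detected communities before and after. Parameter values below follow the networkx documentation example and are illustrative only.

```python
# Sketch: LFR benchmark graph, random edge additions, and before/after
# modularity of detected communities. Parameters are illustrative.
import random
import networkx as nx
from networkx.algorithms import community

G = nx.LFR_benchmark_graph(
    n=250, tau1=3, tau2=1.5, mu=0.1,
    average_degree=5, min_community=20, seed=10,
)
G.remove_edges_from(nx.selfloop_edges(G))

def detected_modularity(graph):
    """Modularity of communities found by a greedy detection algorithm."""
    comms = community.greedy_modularity_communities(graph)
    return community.modularity(graph, comms)

before = detected_modularity(G)

# Add 100 random edges between previously unconnected node pairs.
rng = random.Random(0)
nodes = list(G)
added = 0
while added < 100:
    u, v = rng.sample(nodes, 2)
    if not G.has_edge(u, v):
        G.add_edge(u, v)
        added += 1

after = detected_modularity(G)
print(f"modularity before: {before:.3f}, after: {after:.3f}")
```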
Cybersecurity auditing for Operational Technology (OT) is critical and has been largely missing from cybersecurity research, especially in the energy sector. In this paper, we present a novel “cybersecurity vetting” approach (CYVET) to the problem of verification and validation of cybersecurity in complex cyber-physical installations underlying modern energy grid systems.
We present message passing interface (MPI)-based distributed-memory parallel algorithms for generating random scale-free networks using the preferential-attachment model. The algorithms have been exercised at scale and speed to generate scale-free networks with one trillion edges in 6 minutes using 1,000 processing elements (PEs).
https://ashraem.confex.com/ashraem/w20/meetingapp.cgi/Paper/26352
We describe an approach to detecting and preventing cyber attacks by continuously comparing the infrastructure state with a real-time digital-twin simulation of it. Specifically, we describe and demonstrate a Digital Twin Framework (DTF) designed to detect and eventually prevent such attacks. The canal lock system’s digital twin uses a recurrent neural network trained from the experimental data collected via the DTF.
The objective of this work is to create an optimal schedule for HVAC operation that reduces cost while satisfying the homeowner's and the equipment's constraints, using model-free Reinforcement Learning (RL)-based optimization. This optimization is addressed through the development of an initial learning testbed and the implementation of RL techniques in a real home. Our preliminary results showed a 17% reduction in total cost and a 15% reduction in power utilization using our RL-based HVAC model, RL-HEMS.
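A minimal tabular Q-learning sketch conveys the flavor of such an optimization; the toy thermostat dynamics, flat tariff, and comfort band below are assumptions, not the RL-HEMS testbed.

```python
# Toy tabular Q-learning for HVAC scheduling: discretized indoor
# temperature, binary on/off action, reward penalizing energy cost
# and discomfort. All parameters are illustrative placeholders.
import random

TEMPS = range(15, 31)            # discretized indoor temperature (deg C)
ACTIONS = (0, 1)                 # 0 = HVAC off, 1 = HVAC on (cooling)
COMFORT = (20, 24)               # acceptable comfort band
PRICE = 0.15                     # flat tariff, $/kWh (assumed)

Q = {(t, a): 0.0 for t in TEMPS for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(temp, action):
    """Toy dynamics: cooling lowers temperature, idling lets it drift up."""
    nxt = max(15, temp - 1) if action else min(30, temp + 1)
    cost = PRICE if action else 0.0
    discomfort = max(0, COMFORT[0] - nxt) + max(0, nxt - COMFORT[1])
    return nxt, -(cost + discomfort)   # reward penalizes cost + discomfort

temp = 28
for _ in range(50_000):
    a = random.choice(ACTIONS) if random.random() < eps else \
        max(ACTIONS, key=lambda x: Q[(temp, x)])
    nxt, r = step(temp, a)
    best_next = max(Q[(nxt, x)] for x in ACTIONS)
    Q[(temp, a)] += alpha * (r + gamma * best_next - Q[(temp, a)])
    temp = nxt

policy = {t: max(ACTIONS, key=lambda a: Q[(t, a)]) for t in TEMPS}
print(policy)   # roughly: cool above the comfort band, idle below it
```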
Here, we focus on coordinated intelligence about normal and abnormal phenomena from multiple geographically co-located sensors monitoring and controlling a set of co-located devices. Given a set of co-located sensors, we develop an intelligent approach that automatically determines the 'normal' patterns of behavior among the correlated sensors. After normal behavior is extracted, subsequent monitoring detects deviant variations over time.
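The core idea can be sketched as follows: learn the pairwise correlation signature of co-located sensors over a normal window, then flag later windows whose signature deviates beyond a threshold. The signals and the threshold below are illustrative assumptions.

```python
# Sketch: baseline pairwise-correlation signature of co-located sensors,
# then deviation detection on a later window. Data and threshold are
# illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

def corr_signature(window):
    """Upper-triangular pairwise correlations of a (time x sensors) window."""
    c = np.corrcoef(window.T)
    iu = np.triu_indices_from(c, k=1)
    return c[iu]

# Training: two sensors tracking the same process, one independent.
t = np.linspace(0, 20, 2000)
base = np.sin(t)
normal = np.column_stack([base + 0.1 * rng.standard_normal(2000),
                          base + 0.1 * rng.standard_normal(2000),
                          rng.standard_normal(2000)])
baseline = corr_signature(normal)

# Monitoring: sensor 1 is tampered with (decoupled from the process).
tampered = normal.copy()
tampered[:, 1] = rng.standard_normal(2000)
deviation = np.max(np.abs(corr_signature(tampered) - baseline))
print("max correlation deviation:", round(float(deviation), 2))
print("alarm:", deviation > 0.5)   # threshold is an assumed parameter
```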
HPC facilities used for scientific computing draw enormous amounts of energy, consuming many megawatt-hours. Using simulation cloning, we reduce energy consumption and track the power drawn by thousands of GPUs, achieving significant aggregate energy savings from cloned simulations.
We empirically study the effectiveness of Recurrent Neural Network (RNN)-based models as the basis of digital twin (DT)-based resilience and uncover the important characteristics of an RNN-based solution through experimentation on a lab-scale Canal Lock CPS emulator with live validations and attack scenarios. For the first time, we demonstrate actual, real-time use of an RNN-based model as a DT performing live analysis on an operational CPS.
In this paper, we present our research and development efforts aimed at addressing the gap in discovering sensors at level 0 in industrial CPS by building a system called Deep-cyberia (Deep Cyber-Physical System Interrogation and Analysis) that incorporates algorithms and interfaces aimed at uncovering sensors and computing estimates of correlations among them.
Our focus is to extract volatile and dynamically changing internal information from CPS level 0-1 devices, and to design preliminary schemes that exploit the extracted information. As a case study, we apply the proposed methodology to a Modicon PLC using the Modbus protocol. We extract the memory layout and subject the device to read operations at the most critical regions of memory. This capability of generating a sequence of volatile memory snapshots for offline, detailed, and sophisticated analysis opens a new class of cybersecurity schemes for CPS forensic analysis, taint analysis, and watermarking.
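A sketch of the kind of read sweep involved is shown below using the pymodbus library; the host address, register range, and block size are placeholders, and the paper's actual extraction procedure for Modicon devices is not reproduced here.

```python
# Sketch: sweep holding-register reads over a device's register space
# with pymodbus, in the spirit of snapshotting volatile regions.
# HOST, ranges, and BLOCK are placeholders; exercise care on real PLCs.
from pymodbus.client import ModbusTcpClient

HOST = "192.168.1.10"      # placeholder PLC address
BLOCK = 64                 # registers per read request (assumed)

client = ModbusTcpClient(HOST)
client.connect()
snapshot = {}
for addr in range(0, 1024, BLOCK):
    rr = client.read_holding_registers(addr, count=BLOCK)
    if not rr.isError():
        snapshot[addr] = rr.registers   # one block of the memory snapshot
client.close()
print(f"captured {len(snapshot)} blocks")
```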
We propose a new redundancy reduction technique for large-scale discrete event simulations, called exact-differential simulation, which simulates only the altered portions of scenarios and their influences in repeated executions while still achieving the same results as the re-execution of entire simulations. We evaluate our approach using two case studies: the PHOLD benchmark and a traffic simulation of Tokyo.
In this paper, we report and analyze performance results from native execution of deep learning on a leadership-class high-performance computing (HPC) system. Using our new code called DeepEx, we present a study of the parallel speed up and convergence rates of learning achieved with native parallel execution. Scaling results are reported from execution on up to 15,000 GPUs using two scientific data sets from atom microscopy and protein folding applications, and also using the popular ImageNet data set.
A novel parallel algorithm, cuPPA, is presented for generating random scale-free networks using the preferential attachment model. The algorithm is custom-designed for the single-instruction multiple-data (SIMD) architecture of GPUs. Our algorithm is the first to exploit GPUs, and also the fastest implementation available today, for generating scale-free networks. On an NVIDIA GeForce 1080 GPU, cuPPA generates a scale-free network of two billion edges in less than 3 s. On multi-GPU platforms, cuPPA-Hash generates a scale-free network of 16 billion edges in less than 7 s using a machine consisting of 4 NVIDIA Tesla P100 GPUs.
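For reference, below is a sequential sketch of preferential attachment via the copy model, the formulation that GPU generators in this line of work build on: a new vertex attaches either to a uniformly random earlier vertex or to an endpoint of a uniformly random earlier edge, the latter being equivalent to degree-proportional selection. The mixing probability and sizes are illustrative, and no GPU parallelism is shown.

```python
# Sequential copy-model sketch of preferential attachment (illustrative;
# not the cuPPA GPU implementation).
import random

def preferential_attachment(n, d, p=0.5, seed=0):
    """Generate edges for n vertices, d edges per new vertex."""
    rng = random.Random(seed)
    edges = []
    for v in range(d, n):                   # vertices 0..d-1 seed the process
        targets = set()
        while len(targets) < d:
            if not edges or rng.random() < p:
                u = rng.randrange(v)         # uniform earlier vertex
            else:
                # Random endpoint of a random earlier edge: this choice is
                # degree-proportional, yielding preferential attachment.
                u = rng.choice(rng.choice(edges))
            targets.add(u)
        edges.extend((v, u) for u in targets)
    return edges

edges = preferential_attachment(n=10_000, d=4)
print(len(edges), "edges generated")
```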
Formation of a butterfly from a pupa, extraction of a live dove from a magician’s empty hat, generation of new particles from high-energy particle collisions and spawning a new dream world from mind in sleep are all examples of a common, fuzzy notion called ‘emergence’. In this paper, I pin the concept of emergence to the element of surprise in a phenomenon. I categorise the various notions of emergence into three main classes. These definitions are used to explain instances of emergence, organised along a continuous spectrum as normality, magic, miracle and error.
In this paper, we present an agent-based model (and a scalable approximation of it) in a closely related spirit. The central feature of this model is that wealth enables an individual to secure more wealth. Using historical data, we initialize the model with US wealth shares in 1988 and show that the model tracks wealth share changes from 1988 to 2012. Simulations to 2088 project that the top 0.01% of the population will possess more than 70% of the total wealth in the economy.
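A toy version of the "wealth begets wealth" mechanism can be sketched in a few lines: each round awards a unit of wealth to an agent chosen with probability proportional to current wealth. The population size, round count, and equal initial shares are illustrative, not the paper's 1988-calibrated data.

```python
# Toy "wealth begets wealth" dynamics: each round, one unit of wealth
# goes to an agent picked with probability proportional to wealth.
import random

def simulate(wealth, rounds, seed=0):
    rng = random.Random(seed)
    agents = range(len(wealth))
    for _ in range(rounds):
        winner = rng.choices(agents, weights=wealth, k=1)[0]
        wealth[winner] += 1.0
    return wealth

wealth = simulate([1.0] * 500, rounds=50_000)   # equal start (assumed)
top = sum(sorted(wealth, reverse=True)[:5])     # top 1% of 500 agents
print(f"top 1% wealth share: {top / sum(wealth):.1%}")
```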
https://www.osti.gov/biblio/1468255
Cloning is a technique to efficiently simulate a tree of multiple what-if scenarios that are unraveled during the course of a base simulation. We present the conceptual simulation framework, algorithmic foundations, and runtime interface of CloneX, a new system we designed for scalable simulation cloning.
A novel parallel algorithm, cuPPA, is presented for generating random scale-free networks using the preferential-attachment model. In one of the best cases, when executed on an NVIDIA GeForce 1080 GPU, cuPPA generates a scale-free network of two billion edges in less than 3 seconds.
Here, we develop a new concurrent model as a relaxation of the classical formulation of the Naming Game and express it in a discrete event style of evaluation. Further, with the uncovered concurrency that was absent in the classical algorithm, we map the concurrent model to parallel discrete event simulation. We present a parallel performance study on networks with hundreds of thousands of individuals.
A novel parallel algorithm, cuPPA, is presented for generating random scale-free networks using the preferential-attachment model. In one of the best cases, when executed on an NVIDIA GeForce 1080 GPU, cuPPA generates a scale-free network of a billion edges in less than 2 seconds.
https://www.sciencedirect.com/science/article/abs/pii/S016792601630044X
We define a new problem of computing the intersections among arbitrarily nested hollow spheres of possibly different sizes, thicknesses, positions, and nesting levels. We describe a new algorithm designed to solve this nested hollow sphere intersection problem and implement it for parallel execution on graphical processing units (GPUs). We present first results about the runtime performance and scaling to hundreds of thousands of spheres, and compare the performance with that from a leading solid object intersection package also running on GPUs.
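The per-pair predicate at the heart of this problem can be stated compactly: two spherical shells intersect exactly when the interval of distances from one center realized by the other shell overlaps the first shell's own [inner, outer] radius interval. The CPU sketch below implements only this pairwise test, not the paper's batched GPU algorithm.

```python
# Reference pairwise test for hollow (shell) sphere intersection.
# A shell is the region between an inner and an outer radius about a
# center. This is a CPU sketch of the predicate only.
import math

def shells_intersect(ca, ra_in, ra_out, cb, rb_in, rb_out):
    d = math.dist(ca, cb)
    # Distances from center A attainable by points of shell B form an
    # interval [lo, hi] (shell B is connected, distance is continuous):
    lo = max(0.0, d - rb_out, rb_in - d)
    hi = d + rb_out
    # Intersection iff that interval overlaps shell A's radius interval:
    return lo <= ra_out and ra_in <= hi

# Example: a small shell nested off-center inside a thin outer shell,
# reaching far enough to touch it.
print(shells_intersect((0, 0, 0), 0.9, 1.0, (0, 0, 0.5), 0.4, 0.6))  # True
```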
https://www.osti.gov/biblio/1347338
https://ieeexplore.ieee.org/abstract/document/7828553
https://www.sciencedirect.com/science/article/pii/S1571066116300706
https://ieeexplore.ieee.org/abstract/document/7560215
https://www.osti.gov/biblio/1408647
Andelfinger’s thesis online
There are many key concepts that, even while being part of everyday life, elude definition. One such is “Art.” Here, possible ways are identified to define Art, along with a description of a few factors that underlie the challenge in arriving at a definition. Additionally, a candidate definition from a scientist’s viewpoint is proposed for an abstract, encompassing model.
Cloud and virtual machine (VM) technologies present new challenges with respect to performance and monetary cost in executing parallel discrete event simulation (PDES) applications…
https://smartech.gatech.edu/bitstream/handle/1853/52321/YOGINATH-DISSERTATION-2014.pdf
A solution methodology and implementation components are presented that can uncover unwanted, unin-t…
Global virtual time (GVT) computation is a key determinant of the efficiency and runtime dynamics of parallel discrete event simulations (PDES)…
This tutorial provides an introduction to the concept of reversible computing, adopting an expanded view…
This tutorial introduces the fundamental principles and algorithms underlying parallel/distributed discrete event simulation (PDES)…
[Pub 150] http://www.acm-sigsim-pads.org/
Reverse computation is presented here as an important future direction in addressing the challenge o…
In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, …
Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications…
In modeling and simulating complex systems such as mobile ad-hoc networks (MANETs) in defense communications, …
Romdhanne’s thesis online
The algorithmic and implementation principles are explored in gainfully exploiting GPU accelerators in conjunction with multicore processors on high-end systems…
Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications…
SIESTA is a parallel three-dimensional plasma equilibrium code capable of resolving magnetic islands…
Few books comprehensively cover the software and programming aspects of reversible computing. Fillin…
With the advent of virtual machine (VM)-based platforms for parallel computing, it is now possible to execute parallel discrete event simulations (PDES)…
Consider a system of N identical hard spherical particles moving in a d-dimensional box and undergoi…
Consider a system of N identical hard spherical particles moving in a d-dimensional box and undergoing elastic, possibly multi-particle, collisions…
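The core kernel of event-driven hard-sphere simulation is the pairwise collision-time prediction, obtained by solving |Δx + Δv·t| = σ for the earliest positive root; a minimal sketch follows, independent of the paper's multi-particle treatment.

```python
# Pairwise collision-time prediction for hard spheres: solve
# |dx + dv*t| = sigma (sum of radii) for the earliest future contact.
import math

def collision_time(x1, v1, x2, v2, sigma):
    """Earliest t > 0 at which the two spheres touch, or None."""
    dx = [a - b for a, b in zip(x1, x2)]
    dv = [a - b for a, b in zip(v1, v2)]
    b = sum(a * c for a, c in zip(dx, dv))
    if b >= 0:
        return None                  # not approaching: no collision
    dv2 = sum(a * a for a in dv)
    dx2 = sum(a * a for a in dx)
    disc = b * b - dv2 * (dx2 - sigma * sigma)
    if disc < 0:
        return None                  # paths never come within sigma
    return (-b - math.sqrt(disc)) / dv2

# Two unit-radius spheres (sigma = 2) approaching head-on along x:
print(collision_time((0, 0, 0), (1, 0, 0), (5, 0, 0), (-1, 0, 0), 2.0))
# -> 1.5: they start 5 apart, close at rate 2, and touch at gap 2.
```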
Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special struc…
To keep up with the increasing number of processing elements in parallel/distributed computing, traditional tightly-coupled time-stepped models…
In complex phenomena such as epidemiological outbreaks, the intensity of inherent feedback effects a…
This tutorial introduces the typical hardware and software characteristics of extant and emerging supercomputing systems…
In prior work (Yoginath and Perumalla, 2011; Yoginath, Perumalla and Henz, 2012), the motivation, challenges and issues were articulated in favor of virtual time ordering of virtual machines…
SIESTA is capable of computing three-dimensional plasma equilibria with magnetic islands at high spatial resolutions for toroidally confined plasmas…
We report the results of a scaling effort that increases both the speed and resolution of the SIESTA…
The next generation of scalable network simulators employ virtual machines (VMs) to act as high-fidelity models of traffic producer/consumer nodes…
In large-scale scenarios, transportation modeling and simulation is severely constrained by simulati…
Virtual machine (VM)-based simulation is a method used by network simulators to incorporate realistic application behaviors by executing actual VMs as high-fidelity surrogates for simulated end-hosts…
Parallel discrete event simulation (PDES) represents a class of codes that are challenging to scale to large number of processors…
[Pub 128] http://www.aps.org/meetings/meeting.cfm?name=DPP11
MUPI is a parallel discrete event simulator designed for enabling software-based experimentation via…
[Pub 136] http://science.energy.gov/ascr/ascac/meetings/nov-2011/
Radio signal strength estimation is essential in many applications, including the design of military…
Future electric grid technology is envisioned on the notion of a smart grid in which responsive end-…
µπ is a scalable, transparent system for experimenting with the execution of parallel progr…
Parallelizing a domain-specific production code with thousands of lines, developed over several years, is a daunting task…
A block tri-diagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that…
Automation is useful to facilitate reverse code generation from normal code. Here, we describe our …
An effective latency-hiding mechanism is presented in the parallelization of agent-based model simul…
In a variety of complex networked systems, simulation is a powerful method to capture critical feedb…
Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature …
The spatial scale, runtime speed, and behavioral detail of epidemic outbreak simulations altogether …
Ultra-scale supercomputing hardware is a reality, reaching peta-scale recently and now moving to exa…
In large-scale scenarios, transportation modeling and simulation is severely constrained by simulati…
Traditional modeling methodologies, such as those based on rule-based agent modeling, are exhibiting…
The study of human social behavioral systems is finding renewed interest in military, homeland secur…
Unique facets confronted by current cyber security analysis efforts are: tremendous pace of change o…
A methodology and its associated algorithms are presented for mapping a novel, field-based vehicular…
Graphical processing units (GPUs) are now established as efficient, alternative computing platforms …
We present a perfectly reversible method for bi-directional generation of samples from computational…
Vehicular traffic simulations are useful in applications such as emergency planning and traffic mana…
Radio signal strength estimation is essential in many applications, including the design of military…
The recent emergence of dramatically large computational power, spanning desktops with multi-core pr…
The recent emergence of dramatically large computational power, spanning desktops with multi-core pr…
Agent-based simulation has been both a large area of study and a widely used tool for scientific research in past years…
Modern supercomputers use thousands of processors running in parallel to achieve their high computat…
Currently, state-saving is employed in many large simulations to realize rollback. Reverse computati…
A detailed introduction to the design, implementation, and use of network simulation tools is presen…
Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is ha…
We describe an approach and our experiences in applying federated simulation techniques to create la…
[Pub 63] ftp://ftp.cc.gatech.edu/pub/tech_reports/1994/GIT-CC-94-44.ps.Z
ISBN 978-1439873403, 1st Edition
Chapman and Hall/CRC
This is a seminal book on reversible computing. Collecting scattered knowledge into one coherent account, this book provides a compendium of both classical and recently developed results on reversible computing. It offers an expanded view of the field that includes the traditional energy-motivated hardware viewpoint as well as the emerging application-motivated software approach. It explores up-and-coming theories, techniques, and tools for the application of reversible computing. The topics covered span several areas of computer science, including high-performance computing, parallel/distributed systems, computational theory, compilers, power-aware computing, and supercomputing.
ISBN 978-1598291100
Morgan & Claypool
A detailed introduction to the design, implementation, and use of network simulation tools is presented. The requirements and issues faced in the design of simulators for wired and wireless networks are discussed. Abstractions such as packet- and fluid-level network models are covered. Several existing simulations are given as examples, with details and rationales regarding design decisions presented. Issues regarding performance and scalability are discussed in detail, describing how one can utilize distributed simulation methods to increase the scale and performance of a simulation environment. Finally, a case study of two simulation tools is presented that have been developed using distributed simulation techniques. This text is essential to any student, researcher, or network architect desiring a detailed understanding of how network simulation tools are designed, implemented, and used.