Smart Semi-Supervised Accumulation of Large Repositories for Industrial Control Systems Device Information

Abstract

Industrial Control Systems (ICS) device manufacturers frequently add new features to improve product performance. These changes are often vendor-driven, and customers may not be aware of their full impact on cybersecurity posture. In the energy sector, this can create considerable dissonance between vendor-provided cybersecurity claims and a customer’s responsibility for Operational Technology (OT) cybersecurity compliance. The resulting dynamic verification burden shifts to the customer and may pose a significant cybersecurity risk to the energy sector. We found very limited research into cybersecurity auditing for OT, yet a solution is needed for vetting vendor-supplied feature claims and their adherence to cybersecurity requirements and standards. We are presently developing such a system. This paper presents one vital aspect of that effort: an end-to-end framework to accumulate a large repository of ICS device information for the vetting system, curate the dataset, and conduct extensive processing. The framework uses web scraping, data analytics, and Natural Language Processing (NLP) techniques to identify vendor websites, automate the collection of website-accessible documents, and automatically derive metadata from them to identify product documents relevant to the repository. We have found that this automated approach to vendor identification, document extraction into a product repository, and NLP pre-processing is unique and has not been…
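The document-collection and metadata steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the set of document extensions, and the filename-token heuristic are all assumptions made for illustration, and the example works on an HTML string already in hand rather than fetching live pages.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: file types treated as "product documents" in this sketch.
DOC_EXTENSIONS = {".pdf", ".doc", ".docx"}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_document_links(page_url, html):
    """Return absolute, de-duplicated URLs of linked documents on a page."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.links:
        absolute = urljoin(page_url, href)
        ext = posixpath.splitext(urlparse(absolute).path)[1].lower()
        if ext in DOC_EXTENSIONS and absolute not in found:
            found.append(absolute)
    return found


def derive_metadata(url):
    """Derive simple metadata from a document URL (hypothetical heuristic)."""
    parsed = urlparse(url)
    name, ext = posixpath.splitext(posixpath.basename(parsed.path))
    tokens = [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]
    return {
        "vendor_host": parsed.netloc,
        "file_type": ext.lstrip(".").lower(),
        "tokens": tokens,
    }
```

A real pipeline would add page fetching, crawl politeness, and text-based NLP over the document contents. The sketch covers only link discovery and a filename-level metadata pass.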

Kalyan Perumalla

As a Federal Program Manager in Advanced Scientific Computing Research at the U.S. Department of Energy, Office of Science, Kalyan Perumalla manages a $100-million R&D portfolio covering AI, HPC, Quantum, SciDAC, and Basic Computer Science. In his 25 years of R&D leadership, he previously spent 17 years leading advanced R&D as a Distinguished Research Staff Member at the Oak Ridge National Laboratory (ORNL), developing scalable software and applications on the world’s largest supercomputers, including as a line manager and a founding group leader. He has held senior faculty and adjunct appointments at UTK, GT, and UNL, and was an IAS Fellow at Durham University.
