Industrial Control System (ICS) device manufacturers frequently add new features to improve product performance. These changes are often vendor-driven, and customers may not be aware of the full impact of the new capabilities on their cybersecurity posture. In the energy sector, this can lead to considerable dissonance between vendor-provided cybersecurity claims and a customer’s responsibility for Operational Technology (OT) cybersecurity compliance. The resulting dynamic verification burden thus shifts to the customer and may pose a significant cybersecurity risk to the energy sector. We found very limited research into cybersecurity auditing for OT, yet a solution is needed for vetting vendor-supplied feature claims and their adherence to cybersecurity requirements and standards. We are currently developing such a system. This paper presents one vital aspect of this effort by proposing an end-to-end framework to accumulate a large repository of ICS device information for this vetting system, curate the dataset, and conduct extensive processing. The framework is designed to use web scraping, data analytics, and Natural Language Processing (NLP) techniques to identify vendor websites, automate the collection of website-accessible documents, and automatically derive metadata from them to identify product documents relevant to the repository. We have found that this automated approach to vendor identification, document extraction into a product repository, and NLP pre-processing is unique and has not been…