Statistical Word Analysis to support the Semiautomatic Implementation of the NIST 800-53 Cybersecurity Framework

Conference

2024 ASEE North East Section

Location

Fairfield, Connecticut

Publication Date

April 19, 2024

Start Date

April 19, 2024

End Date

April 20, 2024

Page Count

11

DOI

10.18260/1-2--45783

Permanent URL

https://peer.asee.org/45783

Paper Authors

Mirco Speretta, Fairfield University

Rohan Sahu is a senior at Westhill High School in Stamford, Connecticut. He began learning about statistical word analysis based on TF-IDF in the fall of 2021, as a sophomore. He implemented the technique from scratch in Java and applied it to the NIST Risk Management Framework.

Dr. Mirco Speretta is the Director of the Cybersecurity Programs at Fairfield University. Before this role, he spent 10 years as a director of technical engineering, acting as a security incident manager for mobile websites and applications. His early research focused on developing semiautomatic techniques to build ontologies and on creating user profiles that improve search results.

Abstract

Cybersecurity frameworks such as NIST, CIS, and ISO include collections of families and controls that recommend security policies to organizations. They play a critical role in mitigating the risk of cyberattacks and breaches. Because families and controls are selected manually, implementing these frameworks is resource-intensive and time-consuming. This project addresses the challenge by investigating the feasibility of partially automating the selection of families. In this study, we developed a Java application that applies statistical techniques such as TF-IDF and cosine similarity to the families of the NIST cybersecurity framework. The framework is split into distinct corpora of tokens, one per family. Each corpus includes all the controls for a given family and is reduced to the list of tokens most representative of that family. We evaluated how accurately the corpora represent the framework using both a qualitative and a quantitative approach. Given the positive results of our tests, we believe this approach could substantially help semi-automate the selection of controls within a family. This would reduce the resources and cost needed to implement cybersecurity frameworks while increasing the accuracy and consistency of the selection process.
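To make the technique named in the abstract concrete, the following is a minimal Java sketch of TF-IDF weighting and cosine similarity applied to one token corpus per family. It is an illustration under stated assumptions, not the authors' implementation: the class name `TfIdfSketch`, every method name, the abbreviated family texts, and the query sentence are hypothetical, and the smoothed IDF formula `ln((1 + N) / (1 + df)) + 1` is one common variant chosen here for convenience. The paper may use a different weighting, tokenizer, or stop-word list.

```java
import java.util.*;

// Hypothetical sketch: TF-IDF vectors per family corpus, compared with cosine similarity.
class TfIdfSketch {

    // Lowercase and split on non-letters. A fuller pipeline would likely also drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Term frequency: occurrences of each token divided by the document length.
    static Map<String, Double> termFrequency(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        for (String t : tokens) tf.merge(t, 1.0, Double::sum);
        tf.replaceAll((term, count) -> count / tokens.size());
        return tf;
    }

    // Smoothed inverse document frequency (assumed variant): ln((1 + N) / (1 + df)) + 1.
    static Map<String, Double> inverseDocumentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String t : new HashSet<>(doc)) df.merge(t, 1, Integer::sum);
        }
        Map<String, Double> idf = new HashMap<>();
        int n = docs.size();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            idf.put(e.getKey(), Math.log((1.0 + n) / (1.0 + e.getValue())) + 1.0);
        }
        return idf;
    }

    // TF-IDF vector for one document; terms never seen in the corpora get weight 0.
    static Map<String, Double> tfIdf(List<String> tokens, Map<String, Double> idf) {
        Map<String, Double> vec = termFrequency(tokens);
        vec.replaceAll((term, tf) -> tf * idf.getOrDefault(term, 0.0));
        return vec;
    }

    // Cosine similarity between two sparse vectors; returns 0 if either has zero norm.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // The k highest-weighted tokens, i.e. a reduced "most representative" token list.
    static List<String> topTerms(Map<String, Double> vec, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(vec.entrySet());
        entries.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) top.add(entries.get(i).getKey());
        return top;
    }

    public static void main(String[] args) {
        // Made-up, abbreviated family texts for illustration; not quoted from NIST 800-53.
        String[] names = {"Access Control", "Incident Response"};
        String[] texts = {
            "account management access enforcement remote access session lock privileges",
            "incident handling incident reporting incident response plan incident monitoring"
        };
        List<List<String>> docs = new ArrayList<>();
        for (String t : texts) docs.add(tokenize(t));
        Map<String, Double> idf = inverseDocumentFrequency(docs);

        // Hypothetical organizational requirement to match against each family.
        Map<String, Double> query =
            tfIdf(tokenize("We need a plan for reporting and handling a security incident"), idf);
        for (int i = 0; i < names.length; i++) {
            Map<String, Double> family = tfIdf(docs.get(i), idf);
            System.out.printf("%s: similarity=%.3f, top terms=%s%n",
                names[i], cosine(query, family), topTerms(family, 3));
        }
    }
}
```

In this sketch, the family whose corpus shares the most heavily weighted tokens with the query receives the highest cosine score, which is the kind of ranking a semi-automated family selection step could present to an analyst for review.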

Speretta, M. (2024, April). Statistical Word Analysis to support the Semiautomatic Implementation of the NIST 800-53 Cybersecurity Framework. Paper presented at 2024 ASEE North East Section, Fairfield, Connecticut. 10.18260/1-2--45783

ASEE holds the copyright on this document. It may be read by the public free of charge. Authors may archive their work on personal websites or in institutional repositories with the following citation: © 2024 American Society for Engineering Education. Other scholars may excerpt or quote from these materials with the same citation. When excerpting or quoting from Conference Proceedings, authors should, in addition to noting the ASEE copyright, list all the original authors and their institutions and name the host city of the conference. - Last updated April 1, 2015