OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation

Abstract

Whether testing intrusion detection systems, conducting training exercises, or creating data sets to be used by the broader cybersecurity community, realistic user behavior is a critical component of a cyber range. Existing methods either rely on network-level data or replay recorded user actions to approximate real users in a network. Our work is the first to produce generative models trained on actual user data (sequences of application usage) collected on endpoints. Once trained on a user's behavioral data, these models can generate novel sequences of actions that appear to come from the same distribution as the training data. These sequences of actions are then fed via configuration files to our custom software, which replicates those behaviors on end devices. Notably, our models are platform-agnostic and could generate behavior data for any emulation software package. In this paper, we present our model generation process, software architecture, and an initial evaluation of the fidelity of our models. Our software is currently deployed in a cyber range to help evaluate the efficacy of defensive cyber technologies. We suggest additional ways that the cyber community as a whole can benefit from more realistic user behavior emulation. The data used to train our model, as well as sample configuration files produced by the model, are available at [redacted].

Authors:
Oesch, T [1]; Bridges, Robert [1]; Verma, Miki [1]; Weber, Brian [1]; Diallo, Oumar [1]
  1. ORNL
Publication Date:
2021-08-01
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1813180
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: 14th Cyber Security Experimentation and Test Workshop, Boston, Massachusetts, United States of America, 8/9/2021
Country of Publication:
United States
Language:
English

Citation Formats

Oesch, T, Bridges, Robert, Verma, Miki, Weber, Brian, and Diallo, Oumar. D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation. United States: N. p., 2021. Web. doi:10.1145/3474718.3475718.
Oesch, T, Bridges, Robert, Verma, Miki, Weber, Brian, & Diallo, Oumar. D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation. United States. https://doi.org/10.1145/3474718.3475718
Oesch, T, Bridges, Robert, Verma, Miki, Weber, Brian, and Diallo, Oumar. 2021. "D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation". United States. https://doi.org/10.1145/3474718.3475718. https://www.osti.gov/servlets/purl/1813180.
@article{osti_1813180,
title = {D2U: Data Driven User Emulation for the Enhancement of Cyber Testing, Training, and Data Set Generation},
author = {Oesch, T and Bridges, Robert and Verma, Miki and Weber, Brian and Diallo, Oumar},
abstractNote = {Whether testing intrusion detection systems, conducting training exercises, or creating data sets to be used by the broader cybersecurity community, realistic user behavior is a critical component of a cyber range. Existing methods either rely on network-level data or replay recorded user actions to approximate real users in a network. Our work is the first to produce generative models trained on actual user data (sequences of application usage) collected on endpoints. Once trained on a user's behavioral data, these models can generate novel sequences of actions that appear to come from the same distribution as the training data. These sequences of actions are then fed via configuration files to our custom software, which replicates those behaviors on end devices. Notably, our models are platform-agnostic and could generate behavior data for any emulation software package. In this paper, we present our model generation process, software architecture, and an initial evaluation of the fidelity of our models. Our software is currently deployed in a cyber range to help evaluate the efficacy of defensive cyber technologies. We suggest additional ways that the cyber community as a whole can benefit from more realistic user behavior emulation. The data used to train our model, as well as sample configuration files produced by the model, are available at [redacted].},
doi = {10.1145/3474718.3475718},
url = {https://www.osti.gov/biblio/1813180},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2021},
month = aug
}

