Research Article | Public Access

Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning

Published: 25 July 2016

Abstract

An important component of the cyber-defense mechanism is adequate staffing of its cybersecurity analyst workforce and the optimal assignment of analysts to sensors for investigating the dynamic alert traffic. The ever-increasing cybersecurity threats faced by today's digital systems require a strong cyber-defense mechanism that is both reactive, in responding to and mitigating known risks, and proactive, in being prepared to handle unknown risks. To handle unknown risks proactively, this workforce must be scheduled dynamically so that the system can adapt to the day-to-day stochastic demands on its workforce (in both size and expertise mix). These stochastic demands stem from the varying rates at which alerts are generated and deemed significant, which creates uncertainty for the scheduler attempting to assign analysts to work and allocate sensors to analysts. Sensor data are analyzed by automatic processing systems, which generate alerts; a portion of these alerts is categorized as significant and requires thorough examination by a cybersecurity analyst. Risk, in this article, is defined as the percentage of significant alerts that are not thoroughly analyzed by analysts. To minimize risk, the cyber-defense system must accurately estimate the future significant-alert generation rate and dynamically schedule its workforce to meet the stochastic workload demand. The article presents a reinforcement learning-based stochastic dynamic programming optimization model that incorporates these estimates of future alert rates and responds by dynamically scheduling cybersecurity analysts to minimize risk (i.e., maximize significant-alert coverage by analysts) and keep risk under a predetermined upper bound. The article tests the dynamic optimization model and compares the results to an integer programming model that optimizes static staffing needs based on a daily-average alert generation rate with no estimation of future alert rates (the static workforce model). Results indicate that over a finite planning horizon, the learning-based optimization model, by using a dynamic (on-call) workforce in addition to the static workforce, (a) balances risk across days and reduces overall risk better than the static model, (b) is scalable and identifies the quantity and the right mix of analyst expertise in an organization, and (c) determines the dynamic (on-call) schedule and the sensor-to-analyst allocation needed to keep risk below a given upper bound. Several meta-principles derived from the optimization model are presented, which serve as guiding principles for hiring and scheduling cybersecurity analysts. Days-off scheduling was performed to determine weekly analyst work schedules that meet the cybersecurity system's workforce constraints and requirements.
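
The abstract describes the approach only at a high level. The sketch below is not from the article; it illustrates one way a reinforcement-learning scheduler of this kind could be set up, assuming a simple tabular Q-learning agent whose state is a discretized forecast of the daily significant-alert rate, whose action is the number of on-call analysts added to a static workforce, and whose reward penalizes the article's notion of risk (the fraction of significant alerts not thoroughly analyzed), with an extra penalty when risk exceeds the upper bound. All parameter values (alerts per analyst per day, workforce sizes, the 10% risk bound) are hypothetical.

    import random
    from collections import defaultdict

    # Illustrative, hypothetical parameters -- not taken from the article.
    ALERTS_PER_ANALYST_PER_DAY = 20   # significant alerts one analyst can thoroughly examine per day
    STATIC_ANALYSTS = 5               # baseline (static) workforce size
    MAX_ON_CALL = 5                   # maximum on-call analysts that can be added on any day
    RISK_UPPER_BOUND = 0.10           # desired cap on daily risk
    ALERT_LEVELS = [80, 120, 160]     # discretized forecasts of significant alerts per day

    def risk(significant_alerts, analysts):
        """Risk as defined in the article: fraction of significant alerts not thoroughly analyzed."""
        capacity = analysts * ALERTS_PER_ANALYST_PER_DAY
        unanalyzed = max(significant_alerts - capacity, 0)
        return unanalyzed / significant_alerts

    def reward(day_risk, on_call):
        """Penalize risk (heavily when it exceeds the bound) plus a small on-call staffing cost."""
        penalty = day_risk + (1.0 if day_risk > RISK_UPPER_BOUND else 0.0)
        return -(penalty + 0.01 * on_call)

    Q = defaultdict(float)            # Q[(forecast_level, on_call_action)]
    alpha, gamma, epsilon = 0.1, 0.9, 0.1

    for episode in range(5000):                           # each episode is one planning week
        level = random.choice(ALERT_LEVELS)               # forecast for the first day
        for day in range(7):
            # Epsilon-greedy choice of how many on-call analysts to schedule today.
            if random.random() < epsilon:
                action = random.randint(0, MAX_ON_CALL)
            else:
                action = max(range(MAX_ON_CALL + 1), key=lambda a: Q[(level, a)])
            alerts = max(random.gauss(level, 10), 1)      # realized stochastic demand
            r = reward(risk(alerts, STATIC_ANALYSTS + action), action)
            next_level = random.choice(ALERT_LEVELS)      # next day's forecast
            best_next = max(Q[(next_level, a)] for a in range(MAX_ON_CALL + 1))
            Q[(level, action)] += alpha * (r + gamma * best_next - Q[(level, action)])
            level = next_level

    # Greedy policy readout: on-call analysts to schedule for each forecast level.
    for level in ALERT_LEVELS:
        best = max(range(MAX_ON_CALL + 1), key=lambda a: Q[(level, a)])
        print(f"forecast of {level} significant alerts/day -> schedule {best} on-call analysts")

A greedy readout of the learned Q-table then gives, for each forecast level, how many on-call analysts to schedule. The article's full model additionally handles analyst expertise mix, sensor-to-analyst allocation, and days-off scheduling, which this sketch omits.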

Supplementary Material

a4-ganesan-apndx.pdf (ganesan.zip)
Supplemental movie, appendix, image, and software files for "Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning"


Published In

ACM Transactions on Intelligent Systems and Technology, Volume 8, Issue 1
January 2017
363 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2973184
Editor: Yu Zheng
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 July 2016
Accepted: 01 January 2016
Revised: 01 November 2015
Received: 01 September 2015
Published in TIST Volume 8, Issue 1

Author Tags

  1. Cybersecurity analysts
  2. dynamic scheduling
  3. genetic algorithm
  4. integer programming
  5. optimization
  6. reinforcement learning
  7. resource allocation
  8. risk mitigation

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Army Research Office
  • Office of Naval Research

Cited By

  • (2024) Applied Machine Learning for Information Security. Digital Threats: Research and Practice 5:1, 1-5. https://doi.org/10.1145/3652029. Online publication date: 11-Mar-2024.
  • (2024) A Machine Learning and Optimization Framework for Efficient Alert Management in a Cybersecurity Operations Center. Digital Threats: Research and Practice 5:2, 1-23. https://doi.org/10.1145/3644393. Online publication date: 5-Feb-2024.
  • (2024) A sequential deep learning framework for a robust and resilient network intrusion detection system. Computers & Security 144, 103928. https://doi.org/10.1016/j.cose.2024.103928. Online publication date: Sep-2024.
  • (2024) RRIoT. Computers and Security 140:C. https://doi.org/10.1016/j.cose.2024.103786. Online publication date: 1-May-2024.
  • (2024) When should shelf stocking be done at night? A workforce management optimization approach for retailers. Computers & Industrial Engineering 190, 110025. https://doi.org/10.1016/j.cie.2024.110025. Online publication date: Apr-2024.
  • (2023) Principled data-driven decision support for cyber-forensic investigations. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, 5010-5017. https://doi.org/10.1609/aaai.v37i4.25628. Online publication date: 7-Feb-2023.
  • (2023) A stochastic bi-objective cybersecurity analyst scheduling problem with preferential days off and upskilling decisions. Computers and Industrial Engineering 183:C. https://doi.org/10.1016/j.cie.2023.109551. Online publication date: 17-Oct-2023.
  • (2023) Military and Security Applications: Cybersecurity. Encyclopedia of Optimization, 1-10. https://doi.org/10.1007/978-3-030-54621-2_761-1. Online publication date: 16-Mar-2023.
  • (2022) Research and Challenges of Reinforcement Learning in Cyber Defense Decision-Making for Intranet Security. Algorithms 15:4, 134. https://doi.org/10.3390/a15040134. Online publication date: 18-Apr-2022.
  • (2022) Hierarchical Multi-agent Model for Reinforced Medical Resource Allocation with Imperfect Information. ACM Transactions on Intelligent Systems and Technology 14:1, 1-27. https://doi.org/10.1145/3552436. Online publication date: 9-Nov-2022.
