Online Learning Methods for Controlling Dynamic Cyber Deception Strategies

Gutierrez, Marcus; Kiekintveld, Christopher

doi:10.1007/978-3-030-33432-1_11

Abstract

Cyber deception is an important tool for many aspects of cyber defense, including detecting and learning about attackers, as well as mitigating the effectiveness of reconnaissance and some types of attacks. To make cyber deception an even more effective tool, we need better methods to automatically reason about how to use specific deception techniques strategically, taking into account the costs and benefits, as well as how to adapt these strategies over time based on changes to the network or the threat environment. The principles of moving target defense and game theoretic models have made significant advances in this area, but are typically limited in being able to adapt to new and specific threats. Here we consider methods based on online learning that are able to adapt defensive deception strategies over time based on interactions with attackers, and which can handle novel threats such as zero-day attacks. We introduce as an example a formal model of how these methods can be used to deploy honeypots for the purpose of detecting exploits, and present results from simulations using this model. We also present results from a second study with human participants showing that humans have a very difficult time learning to play against similar adaptive deception strategies. This shows the value of considering adaptive learning models as a complement to game theory for strategic cyber deception.

Notes

1.
This work was first presented at the Artificial Intelligence for Cyber Security workshop in San Francisco, CA in 2017 [15].
2.
A complete description of this work can be found in the proceedings of the 41st CogSci conference [14].
3.
This could easily be generalized to include non-binary features, but it is not necessary for our purposes.
4.
Sections 2 and 3 are updated excerpts from separate works. Where symbol definitions conflict, please read the mathematical symbols in each of these sections in isolation.
5.
We assume \(v_{i} > c^{a}_{i}\) and \(\sum _{i \in N} c^{d}_{i} > D\).

References

Alpcan, T., Başar, T.: Network security: A decision and game-theoretic approach. Cambridge University Press (2010)
Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine learning 47(2–3), 235–256 (2002)
Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: Gambling in a rigged casino: The adversarial multi-armed bandit problem. In: Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on, pp. 322–331. IEEE (1995)
Ben-Asher, N., Gonzalez, C.: Effects of cyber security knowledge on attack detection. Computers in Human Behavior 48, 51–61 (2015)
Bilge, L., Dumitras, T.: Before we knew it: an empirical study of zero-day attacks in the real world. In: Proceedings of the 2012 ACM conference on Computer and communications security, pp. 833–844. ACM (2012)
Bringer, M.L., Chelmecki, C.A., Fujinoki, H.: A survey: Recent advances and future trends in honeypot research. International Journal of Computer Network and Information Security 4(10), 63 (2012)
Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. arXiv preprint arXiv:1204.5721 (2012)
Buhrmester, M., Kwang, T., Gosling, S.D.: Amazon’s mechanical turk: A new source of inexpensive, yet high-quality, data? Perspectives on psychological science 6(1), 3–5 (2011)
Carroll, T.E., Grosu, D.: A game theoretic investigation of deception in network security. Security and Communication Networks 4(10), 1162–1172 (2011)
Du, M., Li, Y., Lu, Q., Wang, K.: Bayesian game based pseudo honeypot model in social networks. In: International Conference on Cloud Computing and Security, pp. 62–71. Springer (2017)
Frei, S., May, M., Fiedler, U., Plattner, B.: Large-scale vulnerability analysis. In: Proceedings of the 2006 SIGCOMM workshop on Large-scale attack defense, pp. 131–138. ACM (2006)
Gai, Y., Krishnamachari, B., Jain, R.: Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations. IEEE/ACM Transactions on Networking (TON) 20(5), 1466–1478 (2012)
Garg, N., Grosu, D.: Deception in honeynets: A game-theoretic analysis. In: 2007 IEEE SMC Information Assurance and Security Workshop, pp. 107–113. IEEE (2007)
Gutierrez, M., Černý, J., Ben-Asher, N., Aharonov, E., Bošanský, B., Kiekintveld, C., Gonzalez, C.: Evaluating models of human adversarial behavior against defense algorithms in a contextual multi-armed bandit task. In: 41st Annual Meeting of the Cognitive Science Society (CogSci 2019), Montreal, QC (2019 (in press))
Gutierrez, M.P., Kiekintveld, C.: Adapting honeypot configurations to detect evolving exploits. In: Workshops at the Thirty-First AAAI Conference on Artificial Intelligence (2017)
Kiekintveld, C., Lisỳ, V., Píbil, R.: Game-theoretic foundations for the strategic use of honeypots in network security. In: Cyber Warfare, pp. 81–101. Springer (2015)
Klíma, R., Lisỳ, V., Kiekintveld, C.: Combining online learning and equilibrium computation in security games. In: International Conference on Decision and Game Theory for Security, pp. 130–149. Springer (2015)
La, Q.D., Quek, T.Q., Lee, J., Jin, S., Zhu, H.: Deceptive attack and defense game in honeypot-enabled networks for the internet of things. IEEE Internet of Things Journal 3(6), 1025–1035 (2016)
Laszka, A., Vorobeychik, Y., Koutsoukos, X.D.: Optimal personalized filtering against spear-phishing attacks. In: AAAI (2015)
Luo, T., Xu, Z., Jin, X., Jia, Y., Ouyang, X.: Iotcandyjar: Towards an intelligent-interaction honeypot for iot devices. Black Hat (2017)
Mairh, A., Barik, D., Verma, K., Jena, D.: Honeypot in network security: a survey. In: Proceedings of the 2011 international conference on communication, computing & security, pp. 600–605. ACM (2011)
McQueen, M.A., McQueen, T.A., Boyer, W.F., Chaffin, M.R.: Empirical estimates and observations of 0day vulnerabilities. In: System Sciences, 2009. HICSS’09. 42nd Hawaii International Conference on, pp. 1–12. IEEE (2009)
Mell, P., Kent, K.A., Romanosky, S.: The common vulnerability scoring system (CVSS) and its applicability to federal agency systems. Citeseer (2007)
Nawrocki, M., Wählisch, M., Schmidt, T.C., Keil, C., Schönfelder, J.: A survey on honeypot software and data analysis. arXiv preprint arXiv:1608.06249 (2016)
Pauna, A., Iacob, A.C., Bica, I.: Qrassh-a self-adaptive ssh honeypot driven by q-learning. In: 2018 international conference on communications (COMM), pp. 441–446. IEEE (2018)
Píbil, R., Lisỳ, V., Kiekintveld, C., Bošanskỳ, B., Pěchouček, M.: Game theoretic model of strategic honeypot selection in computer networks. In: International Conference on Decision and Game Theory for Security, pp. 201–220. Springer (2012)
Provos, N.: Honeyd-a virtual honeypot daemon. In: 10th DFN-CERT Workshop, Hamburg, Germany, vol. 2, p. 4 (2003)
Rowe, N.C., Custy, E.J., Duong, B.T.: Defending cyberspace with fake honeypots. JOURNAL OF COMPUTERS 2(2), 25 (2007)
Sagha, H., Shouraki, S.B., Khasteh, H., Dehghani, M.: Real-time ids using reinforcement learning. In: 2008 Second International Symposium on Intelligent Information Technology Application, vol. 2, pp. 593–597. IEEE (2008)
Schlenker, A., Thakoor, O., Xu, H., Fang, F., Tambe, M., Tran-Thanh, L., Vayanos, P., Vorobeychik, Y.: Deceiving cyber adversaries: A game theoretic approach. In: AAMAS (2018). http://dl.acm.org/citation.cfm?id=3237383.3237833
Schlenker, A., Xu, H., Guirguis, M., Kiekintveld, C., Sinha, A., Tambe, M., Sonya, S.Y., Balderas, D., Dunstatter, N.: Don’t bury your head in warnings: A game-theoretic approach for intelligent allocation of cyber-security alerts. In: IJCAI, pp. 381–387 (2017)
Serra, E., Jajodia, S., Pugliese, A., Rullo, A., Subrahmanian, V.: Pareto-optimal adversarial defense of enterprise systems. ACM Transactions on Information and System Security (TISSEC) 17(3), 11 (2015)
Servin, A., Kudenko, D.: Multi-agent reinforcement learning for intrusion detection. In: Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, pp. 211–223. Springer (2005)
Servin, A., Kudenko, D.: Multi-agent reinforcement learning for intrusion detection: A case study and evaluation. In: German Conference on Multiagent System Technologies, pp. 159–170. Springer (2008)
Shi, L., Zhao, J., Jiang, L., Xing, W., Gong, J., Liu, X.: Game theoretic simulation on the mimicry honeypot. Wuhan University Journal of Natural Sciences 21(1), 69–74 (2016)
Spitzner, L.: Honeypots: tracking hackers, vol. 1. Addison-Wesley Reading (2003)
Sutton, R.S., Barto, A.G.: Reinforcement learning: An introduction. MIT press (2018)
Tsikerdekis, M., Zeadally, S., Schlesener, A., Sklavos, N.: Approaches for preventing honeypot detection and compromise. In: 2018 Global Information Infrastructure and Networking Symposium (GIIS), pp. 1–6. IEEE (2018)
Venkatesan, S., Albanese, M., Shah, A., Ganesan, R., Jajodia, S.: Detecting stealthy botnets in a resource-constrained environment using reinforcement learning. In: MTD@ CCS, pp. 75–85 (2017)
Wagener, G., Dulaunoy, A., Engel, T., et al.: Self adaptive high interaction honeypots driven by game theory. In: Symposium on Self-Stabilizing Systems, pp. 741–755. Springer (2009)
Wagener, G., State, R., Engel, T., Dulaunoy, A.: Adaptive and self-configurable honeypots. In: 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011) and Workshops, pp. 345–352. IEEE (2011)
Wang, K., Du, M., Maharjan, S., Sun, Y.: Strategic honeypot game model for distributed denial of service attacks in the smart grid. IEEE Transactions on Smart Grid 8(5), 2474–2482 (2017)
Wang, W., Zeng, B.: A two-stage deception game for network defense. In: Decision and Game Theory for Security (2018)
Xu, X., Xie, T.: A reinforcement learning approach for host-based intrusion detection using sequences of system calls. In: International Conference on Intelligent Computing, pp. 995–1003. Springer (2005)

Acknowledgements

The work found in Sect. 2 first appeared in the Artificial Intelligence for Cyber Security workshop held at the 31st AAAI Conference on Artificial Intelligence in San Francisco, CA in 2017 [15].

At the time of writing, the work found in Sect. 3 is in press for the 41st Annual Meeting of the Cognitive Science Society (2019), to be held in Montreal, Canada [14]. The authors would like to thank Jakub Černý, Palvi Aggarwal, Noam Ben-Asher, Efrat Aharonov, Branislav Bošanský, Orsolya Kovacs, and Cleotilde Gonzalez for their contributions to this work.

Author information

Authors and Affiliations

The University of Texas at El Paso, El Paso, TX, USA
Marcus Gutierrez & Christopher Kiekintveld

Corresponding author

Correspondence to Christopher Kiekintveld.

Editor information

Editors and Affiliations

Center for Secure Information Systems, George Mason University, Fairfax, VA, USA
Sushil Jajodia
Thayer School of Engineering, Dartmouth College, Hanover, NH, USA
George Cybenko
Department of Computer Science, Dartmouth College, Hanover, NH, USA
V.S. Subrahmanian
MS T310, MITRE Corporation, McLean, VA, USA
Vipin Swarup
Computing and Information Science Division, Army Research Office, Durham, NC, USA
Cliff Wang
Computer Science & Engineering, University of Michigan, Ann Arbor, MI, USA
Michael Wellman

About this chapter

Cite this chapter

Gutierrez, M., Kiekintveld, C. (2020). Online Learning Methods for Controlling Dynamic Cyber Deception Strategies. In: Jajodia, S., Cybenko, G., Subrahmanian, V., Swarup, V., Wang, C., Wellman, M. (eds) Adaptive Autonomous Secure Cyber Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-33432-1_11

DOI: https://doi.org/10.1007/978-3-030-33432-1_11
Published: 05 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33431-4
Online ISBN: 978-3-030-33432-1
eBook Packages: Computer Science, Computer Science (R0)
