Abstract
Prior work has explored the use of defensive cyber deception to manipulate the information available to attackers and to proactively misinform on behalf of both real and decoy systems. Such approaches can provide advantages to defenders by detecting inadvertent attacker interactions with decoy systems, by delaying attacker forward progress, by decreasing or eliminating attacker payoffs in multi-round interactions, and by predicting and interfering with (or incentivizing) likely attacker actions (probe, attack, and walk-away). In this work, we extend our prior model by examining the ability of a defender to learn an attacker’s preferences through observations of their interactions with targeted systems. Knowledge of an attacker’s preferences can be used to guide defensive systems, particularly those which present deceptive features to an attacker. Prior work did not distinguish between targets other than as real or decoy, and it modeled an attacker’s behavior only as it related to their costs for probing or attacking defended systems. While that model was able to predict an attacker’s likelihood of continuing their interactions or walking away from the game, it did not inform a defender as to an attacker’s likely future actions as expressed through preferences for various defended systems. In this paper, we first present a theoretical model in which lower and upper bounds on the number of observations needed for a defender to learn an attacker’s preferences are expressed. We then present empirical results in the form of simulated interactions between an attacker with fixed preferences and a learning defender. Lastly, we discuss how these bounds can be used to inform an adaptive deceptive defense in which a defender can leverage their knowledge of attacker preferences to interfere more effectively with an attacker’s future actions.
References
[1] Herley, C.: Unfalsifiability of security claims. Proc. Natl. Acad. Sci. 113(23), 6415–6420 (2016)
[2] Shi, Z.R., et al.: Learning and planning in the feature deception problem. In: Decision and Game Theory for Security, College Park, MD (2020)
[3] Haghtalab, N., Fang, F., Nguyen, T.H., Sinha, A., Procaccia, A.D., Tambe, M.: Three strategies to success: learning adversary models in security games. In: International Joint Conference on Artificial Intelligence (IJCAI) (2016)
[4] Luce, R.D.: Individual Choice Behavior: A Theoretical Analysis. Courier Corporation (2005)
[5] McFadden, D.L.: Quantal choice analysis: a survey. Ann. Econ. Soc. Meas. 5(4), 363–390 (1976)
[6] Nguyen, T., Yang, R., Azaria, A., Kraus, S., Tambe, M.: Analyzing the effectiveness of adversary modeling in security games. In: Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI), pp. 718–724 (2013)
[7] Abbasi, Y., et al.: Know your adversary: insights for a better adversarial behavioral model. In: CogSci (2016)
[8] Rudelson, M., Vershynin, R.: Smallest singular value of a random rectangular matrix. Commun. Pure Appl. Math. 62(12), 1707–1739 (2009)
[9] Pawlick, J., Colbert, E., Zhu, Q.: A game-theoretic taxonomy and survey of defensive deception for cybersecurity and privacy. ACM Comput. Surv. 52(4), 1–28 (2019)
[10] Mairh, A., Barik, D., Verma, K., Jena, D.: Honeypot in network security: a survey. In: Proceedings of the 2011 International Conference on Communication, Computing and Security, pp. 600–605 (2011)
[11] Xu, H., Tran-Thanh, L., Jennings, N.R.: Playing repeated security games with no prior knowledge. In: Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems, pp. 104–112 (2016)
[12] Balcan, M.F., Blum, A., Haghtalab, N., Procaccia, A.D.: Commitment without regrets: online learning in Stackelberg security games. In: Proceedings of the Sixteenth ACM Conference on Economics and Computation, pp. 61–78 (2015)
[13] Heckman, K.E., Stech, F.J., Thomas, R.K., Schmoker, B., Tsow, A.W.: Cyber Denial, Deception and Counter Deception: A Framework for Supporting Active Cyber Defense. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-319-25133-2
[14] Rowe, N.C., Rrushi, J.: Introduction to Cyberdeception. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-319-41187-3
[15] Ferguson-Walter, K.J., Major, M.M., Johnson, C.K., Muhleman, D.H.: Examining the efficacy of decoy-based and psychological cyber deception. In: USENIX Security Symposium (2021)
[16] Cranford, E.A., Gonzalez, C., Aggarwal, P., Cooney, S., Tambe, M., Lebiere, C.: Toward personalized deceptive signaling for cyber defense using cognitive models. Top. Cogn. Sci. 12(3), 992–1011 (2020)
[17] Mandiant: M-Trends (2020). https://content.fireeye.com/m-trends/rpt-m-trends-2020. Accessed 20 July 2021
Acknowledgement
This work was partially funded by Cyber Technologies, C5ISREW Directorate, Office of the Under Secretary of Defense Research and Engineering as well as the Laboratory for Advanced Cybersecurity Research.
Appendices
A Proof of Lemma 4
The first statement follows from a standard concentration inequality. In particular, if a variable \(X \sim \mathcal {N}(\log m,\frac{1}{2}) = \mathcal {N}(\mu , \sigma ^2)\) it follows that
which implies \(\text {Pr}( X \geqslant \log m + \log m) \leqslant \frac{1}{2 \pi \log m} \exp (-(\log m)^2/2) = \frac{1}{2 \pi \log m} m^{-\frac{\log m}{2}} < m^{-\frac{\log m}{2}}\) so that by symmetry \(\text {Pr}( X<0) \leqslant \frac{1}{2 \pi \log m} \exp (-(\log m)^2/2) < m^{-\log m/2}\). Therefore, applying a union bound, this implies \(\text {Pr}(a_{i,j} < 0)\) for any i, j is at most
which implies the desired result.
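The tail estimate used for the first statement can be checked numerically. The following is a minimal sketch and not part of the original proof: it computes the exact value of \(\text {Pr}(X < 0)\) for \(X \sim \mathcal {N}(\log m, \frac{1}{2})\) via the complementary error function and compares it with the bound \(m^{-\frac{\log m}{2}}\). The function names `prob_negative` and `tail_bound` are hypothetical and chosen only for this illustration.

```python
import math


def prob_negative(m: float) -> float:
    """Exact Pr(X < 0) for X ~ N(log m, 1/2)."""
    mu = math.log(m)
    sigma = math.sqrt(0.5)
    # Pr(X < 0) = Phi(-mu / sigma) = 0.5 * erfc(mu / (sigma * sqrt(2)))
    return 0.5 * math.erfc(mu / (sigma * math.sqrt(2.0)))


def tail_bound(m: float) -> float:
    """The bound m^(-log(m)/2) = exp(-(log m)^2 / 2) quoted in the proof."""
    return math.exp(-0.5 * math.log(m) ** 2)


if __name__ == "__main__":
    # Print the exact probability next to the bound for a few values of m.
    for m in (2, 10, 100, 1000):
        print(f"m={m:5d}  exact={prob_negative(m):.3e}  bound={tail_bound(m):.3e}")
```

For every value of \(m\) tried, the exact probability sits below the bound, which is consistent with the inequality stated for a single entry; the union bound over all entries is not reproduced here.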
For the second statement in the claim, first note that the matrix \(A_1 - A_2\) consists of entries from the standard normal distribution \(\mathcal {N}(0,1)\), since each entry is the difference of two independent entries \(\sim \mathcal {N}(\log m, \frac{1}{2})\), whose means cancel and whose variances add. It is well known that the smallest singular value of a random \(\mathcal {N}(0,1)\) matrix satisfies \(\text {Pr}(\sigma _{\min } \leqslant \sqrt{m \log m} - \sqrt{m} - t) \leqslant \exp (-t^2/2)\) for \(t > 0\) (see [8] for instance), which implies
which implies the second statement of the claim.
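As a worked instance of the singular-value bound quoted from [8], assume the choice \(t = \log m\). This choice is an assumption made here for illustration; it matches the estimate \(\sigma _{\min } \geqslant \sqrt{m \log m} - \sqrt{m} - \log m\) used at the end of the proof of Lemma 5. Substituting it gives

\[
\text {Pr}\left( \sigma _{\min } \leqslant \sqrt{m \log m} - \sqrt{m} - \log m \right) \leqslant \exp \left( -\frac{(\log m)^2}{2} \right) = m^{-\frac{\log m}{2}},
\]

so under this choice the failure probability decays faster than any fixed polynomial in \(m\).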
B Proof of Lemma 5
Proof
Define \(\mathbf {b}\in \mathbb {R}^{N-1}\) so that
Define \(A^{(1)}, A^{(2)}\) so that
Then it follows from (1) and (2) that if \(\widehat{P} = P\), we can write
where the entries of \(A^{(1)}, A^{(2)}\) are \(\sim \mathcal {N}(\log m, \frac{1}{2})\). By Claim 4, the smallest singular value of \(A^{(1)} - A^{(2)}\) is greater than zero with high probability, so \(A^{(1)} - A^{(2)}\) contains m linearly independent rows. Let \(\mathcal {I}=\{k_{i_1}, k_{i_2}, \ldots , k_{i_m}\}\) denote the indices of these linearly independent rows. Assuming \(\widehat{P} = P\), we can solve for \(\mathbf {w}\) as follows
Then, using the same logic for the case where \(\widehat{P}\) may not be equal to P, we have
where
Since \((1-\zeta ) P_i< \widehat{P}_i < (1+\zeta ) P_i\), it follows that \(\Vert \widetilde{\mathbf {b}} \Vert _2 \leqslant \sqrt{m \log \frac{1+\zeta }{1-\zeta }}\) and so \(\Vert \mathbf {w}- \widetilde{\mathbf {w}} \Vert _2 \leqslant \frac{1}{\sigma _{\min }} \sqrt{m \log \frac{1+\zeta }{1-\zeta }}\). Since \(\sigma _{\min } \geqslant \sqrt{m \log m} - \sqrt{m} - \log m\) with high probability, the result follows.
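To get a feel for the size of this bound, the sketch below evaluates the final expression for given \(m\) and \(\zeta \), substituting the high-probability lower bound on \(\sigma _{\min }\) for \(\sigma _{\min }\) itself. It is an illustration under that substitution, not code from the paper, and the function names `sigma_min_lower_bound` and `weight_error_bound` are hypothetical.

```python
import math


def sigma_min_lower_bound(m: int) -> float:
    """High-probability lower bound sqrt(m log m) - sqrt(m) - log m on sigma_min."""
    return math.sqrt(m * math.log(m)) - math.sqrt(m) - math.log(m)


def weight_error_bound(m: int, zeta: float) -> float:
    """Evaluate sqrt(m * log((1 + zeta) / (1 - zeta))) / sigma_min,
    with sigma_min replaced by its lower bound (an assumption of this sketch)."""
    if not 0.0 <= zeta < 1.0:
        raise ValueError("zeta must lie in [0, 1)")
    s = sigma_min_lower_bound(m)
    if s <= 0.0:
        # The estimate is only informative once the lower bound is positive.
        raise ValueError("m is too small for a positive singular-value bound")
    return math.sqrt(m * math.log((1.0 + zeta) / (1.0 - zeta))) / s


if __name__ == "__main__":
    for zeta in (0.01, 0.05, 0.1):
        print(f"m=100  zeta={zeta:.2f}  bound={weight_error_bound(100, zeta):.4f}")
```

The guard on small \(m\) reflects that \(\sqrt{m \log m} - \sqrt{m} - \log m\) is negative for small \(m\) (for example \(m = 10\)), in which case the expression gives no usable estimate.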
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bilinski, M. et al. (2021). No Time to Lie: Bounds on the Learning Rate of a Defender for Inferring Attacker Target Preferences. In: Bošanský, B., Gonzalez, C., Rass, S., Sinha, A. (eds) Decision and Game Theory for Security. GameSec 2021. Lecture Notes in Computer Science(), vol 13061. Springer, Cham. https://doi.org/10.1007/978-3-030-90370-1_8
DOI: https://doi.org/10.1007/978-3-030-90370-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90369-5
Online ISBN: 978-3-030-90370-1
eBook Packages: Computer Science, Computer Science (R0)