No Time to Lie: Bounds on the Learning Rate of a Defender for Inferring Attacker Target Preferences

Bilinski, Mark; diVita, Joseph; Ferguson-Walter, Kimberly; Fugate, Sunny; Gabrys, Ryan; Mauger, Justin; Souza, Brian

doi:10.1007/978-3-030-90370-1_8

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 13061)

Included in the following conference series:

International Conference on Decision and Game Theory for Security


Abstract

Prior work has explored the use of defensive cyber deception to manipulate the information available to attackers and to proactively misinform on behalf of both real and decoy systems. Such approaches can provide advantages to defenders by detecting inadvertent attacker interactions with decoy systems, by delaying attacker forward progress, by decreasing or eliminating attacker payoffs in multi-round interactions, and by predicting and interfering with (or incentivizing) likely attacker actions (probe, attack, and walk-away). In this work, we extend our prior model by examining the ability of a defender to learn an attacker’s preferences through observations of their interactions with targeted systems. Knowledge of an attacker’s preferences can be used to guide defensive systems, particularly those which present deceptive features to an attacker. Prior work did not distinguish between targets other than as real or decoy and only modeled an attacker’s behaviors as they related to their costs for probing or attacking defended systems. While this was able to predict an attacker’s likelihood of continuing their interactions or walking away from the game, it did not inform a defender as to an attacker’s likely future actions as expressed through preferences for various defended systems. In this paper, we first present a theoretical model in which lower and upper bounds on the number of observations needed for a defender to learn an attacker’s preferences are expressed. We then present empirical results in the form of simulated interactions between an attacker with fixed preferences and a learning defender. Lastly, we discuss how these bounds can be used to inform an adaptive deceptive defense in which a defender can leverage their knowledge of attacker preferences to interfere more effectively with an attacker’s future actions.


References

1. Herley, C.: Unfalsifiability of security claims. Proc. Natl. Acad. Sci. 113(23), 6415–6420 (2016)
2. Shi, Z.R., et al.: Learning and planning in the feature deception problem. In: Decision and Game Theory for Security, College Park, MD (2020)
3. Haghtalab, N., Fang, F., Nguyen, T.H., Sinha, A., Procaccia, A.D., Tambe, M.: Three strategies to success: learning adversary models in security games. In: International Joint Conference on Artificial Intelligence (IJCAI) (2016)
4. Luce, R.D.: Individual Choice Behavior: A Theoretical Analysis, Courier Corporation (2005)
5. McFadden, D.L.: Quantal choice analysis: a survey. Ann. Econ. Soc. Meas. 5(4), 363–390 (1976)
6. Nguyen, T., Yang, R., Azaria, A., Kraus, S., Tambe, M.: Analyzing the effectiveness of adversary modeling in security games. In: Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI), pp. 718–724 (2013)
7. Abbasi, Y., et al.: Know your adversary: insights for a better adversarial behavioral model. In: CogSci (2016)
8. Rudelson, M., Vershynin, R.: Smallest singular value of a random rectangular matrix. Commun. Pure Appl. Math. 62(12), 1707–1739 (2009)
9. Pawlick, J., Colbert, E., Zhu, Q.: A game-theoretic taxonomy and survey of defensive deception for cybersecurity and privacy. ACM Comput. Surv. 52(4), 1–28 (2019)
10. Mairh, A., Barik, D., Verma, K., Jena, D.: Honeypot in network security: a survey. In: Proceedings of the 2011 International Conference on Communication, Computing and Security, pp. 600–605 (2011)
11. Xu, H., Tran-Thanh, L., Jennings, N.R.: Playing repeated security games with no prior knowledge. In: Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems, pp. 104–112 (2016)
12. Balcan, M.F., Blum, A., Haghtalab, N., Procaccia, A.D.: Commitment without regrets: online learning in Stackelberg security games. In: Proceedings of the Sixteenth ACM Conference on Economics and Computation, pp. 61–78 (2015)
13. Heckman, K.E., Stech, F.J., Thomas, R.K., Schmoker, B., Tsow, A.W.: Cyber Denial, Deception and Counter Deception: A Framework for Supporting Active Cyber Defense. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-319-25133-2
14. Rowe, N.C., Rrushi, J.: Introduction to Cyberdeception. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-319-41187-3
15. Ferguson-Walter, K.J., Major, M.M., Johnson, C.K., Muhleman, D.H.: Examining the efficacy of decoy-based and psychological cyber deception. In: USENIX Security Symposium (2021)
16. Cranford, E.A., Gonzalez, C., Aggarwal, P., Cooney, S., Tambe, M., Lebiere, C.: Toward personalized deceptive signaling for cyber defense using cognitive models. Top. Cogn. Sci. 12(3), 992–1011 (2020)
17. Mandiant: M-Trends (2020). https://content.fireeye.com/m-trends/rpt-m-trends-2020. Accessed 20 July 2021

Acknowledgement

This work was partially funded by Cyber Technologies, C5ISREW Directorate, Office of the Under Secretary of Defense Research and Engineering as well as the Laboratory for Advanced Cybersecurity Research.

Author information

Authors and Affiliations

Naval Information Warfare Center Pacific, San Diego, CA, USA
Mark Bilinski, Joseph diVita, Sunny Fugate, Ryan Gabrys, Justin Mauger & Brian Souza
Laboratory for Advanced Cybersecurity Research, Laurel, MD, USA
Kimberly Ferguson-Walter


Corresponding author

Correspondence to Ryan Gabrys.

Editor information

Editors and Affiliations

Czech Technical University, Prague, Czech Republic
Branislav Bošanský
Carnegie Mellon University, Pittsburgh, PA, USA
Cleotilde Gonzalez
Johannes Kepler University Linz, Linz, Austria
Stefan Rass
Singapore Management University, Singapore, Singapore
Arunesh Sinha

Appendices

A Proof of Lemma 4

The first statement follows from a standard concentration inequality. In particular, if $X \sim \mathcal {N}(\log m,\frac{1}{2}) = \mathcal {N}(\mu , \sigma ^2)$, then for $x > 0$

$$\begin{aligned} \text {Pr}( X \geqslant \mu + \sigma x) \leqslant \frac{1}{\sqrt{2 \pi }\, x} \exp (-x^2/2), \end{aligned}$$

which implies $\text {Pr}( X \geqslant \log m + \log m) \leqslant \frac{1}{\sqrt{2 \pi } \log m} \exp (-(\log m)^2/2) = \frac{1}{\sqrt{2 \pi } \log m} m^{-\frac{\log m}{2}} < m^{-\frac{\log m}{2}}$, so that by symmetry $\text {Pr}( X<0) \leqslant \frac{1}{\sqrt{2 \pi } \log m} \exp (-(\log m)^2/2) < m^{-\frac{\log m}{2}}$. Therefore, by a union bound, the probability that $a_{i,j} < 0$ for any $i, j$ is at most

$$ m^{-\frac{\log m}{2}} m^2 \log m, $$

which implies the desired result.
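As a rough numerical companion to the first statement, the sketch below compares the exact probability that a single $\mathcal{N}(\log m, \frac{1}{2})$ entry is negative with the per-entry bound $m^{-\frac{\log m}{2}}$ and with the union bound over $m^2 \log m$ entries. The function names are hypothetical and chosen only for illustration; the exact tail uses the identity $\text{Pr}(X<0) = \frac{1}{2}\,\mathrm{erfc}(\log m)$, which holds because $\sigma\sqrt{2} = 1$ when $\sigma^2 = \frac{1}{2}$.

```python
import math


def prob_negative_entry(m: int) -> float:
    """Exact Pr(X < 0) for X ~ N(log m, 1/2).

    Pr(X < 0) = Phi(-mu / sigma) = 0.5 * erfc(mu / (sigma * sqrt(2))),
    and sigma * sqrt(2) = 1 when sigma^2 = 1/2.
    """
    return 0.5 * math.erfc(math.log(m))


def per_entry_bound(m: int) -> float:
    """The per-entry bound m^(-log(m)/2) used in the proof."""
    return m ** (-math.log(m) / 2)


def union_bound(m: int) -> float:
    """Union bound over m^2 * log(m) entries, as written in the proof."""
    return per_entry_bound(m) * m**2 * math.log(m)


if __name__ == "__main__":
    for m in (10, 100, 1000, 10000):
        print(m, prob_negative_entry(m), per_entry_bound(m), union_bound(m))
```

For small $m$ the union bound can exceed 1 and is then uninformative; it shrinks quickly as $m$ grows, which is the regime the high-probability statement addresses.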

For the second statement in the claim, first note that the entries of the matrix $A_1 - A_2$ follow the standard normal distribution $\mathcal {N}(0,1)$, since each entry is the difference of two elements $\sim \mathcal {N}(\log m, \frac{1}{2})$. It is well known that the smallest singular value of a random $\mathcal {N}(0,1)$ matrix satisfies $\text {Pr}(\sigma _{\min } \leqslant \sqrt{m \log m} - \sqrt{m} - t) \leqslant \exp (-t^2/2)$ for $t > 0$ (see [8] for instance), which implies

$$ \text {Pr}(\sigma _{\min } \leqslant \sqrt{m \log m} - \sqrt{m} - \log m) \leqslant \exp \big ( -(\log m)^2/2 \big ), $$

which implies the second statement of the claim.
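The second statement can be probed empirically with a short NumPy sketch. It rests on assumptions not stated explicitly in this appendix: the matrices are taken to have $\lceil m \log m \rceil$ rows and $m$ columns (inferred from the form of the bound), and the entries of $A_1$ and $A_2$ are drawn independently. The function name is hypothetical.

```python
import math

import numpy as np


def smallest_singular_value_check(m: int, seed: int = 0):
    """Return (sigma_min, bound) for one random draw of A_1 - A_2.

    Assumptions (not from the source): the matrices are
    ceil(m * log m) x m, and A_1, A_2 have independent entries.
    """
    rng = np.random.default_rng(seed)
    rows = math.ceil(m * math.log(m))
    std = math.sqrt(0.5)  # variance 1/2

    a1 = rng.normal(math.log(m), std, size=(rows, m))
    a2 = rng.normal(math.log(m), std, size=(rows, m))

    # Entries of the difference are N(0, 1) under the independence assumption.
    sigma_min = np.linalg.svd(a1 - a2, compute_uv=False).min()
    bound = math.sqrt(m * math.log(m)) - math.sqrt(m) - math.log(m)
    return sigma_min, bound


if __name__ == "__main__":
    for m in (50, 100, 200):
        sigma_min, bound = smallest_singular_value_check(m)
        print(f"m={m}: sigma_min={sigma_min:.3f}, bound={bound:.3f}")
```

A single draw per $m$ is only a sanity check, not evidence for the probability statement itself; repeating over many seeds would give an empirical failure rate to compare against $\exp(-(\log m)^2/2)$.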

B Proof of Lemma 5

Proof

Define $\mathbf {b}\in \mathbb {R}^{N-1}$ so that

$$\begin{aligned} \mathbf {b}= \begin{bmatrix} \log (\widehat{P}_{1}/\widehat{P}_{2}) \\ \log (\widehat{P}_{2}/\widehat{P}_{3}) \\ \vdots \\ \log (\widehat{P}_{{N-1}}/\widehat{P}_{{N}}) \end{bmatrix}. \end{aligned}$$

Define $A^{(1)}, A^{(2)}$ so that

$$\begin{aligned} A^{(1)} = \begin{bmatrix} \mathbf {x}^{(1)} \\ \mathbf {x}^{(2)} \\ \vdots \\ \mathbf {x}^{(N-1)} \end{bmatrix}, \ \ A^{(2)} = \begin{bmatrix} \mathbf {x}^{(2)} \\ \mathbf {x}^{(3)} \\ \vdots \\ \mathbf {x}^{(N)} \end{bmatrix}. \ \ \end{aligned}$$

Then it follows from (1) and (2) that if $\widehat{P} = P$, we can write

$$\begin{aligned} \Big ( A^{(1)} - A^{(2)} \Big ) \mathbf {w}= \mathbf {b}, \end{aligned}$$

where the entries of $A^{(1)}, A^{(2)}$ are $\sim \mathcal {N}(\log m, \frac{1}{2})$. Note that according to Claim 4, since with high probability the smallest singular value of $A^{(1)} - A^{(2)}$ is greater than zero, it follows that there are m linearly independent rows in $A^{(1)} - A^{(2)}$. Let $\mathcal {I}=\{k_{i_1}, k_{i_2}, \ldots , k_{i_m}\}$ denote the indices of these linearly independent rows. Assuming $\widehat{P} = P$, we can solve for $\mathbf {w}$ as follows

$$\begin{aligned} \mathbf {w}= \Big ( A^{(1)} - A^{(2)} \Big )^{-1}_{\mathcal {I}} \cdot \mathbf {b}_{\mathcal {I}}. \end{aligned}$$

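Then, using the same logic for the case where $\widehat{P}$ may not be equal to $P$, let $\widetilde{\mathbf{b}}$ and $\widetilde{\mathbf{w}}$ denote the vector and solution obtained from the estimates $\widehat{P}$, and let $\mathbf{b}$ and $\mathbf{w}$ denote those obtained from the true $P$. We have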

$$\begin{aligned} || \mathbf {w}- \widetilde{\mathbf {w}} ||_2 = \Big | \Big | \big (A^{(1)} - A^{(2)}\big )_{\mathcal {I}}^{-1} \Big ( \widetilde{\mathbf {b}}_{\mathcal {I}} - {\mathbf {b}}_{\mathcal {I}} \Big ) \Big | \Big |_2 \leqslant \Big | \Big | \big (A^{(1)} - A^{(2)}\big )_{\mathcal {I}}^{-1} \Big | \Big |_2 \Big | \Big | \widetilde{\mathbf {b}}_{\mathcal {I}} - {\mathbf {b}}_{\mathcal {I}} \Big | \Big |_2, \end{aligned}$$

where

$$\begin{aligned} \widetilde{\mathbf {b}}_{\mathcal {I}} - \mathbf {b}_{\mathcal {I}} = \begin{bmatrix} \log \Big ( \frac{\widehat{P}_{k_{i_1}} P_{k_{i_1}+1}}{\widehat{P}_{k_{i_1}+1} P_{k_{i_1}}} \Big ) \\ \log \Big ( \frac{\widehat{P}_{k_{i_2}} P_{k_{i_2}+1}}{\widehat{P}_{k_{i_2}+1} P_{k_{i_2}}} \Big ) \\ \vdots \\ \log \Big ( \frac{\widehat{P}_{k_{i_{m}}} P_{k_{i_{m}}+1}}{\widehat{P}_{k_{i_{m}}+1} P_{k_{i_{m}}}} \Big ) \end{bmatrix}. \end{aligned}$$

Since $(1-\zeta ) P_i< \widehat{P}_i < (1+\zeta ) P_i$, it follows that $| | \widetilde{\mathbf {b}}_{\mathcal {I}} - \mathbf {b}_{\mathcal {I}} | |_2 \leqslant \sqrt{m \log \frac{1+\zeta }{1-\zeta }}$ and so $|| \mathbf {w}- \widetilde{\mathbf {w}} ||_2 \leqslant \frac{1}{\sigma _{\min }} \sqrt{m \log \frac{1+\zeta }{1-\zeta }}$. Since $\sigma _{\min } \geqslant \sqrt{m \log m} - \sqrt{m} - \log m$ with high probability, the result follows.
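The argument above can be illustrated with a small simulation. Equations (1) and (2) are not reproduced in this excerpt, so the sketch assumes a logit-style choice model $P_i \propto \exp(\mathbf{x}^{(i)} \cdot \mathbf{w})$, under which $\log(P_i / P_{i+1}) = (\mathbf{x}^{(i)} - \mathbf{x}^{(i+1)}) \cdot \mathbf{w}$. It also swaps the selection of $m$ linearly independent rows for a least-squares solve over all $N-1$ rows, which satisfies the same kind of inequality with the smallest singular value of the full difference matrix. All variable names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, N, zeta = 5, 40, 0.05  # features, targets, relative estimation error (illustrative)

# Feature vectors with N(log m, 1/2) entries, one row per target.
X = rng.normal(np.log(m), np.sqrt(0.5), size=(N, m))
w_true = rng.normal(size=m)  # hypothetical attacker preference vector

# Assumed choice model: P_i proportional to exp(x_i . w).
scores = X @ w_true
P = np.exp(scores - scores.max())
P /= P.sum()

# Rows x^(i) - x^(i+1), i.e. A^(1) - A^(2).
D = X[:-1] - X[1:]


def log_ratios(p):
    """Vector of log(p_i / p_{i+1}) for consecutive targets."""
    return np.log(p[:-1] / p[1:])


def estimate_w(p):
    """Least-squares solve of D w = b (a swap for inverting m chosen rows)."""
    w, *_ = np.linalg.lstsq(D, log_ratios(p), rcond=None)
    return w


w_exact = estimate_w(P)  # estimate when P_hat equals P

# Perturbed estimates with (1 - zeta) P_i < P_hat_i < (1 + zeta) P_i.
P_hat = P * (1 + zeta * rng.uniform(-1, 1, size=N))
w_noisy = estimate_w(P_hat)

sigma_min = np.linalg.svd(D, compute_uv=False).min()
error = np.linalg.norm(w_noisy - w_exact)
bound = np.linalg.norm(log_ratios(P_hat) - log_ratios(P)) / sigma_min

print("recovery error:", error)
print("perturbation bound:", bound)
```

Because the least-squares estimate is linear in $\mathbf{b}$, the difference between the two estimates equals the pseudoinverse of the difference matrix applied to $\widetilde{\mathbf{b}} - \mathbf{b}$, so the printed error should not exceed the printed bound.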


Cite this paper

Bilinski, M. et al. (2021). No Time to Lie: Bounds on the Learning Rate of a Defender for Inferring Attacker Target Preferences. In: Bošanský, B., Gonzalez, C., Rass, S., Sinha, A. (eds) Decision and Game Theory for Security. GameSec 2021. Lecture Notes in Computer Science, vol 13061. Springer, Cham. https://doi.org/10.1007/978-3-030-90370-1_8


DOI: https://doi.org/10.1007/978-3-030-90370-1_8
Published: 31 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90369-5
Online ISBN: 978-3-030-90370-1
eBook Packages: Computer Science, Computer Science (R0)
