Abstract
Stealthy attacks on control systems are designed to go unnoticed, which makes them a severe threat to critical infrastructure such as power systems, smart grids, and vehicular networks. This paper investigates a subset of stealthy attacks known as zero-dynamics-based stealthy attacks. While previous work on zero-dynamics attacks has highlighted the need for highly accurate knowledge of the system's state-space model to generate attack signals, our approach requires no such model. We propose a deep-reinforcement-learning-based attacker that generates attack signals without prior knowledge of the system's state space. We develop several attackers and detectors iteratively until neither the attacker nor the detectors improve further. In addition, we show that the reinforcement-learning-based attacker successfully executes an attack in the same manner as the theoretical attacker described in the prior literature.
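The iterative attacker-detector scheme summarized above can be sketched as a simple alternating loop. The outline below is illustrative only: the function names `train_attacker` and `train_detector` and the convergence test on the evasion rate are hypothetical stand-ins, not details from the paper.

```python
def iterate_until_converged(train_attacker, train_detector, max_rounds=10, tol=1e-3):
    """Alternately train an attacker against all detectors so far, then a
    detector against all attackers so far; stop when the attacker's evasion
    rate no longer changes meaningfully (hypothetical stopping criterion)."""
    attackers, detectors = [], []
    prev_evasion = None
    for _ in range(max_rounds):
        attacker, evasion = train_attacker(detectors)        # attacker vs. detector pool
        detector = train_detector(attackers + [attacker])    # detector vs. attacker pool
        attackers.append(attacker)
        detectors.append(detector)
        if prev_evasion is not None and abs(evasion - prev_evasion) < tol:
            break  # neither side improved meaningfully
        prev_evasion = evasion
    return attackers, detectors
```

Each round enlarges both pools, so later detectors are trained against every attacker produced so far, and vice versa.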
References
Alabugin, S.K., Sokolov, A.N.: Applying of generative adversarial networks for anomaly detection in industrial control systems. In: 2020 Global Smart Industry Conference (GloSIC), pp. 199–203. IEEE (2020)
Anderson, B.D.: Output-nulling invariant and controllability subspaces. IFAC Proc. Vol. 8(1), 337–345 (1975)
Aoufi, S., Derhab, A., Guerroumi, M.: Survey of false data injection in smart power grid: attacks, countermeasures and challenges. J. Inf. Secur. Appl. 54, 102518 (2020)
Defense Use Case: Analysis of the cyber attack on the Ukrainian power grid. Electr. Inf. Shar. Anal. Center (E-ISAC) 388, 1–29 (2016)
Chen, Y., Huang, S., Liu, F., Wang, Z., Sun, X.: Evaluation of reinforcement learning-based false data injection attack to automatic voltage control. IEEE Trans. Smart Grid 10(2), 2158–2169 (2018)
Dash, P., Karimibiuki, M., Pattabiraman, K.: Out of control: stealthy attacks against robotic vehicles protected by control-based techniques. In: Proceedings of the 35th Annual Computer Security Applications Conference, pp. 660–672 (2019)
Deng, R., Xiao, G., Lu, R., Liang, H., Vasilakos, A.V.: False data injection on state estimation in power systems – attacks, impacts, and defense: a survey. IEEE Trans. Industr. Inf. 13(2), 411–423 (2016)
Duan, J., et al.: Deep-reinforcement-learning-based autonomous voltage control for power grid operations. IEEE Trans. Power Syst. 35(1), 814–817 (2019)
Feng, C., Li, T., Zhu, Z., Chana, D.: A deep learning-based framework for conducting stealthy attacks in industrial control systems. arXiv preprint arXiv:1709.06397 (2017)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: International Conference on Machine Learning, pp. 1861–1870. PMLR (2018)
Harshbarger, S.: The impact of zero-dynamics stealthy attacks on control systems: stealthy attack success probability and attack prevention (2022). https://krex.k-state.edu/dspace/handle/2097/42853
Harshbarger, S., Hosseinzadehtaher, M., Natarajan, B., Vasserman, E., Shadmand, M., Amariucai, G.: (A little) ignorance is bliss: The effect of imperfect model information on stealthy attacks in power grids. In: 2020 IEEE Kansas Power and Energy Conference (KPEC), pp. 1–6. IEEE (2020)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Johansson, K.H.: The quadruple-tank process: a multivariable laboratory process with an adjustable zero. IEEE Trans. Control Syst. Technol. 8(3), 456–465 (2000)
Kim, S., Park, K.J.: A survey on machine-learning based security design for cyber-physical systems. Appl. Sci. 11(12), 5458 (2021)
Langner, R.: Stuxnet: dissecting a cyberwarfare weapon. IEEE Secur. Privacy 9(3), 49–51 (2011)
Li, C., Qiu, M.: Reinforcement Learning for Cyber-Physical Systems: With Cybersecurity Case Studies. Chapman and Hall/CRC, London (2019)
Liu, Z., Wang, Q., Ye, Y., Tang, Y.: A GAN-based data injection attack method on data-driven strategies in power systems. IEEE Trans. Smart Grid 13(4), 3203–3213 (2022)
Sayghe, A., Zhao, J., Konstantinou, C.: Evasion attacks with adversarial deep learning against power system state estimation. In: 2020 IEEE Power & Energy Society General Meeting (PESGM), pp. 1–5. IEEE (2020)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Sutton, R.S., McAllester, D., Singh, S., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Advances in Neural Information Processing Systems 12 (1999)
Teixeira, A., Pérez, D., Sandberg, H., Johansson, K.H.: Attack models and scenarios for networked control systems. In: Proceedings of the 1st International Conference on High Confidence Networked Systems, pp. 55–64 (2012)
Teixeira, A., Shames, I., Sandberg, H., Johansson, K.H.: Revealing stealthy attacks in control systems. In: 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1806–1813. IEEE (2012)
Teixeira, A., Sou, K.C., Sandberg, H., Johansson, K.H.: Secure control systems: a quantitative risk management approach. IEEE Control Syst. Mag. 35(1), 24–45 (2015)
Zenati, H., Foo, C.S., Lecouat, B., Manek, G., Chandrasekhar, V.R.: Efficient GAN-based anomaly detection. arXiv preprint arXiv:1802.06222 (2018)
Zhang, R., Venkitasubramaniam, P.: Stealthy control signal attacks in linear quadratic Gaussian control systems: detectability reward tradeoff. IEEE Trans. Inf. Forensics Secur. 12(7), 1555–1570 (2017)
Acknowledgement
This publication was made possible by NPRP grant #12C-33905-SP-165 from the Qatar National Research Fund (a member of Qatar Foundation). The findings achieved herein are solely the responsibility of the authors. We appreciate the anonymous reviewers’ valuable suggestions and comments.
Appendices
A Proof for Attack Generation
Since the matrix D is zero, we adapt the solution in [2] as shown below. We need a value of \(F_2\) that satisfies

From this, we can easily obtain

where \(\begin{bmatrix} V & B \end{bmatrix} ^ +\) and \(V^+\) denote the Moore-Penrose pseudo-inverses of the matrices \(\begin{bmatrix} V & B \end{bmatrix}\) and V, respectively.
If we choose \(\begin{bmatrix} F_1\\ F_2 \end{bmatrix}\) to be the right-hand side of (8b), that is,
then, using the property of the pseudo-inverse that \(M^+MM^+=M^+\) for a general matrix M, we see that this choice of \(\begin{bmatrix}F_1\\ F_2\end{bmatrix}\) satisfies (8b). Substituting the same choice into the left-hand side of (8a), and using the property that \(MM^+M=M\) for a general matrix M, we get
the right-hand side of which equals AV (so our choice of \(\begin{bmatrix}F_1\\ F_2\end{bmatrix}\) also satisfies (8a)) whenever \(\begin{bmatrix} V & B\end{bmatrix}\) has linearly independent rows. What remains is to set \(F=-F_2\).
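This construction can be checked numerically. The sketch below is not code from the paper: the dimensions are arbitrary, and it assumes that (8a) is the invariance condition \(VF_1 + BF_2 = AV\), inferred from the surrounding text. It forms the stacked solution \([F_1; F_2] = [V\ B]^+ AV\) and verifies that it satisfies the condition when \([V\ B]\) has linearly independent rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 4, 2, 3                       # state dim, columns of V, input dim
A = rng.standard_normal((n, n))
V = rng.standard_normal((n, k))
B = rng.standard_normal((n, m))

VB = np.hstack([V, B])                  # n x (k+m); full row rank almost surely
F = np.linalg.pinv(VB) @ (A @ V)        # stacked [F1; F2] via the pseudo-inverse
F1, F2 = F[:k], F[k:]

# Because VB has linearly independent rows, VB @ pinv(VB) = I, so the
# choice satisfies V F1 + B F2 = A V.
assert np.allclose(V @ F1 + B @ F2, A @ V)
Ffb = -F2                               # finally, set F = -F2
```

Note that linear independence of the rows of \([V\ B]\) is exactly what makes \([V\ B][V\ B]^+\) the identity, which is why the assertion holds.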
B Architectures and Hyperparameters
Table 4 shows the architectures of the neural-network models associated with the attackers: the number of nodes in each layer, the activation functions in the hidden and output layers, and the learning rate and optimizer used to update each model's parameters. Table 5 lists the hyperparameter settings for these models. The target network is updated with a soft-update method whose coefficient is 0.005 for each attacker. The output of each attacker passes through a Tanh activation, so to obtain attack signals of the desired magnitude we scale the output by a constant, given in the same table along with the values of the parameters a and b from the reward function in Sect. 4.2 for each attacker. Finally, Table 6 gives the architectures and hyperparameters of the LSTM detectors; all detectors share the same architecture but differ in their learning rates, as listed in the table.
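The two mechanisms mentioned above can be sketched in a few lines. This is an illustrative, framework-free outline: the soft-update coefficient 0.005 is from the text, while the scaling constant `SCALE` is a hypothetical stand-in for the per-attacker constants in Table 5.

```python
import math

TAU = 0.005          # soft-update coefficient used for each attacker
SCALE = 10.0         # hypothetical output-scaling constant (see Table 5)

def soft_update(target_params, online_params, tau=TAU):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_params, online_params)]

def scaled_action(raw_output, scale=SCALE):
    """Bound the actor output with tanh, then scale it to the attack range."""
    return scale * math.tanh(raw_output)
```

For example, `soft_update([0.0], [1.0])` moves the target parameter only 0.5% of the way toward the online parameter, which keeps the target network slowly varying during training.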
C Attacker Rewards and Performance
Cosine similarity as described in Sect. 3.2.
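Assuming the paper's Sect. 3.2 uses the standard definition of cosine similarity (the inner product of two signals divided by the product of their norms), a minimal implementation looks like this:

```python
import math

def cosine_similarity(u, v):
    """Standard cosine similarity between two equal-length signals."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

A value near 1 indicates the generated attack signal closely matches the reference (e.g., the theoretical zero-dynamics attack) up to positive scaling.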
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Paudel, B., Amariucai, G. (2024). Reinforcement Learning Approach to Generate Zero-Dynamics Attacks on Control Systems Without State Space Models. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. ESORICS 2023. Lecture Notes in Computer Science, vol 14347. Springer, Cham. https://doi.org/10.1007/978-3-031-51482-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51481-4
Online ISBN: 978-3-031-51482-1