
Perspectives from a Comprehensive Evaluation of Reconstruction-based Anomaly Detection in Industrial Control Systems

  • Conference paper

Computer Security – ESORICS 2022 (ESORICS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13556)

Abstract

Industrial control systems (ICS) provide critical functions to society and are enticing attack targets. Machine learning (ML) models—in particular, reconstruction-based ML models—are commonly used to identify attacks during ICS operation. However, the variety of ML model architectures, datasets, metrics, and techniques used in prior work makes broad comparisons and identifying optimal solutions difficult. To assist ICS security practitioners in choosing and configuring the most effective reconstruction-based anomaly detector for their ICS environment, this paper: (1) comprehensively evaluates previously proposed reconstruction-based ICS anomaly-detection approaches, and (2) shows that commonly used metrics for evaluating ML algorithms, like the point-F1 score, are inadequate for evaluating anomaly detection systems for practical use. Among our findings is that the performance of anomaly-detection systems is not closely tied to the choice of ML model architecture or hyperparameters, and that the models proposed in prior work are often larger than necessary. We also show that evaluating ICS anomaly detection over temporal ranges, e.g., with the range-F1 metric, better describes ICS anomaly-detection performance than the commonly used point-F1 metric. These so-called range-based metrics measure objectives more specific to ICS environments, such as reducing false alarms or reducing detection latency. We further show that using range-based metrics to evaluate candidate anomaly detectors leads to different conclusions about what anomaly-detection strategies are optimal.
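To make the abstract's point-F1 vs. range-F1 distinction concrete, the following sketch contrasts the two on hypothetical labels. This is an illustrative simplification, not the paper's exact metric (the paper's range-F1 follows Tatbul et al. [34]); here a true anomaly segment counts as recalled if any predicted point overlaps it.

```python
def point_f1(y_true, y_pred):
    """Point-F1: precision/recall computed over individual timesteps."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum(p and not t for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def segments(y):
    """Contiguous runs of anomalous (1) labels, as (start, end) pairs."""
    segs, start = [], None
    for i, v in enumerate(list(y) + [0]):
        if v and start is None:
            start = i
        elif not v and start is not None:
            segs.append((start, i))
            start = None
    return segs

def range_f1(y_true, y_pred):
    """Simplified range-based F1: each true segment is recalled if any
    predicted segment overlaps it; each predicted segment is precise if
    it overlaps any true segment."""
    ts, ps = segments(y_true), segments(y_pred)
    hit = lambda a, b: a[0] < b[1] and b[0] < a[1]
    rec = sum(any(hit(t, p) for p in ps) for t in ts) / max(len(ts), 1)
    prec = sum(any(hit(p, t) for t in ts) for p in ps) / max(len(ps), 1)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

y_true = [0, 1, 1, 1, 1, 0, 0, 1, 1, 0]   # two attacks
y_pred = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0]   # one brief alarm per attack
```

On this example the detector catches both attacks, so it earns a range-F1 of 1.0, yet its point-F1 is only 0.5: exactly the gap between "how many attack timesteps were flagged" and "how many attacks were detected" that motivates range-based evaluation.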

Notes

  1. https://github.com/pwwl/ics-anomaly-detection.

  2. Autoencoders are a special case since they do not consider a sequence of states (\(h = 0\)), and instead reconstruct the current state \(\mathbf{X'_t}\).

  3. We use the first 30% of the SWaT and WADI test datasets as their corresponding attack validation datasets. We use the final 30% of the BATADAL test dataset as its corresponding attack validation dataset, since the first 30% of the BATADAL test dataset does not contain any attacks.

  4. The recommended SWaT corrections can be found at https://github.com/pwwl/ics-anomaly-detection.

References

  1. Abdelaty, M., Doriguzzi-Corin, R., Siracusa, D.: DAICS: a deep learning solution for anomaly detection in industrial control systems. arXiv:2009.06299 (2020)

  2. Abokifa, A.A., Haddad, K., Lo, C.S., Biswas, P.: Detection of cyber physical attacks on water distribution systems via principal component analysis and artificial neural networks. In: World Environmental and Water Resources Congress (2017)

  3. Adepu, S., Kandasamy, N.K., Mathur, A.: EPIC: an electric power testbed for research and training in cyber physical systems security. In: Katsikas, S.K., et al. (eds.) SECPRE/CyberICPS 2018. LNCS, vol. 11387, pp. 37–52. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12786-2_3

  4. Ahmed, C.M., Palleti, V.R., Mathur, A.P.: WADI: a water distribution testbed for research in the design of secure cyber physical systems. In: 3rd International Workshop on Cyber-Physical Systems for Smart Water Networks (2017)

  5. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011)

  6. Di Pinto, A., Dragoni, Y., Carcano, A.: TRITON: the first ICS cyber attack on safety instrument systems. In: Black Hat USA (2018)

  7. Erba, A., et al.: Constrained concealment attacks against reconstruction-based anomaly detectors in industrial control systems. In: Annual Computer Security Applications Conference (2020)

  8. Feng, C., Palleti, V.R., Mathur, A., Chana, D.: A systematic framework to generate invariants for anomaly detection in industrial control systems. In: Network and Distributed System Security Symposium (2019)

  9. Goh, J., Adepu, S., Tan, M., Lee, Z.S.: Anomaly detection in cyber physical systems using recurrent neural networks. In: 18th International Symposium on High Assurance Systems Engineering (2017)

  10. Goh, J., Adepu, S., Junejo, K.N., Mathur, A.: A dataset to support research in the design of secure water treatment systems. In: Havarneanu, G., Setola, R., Nassopoulos, H., Wolthusen, S. (eds.) CRITIS 2016. LNCS, vol. 10242, pp. 88–99. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71368-7_8

  11. Hasselquist, D., Rawat, A., Gurtov, A.: Trends and detection avoidance of internet-connected industrial control systems. IEEE Access 7, 155504–155512 (2019)

  12. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  14. Hwang, W.S., Yun, J.H., Kim, J., Kim, H.C.: Time-series aware precision and recall for anomaly detection: considering variety of detection result and addressing ambiguous labeling. In: 28th ACM International Conference on Information and Knowledge Management (2019)

  15. Inoue, J., Yamagata, Y., Chen, Y., Poskitt, C.M., Sun, J.: Anomaly detection for a water treatment system using unsupervised machine learning. In: IEEE International Conference on Data Mining Workshops (2017)

  16. Jones, A.T., McLean, C.R.: A proposed hierarchical control model for automated manufacturing systems. J. Manufact. Syst. 5(1), 15–25 (1986)

  17. Kim, J., Yun, J.-H., Kim, H.C.: Anomaly detection for industrial control systems using sequence-to-sequence neural networks. In: Katsikas, S., et al. (eds.) CyberICPS/SECPRE/SPOSE/ADIoT 2019. LNCS, vol. 11980, pp. 3–18. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-42048-2_1

  18. Kravchik, M., Shabtai, A.: Detecting cyber attacks in industrial control systems using convolutional neural networks. In: Workshop on Cyber-Physical Systems Security and Privacy (2018)

  19. Kravchik, M., Shabtai, A.: Efficient cyber attack detection in industrial control systems using lightweight neural networks and PCA. IEEE Trans. Dependable Secure Comput. 19, 2179–2197 (2021)

  20. Kshetri, N., Voas, J.: Hacking power grids: a current problem. Computer 50(12), 91–95 (2017)

  21. Lavin, A., Ahmad, S.: Evaluating real-time anomaly detection algorithms – the Numenta anomaly benchmark. In: 14th International Conference on Machine Learning and Applications (2015)

  22. Li, D., Chen, D., Jin, B., Shi, L., Goh, J., Ng, S.-K.: MAD-GAN: multivariate anomaly detection for time series data with generative adversarial networks. In: Tetko, I.V., Kůrková, V., Karpov, P., Theis, F. (eds.) ICANN 2019. LNCS, vol. 11730, pp. 703–716. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30490-4_56

  23. Lin, Q., Adepu, S., Verwer, S., Mathur, A.: TABOR: a graphical model-based approach for anomaly detection in industrial control systems. In: Asia Conference on Computer and Communications Security (2018)

  24. Morris, T.H., Thornton, Z., Turnipseed, I.: Industrial control system simulation and data logging for intrusion detection system research. In: 7th Annual Southeastern Cyber Security Summit (2015)

  25. Pang, G., Shen, C., Cao, L., Hengel, A.V.D.: Deep learning for anomaly detection: a review. ACM Comput. Surv. (CSUR) 54(2), 1–38 (2021)

  26. Perales Gómez, Á.L., Fernández Maimó, L., Huertas Celdrán, A., García Clemente, F.J.: MADICS: a methodology for anomaly detection in industrial control systems. Symmetry 12(10), 1583 (2020)

  27. Shalyga, D., Filonov, P., Lavrentyev, A.: Anomaly detection for water treatment system based on neural network with automatic architecture optimization. arXiv:1807.07282 (2018)

  28. Shin, H.K., Lee, W., Yun, J.H., Kim, H.: HAI 1.0: HIL-based augmented ICS security dataset. In: 13th USENIX Workshop on Cyber Security Experimentation and Test (2020)

  29. Singh, N., Olinsky, C.: Demystifying Numenta anomaly benchmark. In: International Joint Conference on Neural Networks (2017)

  30. Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 45(4), 427–437 (2009)

  31. Stouffer, K.: Guide to industrial control systems (ICS) security. NIST Special Publication 800(82) (2011)

  32. Taormina, R., Galelli, S.: Deep-learning approach to the detection and localization of cyber-physical attacks on water distribution systems. J. Water Res. Planning Manag. 144(10), 04018065 (2018)

  33. Taormina, R., et al.: Battle of the attack detection algorithms: disclosing cyber attacks on water distribution networks. J. Water Res. Planning Manag. 144(8), 04018048 (2018)

  34. Tatbul, N., Lee, T.J., Zdonik, S., Alam, M., Gottschlich, J.: Precision and recall for time series. In: Advances in Neural Information Processing Systems (2018)

  35. Turrin, F., Erba, A., Tippenhauer, N.O., Conti, M.: A statistical analysis framework for ICS process datasets. In: Joint Workshop on CPS and IoT Security and Privacy (2020)

  36. Ye, D., Zhang, T.Y.: Summation detector for false data-injection attack in cyber-physical systems. IEEE Trans. Cybernetics 50(6), 2338–2345 (2020)

  37. Zizzo, G., Hankin, C., Maffeis, S., Jones, K.: Intrusion detection for industrial control systems: evaluation analysis and adversarial attacks. arXiv:1911.04278 (2019)


Acknowledgment

We thank our shepherd and our anonymous reviewers for their insightful feedback. We also thank Camille Cobb, Trevor Kann, and Brian Singer for helpful comments on prior drafts of this paper. This material is based upon work supported by: the U.S. Army Research Office and the U.S. Army Futures Command under Contract No. W911NF-20-D-0002; DARPA GARD under Cooperative Agreement No. HR00112020006; a DoD National Defense Science and Engineering Graduate fellowship; the Secure and Private IoT initiative at Carnegie Mellon Cylab (IoT@CyLab); and Mitsubishi Heavy Industries through the Carnegie Mellon CyLab partnership program.

Author information

Corresponding author: Clement Fung.


A Key Findings in the Optimization Process


We identified four techniques that enhance the quality and reproducibility of anomaly-detection performance. Table 4 shows which prior works use these techniques; no prior work incorporates all four.

Finding 1c: Techniques such as benign data shuffling, attack cleaning, feature selection, and early stopping increase the quality and reproducibility of results, but are applied inconsistently in prior work.

Table 4. Identifying key pre-processing and model-training techniques from prior ICS anomaly-detection work. Symbols indicate whether each technique was used, partially used, or not used; ‘?’ indicates that we could not determine if the technique was used

Finding #1: Feature Selection. In WADI and SWaT, some benign-labeled test data differs significantly from benign-labeled training data [19, 35]. To address this problem, we use statistical tests to select features for the ML model: following prior work, a modified version of the Kolmogorov-Smirnov test (called K-S*) [19] identifies features with a significant difference between their training and test distributions. This removes 11 features from SWaT and 10 features from WADI, matching the proportion of features removed from these datasets in prior work [19]. We found that feature selection is effective only on the SWaT dataset, so we use it only for SWaT.
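The gist of this selection step can be sketched as follows. This is an illustrative implementation of the plain two-sample K-S statistic, not the modified K-S* variant of [19]; the drift threshold is a hypothetical parameter chosen for the example.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the empirical CDFs of samples a and b."""
    combined = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), combined, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), combined, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def select_features(train, test, threshold=0.25):
    """Keep feature columns whose training and test distributions are
    close (K-S statistic below the threshold); drop drifting features."""
    return [j for j in range(train.shape[1])
            if ks_statistic(train[:, j], test[:, j]) < threshold]
```

In practice one would train only on the kept columns; a feature whose benign test distribution has drifted far from training would otherwise produce large reconstruction errors even on benign data, inflating false alarms.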

Finding #2: Attack Cleaning. Some attacks in the SWaT dataset do not execute as described [15, 17]: although labelled as attacks, the SWaT documentation [10] notes that they did not actually perform as intended. These cases should not be evaluated as attacks, yet the majority of prior work does so. We recommend removing these benign “attacks” from the dataset. Furthermore, other prior work has noted that the start and end times of attacks in SWaT are incorrect [37]; hence, we recommend correcting the times of the labelled attacks (see note 4).
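Mechanically, attack cleaning is just a relabelling pass over the ground-truth vector. The sketch below is a hypothetical helper (the span indices in the usage are invented; the actual SWaT corrections are listed in the repository referenced in note 4):

```python
import numpy as np

def clean_attack_labels(labels, no_effect_spans, time_corrections):
    """Return a cleaned copy of per-timestep 0/1 attack labels.

    no_effect_spans: (start, end) index ranges labelled as attacks that
        had no physical effect; these are relabelled as benign.
    time_corrections: list of ((old_start, old_end), (new_start, new_end))
        pairs for attacks whose recorded times are wrong."""
    cleaned = np.asarray(labels).copy()
    for start, end in no_effect_spans:
        cleaned[start:end] = 0
    for (old_s, old_e), (new_s, new_e) in time_corrections:
        cleaned[old_s:old_e] = 0   # erase the mistimed labels
        cleaned[new_s:new_e] = 1   # apply the corrected span
    return cleaned
```

Evaluating against the cleaned labels matters for both point- and range-based metrics: a mislabelled "attack" that never affected the process is otherwise scored as a missed detection.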

Finding #3: Benign Data Shuffling. Most prior work, when dividing the benign dataset into training and validation portions, either divides at a fixed time [8] or does not describe how the division is performed. Since system behavior can differ between days (e.g., if the final 30% of timesteps in SWaT are used for validation, the distributions of the training and validation datasets are significantly different), the split should be made randomly across the benign dataset. For CNNs and LSTMs, each timestep’s history should be collected before splitting.
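The window-then-shuffle order described above can be sketched as follows (the function name and the 30% validation fraction are illustrative; history lengths and fractions would match the detector's configuration):

```python
import numpy as np

def windowed_split(data, history=50, val_frac=0.3, seed=0):
    """Build (history window -> next state) reconstruction pairs FIRST,
    then shuffle and split, so validation windows are drawn from the
    whole benign run rather than one contiguous time range."""
    n = len(data) - history
    X = np.stack([data[i:i + history] for i in range(n)])  # (n, history, d)
    y = data[history:]                                     # (n, d)
    idx = np.random.default_rng(seed).permutation(n)
    n_val = int(val_frac * n)
    val, train = idx[:n_val], idx[n_val:]
    return X[train], y[train], X[val], y[val]
```

Because each window already carries its own history, shuffling the pairs cannot separate a target state from its preceding timesteps, while still mixing different days of operation into both splits.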

Fig. 6. On left (a): the training and validation loss for a 4-layer, 64-unit CNN, across random seeds. On right (b): the average overfit amount without early stopping, shown for all CNN sizes, compared to the average overfit amount for all layers with early stopping

Finding #4: Early Stopping. When early stopping is not used, models overfit quickly and tend to diverge. We train a 4-layer, 64-unit CNN with a history length of 50, repeated three times across random seeds; the model hyperparameters, data ordering, and training parameters are all unchanged. Figure 6a shows the training and validation losses for 100 epochs. Without early stopping, the models overfit (validation loss plateaus after the 6th epoch and begins to increase afterward) and diverge after 10–20 epochs; this happens across all model architectures, model hyperparameters, and datasets. Across CNN sizes, Fig. 6b compares the final difference between training and validation loss (the overfit amount) with and without early stopping, averaged across three random seeds. With early stopping, the overfit amount is small for all model sizes; without it, larger models overfit more.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Fung, C., Srinarasi, S., Lucas, K., Phee, H.B., Bauer, L. (2022). Perspectives from a Comprehensive Evaluation of Reconstruction-based Anomaly Detection in Industrial Control Systems. In: Atluri, V., Di Pietro, R., Jensen, C.D., Meng, W. (eds) Computer Security – ESORICS 2022. ESORICS 2022. Lecture Notes in Computer Science, vol 13556. Springer, Cham. https://doi.org/10.1007/978-3-031-17143-7_24

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-17143-7_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17142-0

  • Online ISBN: 978-3-031-17143-7

  • eBook Packages: Computer Science, Computer Science (R0)
