Simpler Is Better: On the Use of Autoencoders for Intrusion Detection

Catillo, Marta; Pecchia, Antonio; Villano, Umberto

doi:10.1007/978-3-031-14179-9_15

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1621)

Included in the following conference series:

International Conference on the Quality of Information and Communications Technology

Abstract

The ever-growing occurrence of computer security incidents calls for advanced intrusion detection techniques. A wide body of literature dealing with Intrusion Detection Systems (IDSes) is based on machine learning; many proposals rely on autoencoders (AEs) because of their capability to analyze complex, high-dimensional and large-scale data. Most of the time, AEs are used as building blocks of much more complex detection architectures, possibly in combination with sophisticated feature selection techniques. This paper summarizes several years of work in this field, suggesting that “simpler is better” and that a carefully tuned and trained AE can be used in isolation, obtaining recognition results comparable with those attained by more complex designs. The best practices presented here, regarding dataset production and sanitization, AE set-up and training, threshold setting, and the possible use of simple feature selection techniques for performance improvement, can be valuable for any practitioner willing to use autoencoders for intrusion detection purposes.
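
The idea of using a single AE in isolation, trained on benign traffic and paired with a threshold on the reconstruction error, can be illustrated with a minimal sketch. It is not the authors' set-up: the data are synthetic, all function and variable names are hypothetical, a linear autoencoder fitted in closed form via SVD stands in for a trained neural AE, and the 0.99 quantile used for the threshold is an arbitrary illustrative choice.

```python
import numpy as np


def fit_linear_autoencoder(X_benign, code_size):
    """Fit a linear autoencoder on benign samples only.

    Assumption: a linear encoder/decoder fitted in closed form via SVD
    (the MSE-optimal linear autoencoder) stands in for a trained neural AE.
    """
    mean = X_benign.mean(axis=0)
    std = X_benign.std(axis=0) + 1e-12  # guard against constant features
    Z = (X_benign - mean) / std         # standardize with benign statistics
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[:code_size].T                # shape (n_features, code_size)
    return mean, std, W


def reconstruction_error(model, X):
    """Per-sample mean squared reconstruction error in standardized space."""
    mean, std, W = model
    Z = (X - mean) / std
    R = (Z @ W) @ W.T                   # encode, then decode
    return np.mean((Z - R) ** 2, axis=1)


def pick_threshold(benign_val_errors, quantile=0.99):
    """Set the threshold as a high quantile of benign validation errors."""
    return float(np.quantile(benign_val_errors, quantile))


# Synthetic stand-in for flow features (hypothetical, not a real dataset):
# benign samples lie close to a 2-dimensional subspace of an 8-feature space.
rng = np.random.default_rng(0)
n_features, latent_dim = 8, 2
A = rng.normal(size=(latent_dim, n_features))


def make_benign(n):
    latent = rng.normal(size=(n, latent_dim))
    return latent @ A + 0.05 * rng.normal(size=(n, n_features))


X_train, X_val, X_test_benign = make_benign(2000), make_benign(500), make_benign(500)
# "Attack" samples are drawn away from the benign subspace.
X_attack = rng.normal(loc=3.0, scale=1.0, size=(200, n_features))

model = fit_linear_autoencoder(X_train, code_size=latent_dim)
threshold = pick_threshold(reconstruction_error(model, X_val))

# A sample is flagged as anomalous when its error exceeds the threshold.
detection_rate = float(np.mean(reconstruction_error(model, X_attack) > threshold))
false_positive_rate = float(np.mean(reconstruction_error(model, X_test_benign) > threshold))
print(f"threshold={threshold:.5f}")
print(f"detection rate={detection_rate:.3f}, false positive rate={false_positive_rate:.3f}")
```

The threshold is derived from benign validation data only, so no attack samples are needed to configure the detector; the quantile trades false positives against missed detections and would need tuning on real traffic.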

Notes

1. https://github.com/ahlashkari/CICFlowMeter
2. https://downloads.distrinet-research.be/WTMC2021/tools_datasets.html
3. Attacks are taken from the USB-IDS-1 dataset.

Author information

Authors and Affiliations

Dipartimento di Ingegneria, Università degli Studi del Sannio, Benevento, Italy
Marta Catillo, Antonio Pecchia & Umberto Villano

Corresponding author

Correspondence to Umberto Villano.

Editor information

Editors and Affiliations

University of Malaga, Málaga, Spain
Antonio Vallecillo
Leiden University, Leiden, The Netherlands
Joost Visser
University of Castilla-La Mancha, Ciudad Real, Spain
Ricardo Pérez-Castillo

About this paper

Cite this paper

Catillo, M., Pecchia, A., Villano, U. (2022). Simpler Is Better: On the Use of Autoencoders for Intrusion Detection. In: Vallecillo, A., Visser, J., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2022. Communications in Computer and Information Science, vol 1621. Springer, Cham. https://doi.org/10.1007/978-3-031-14179-9_15

DOI: https://doi.org/10.1007/978-3-031-14179-9_15
Published: 05 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14178-2
Online ISBN: 978-3-031-14179-9
eBook Packages: Computer Science, Computer Science (R0)
