TAElog: A Novel Transformer AutoEncoder-Based Log Anomaly Detection Method

Zhao, Changzhi; Huang, Kezhen; Wu, Di; Han, Xueying; Du, Dan; Zhou, Yutian; Lu, Zhigang; Liu, Yuling

doi:10.1007/978-981-97-0945-8_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14527)

Included in the following conference series:

International Conference on Information Security and Cryptology

Abstract

Log anomaly detection serves as an effective approach for identifying threats. Autoencoder-based detection methods address positive and negative sample imbalance issues and have been extensively adopted in practical applications. However, most existing methods necessitate a sliding window to adapt to the autoencoder’s base network, leading to information confusion and diminished resilience. Furthermore, detection results may be worthless when a single log comprises numerous unbalanced log records. In response, we propose TAElog, a novel framework employing a transformer-based autoencoder designed to extract precise information from logs without the need for sliding windows. TAElog also incorporates a new loss calculation that computes both high-dimensional metrics and divergence information, enhancing detection performance in intricate situations with diverse and unbalanced log records. Moreover, our framework covers preprocessing to increase the compatibility between text and numeric logs. To verify the effectiveness of TAElog, we evaluate its performance against other methods on both textual and numerical logs. Additionally, we assess various preprocessing and loss computation approaches to determine the optimal configuration within our method. Experimental results demonstrate that TAElog not only achieves superior accuracy rates but also boasts increased processing speed.
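The abstract does not spell out the loss, so the following is only a minimal sketch of the general idea: score each log vector by how poorly an autoencoder reconstructs it, combining a distance term with a divergence term. Everything specific here is an assumption for illustration and not the authors' implementation: mean squared error stands in for the "high-dimensional metrics", Kullback-Leibler divergence over softmax-normalised vectors stands in for the "divergence information", a PCA-style linear projection stands in for the transformer autoencoder, and the names `anomaly_score`, `fit_threshold`, `alpha`, and `beta` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax, used to turn vectors into distributions."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def anomaly_score(x, x_hat, alpha=1.0, beta=1.0, eps=1e-12):
    """Per-sample score: alpha * MSE(x, x_hat) + beta * KL(softmax(x) || softmax(x_hat)).

    The weighting and the choice of MSE and KL are assumptions, not taken
    from the paper.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    mse = np.mean((x - x_hat) ** 2, axis=-1)
    p, q = softmax(x), softmax(x_hat)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return alpha * mse + beta * kl


def fit_linear_autoencoder(train, k):
    """Stand-in for a trained autoencoder: project onto the top-k principal axes."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:k]  # shape (k, n_features)

    def reconstruct(x):
        return (x - mean) @ basis.T @ basis + mean

    return reconstruct


def fit_threshold(normal_scores, quantile=0.99):
    """Pick the decision threshold from scores of normal-only training data."""
    return float(np.quantile(normal_scores, quantile))


# Synthetic data: "normal" vectors lie near a 2-D subspace of an 8-D space,
# while "anomalous" vectors are drawn without that structure.
rng = np.random.default_rng(0)
mix = rng.normal(size=(2, 8))
normal = rng.normal(size=(500, 2)) @ mix + 0.01 * rng.normal(size=(500, 8))
anomalies = 3.0 * rng.normal(size=(5, 8))

reconstruct = fit_linear_autoencoder(normal, k=2)
threshold = fit_threshold(anomaly_score(normal, reconstruct(normal)))
anomaly_flags = anomaly_score(anomalies, reconstruct(anomalies)) > threshold
print("threshold:", threshold)
print("flagged as anomalous:", anomaly_flags)
```

Fitting the threshold on normal data alone mirrors the reason the abstract gives for using autoencoders: no labelled anomalies are needed, which sidesteps the imbalance between positive and negative samples.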

Acknowledgements

We thank the anonymous reviewers for their helpful suggestions. This work is supported by the National Key Research and Development Program of China (Grant Nos. 2020YFB1806504 and 2021YFF0307203) and the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDC02040100). It is also supported by the Program of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, and the Program of the Beijing Key Laboratory of Network Security and Protection Technology.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 10085, China
Changzhi Zhao, Xueying Han, Dan Du, Zhigang Lu & Yuling Liu
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, 100049, China
Changzhi Zhao, Xueying Han, Dan Du, Zhigang Lu & Yuling Liu
Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
Kezhen Huang
China Cybersecurity Review Technology and Certification Center, Beijing, China
Di Wu
School of Data Science, Fudan University, Shanghai, 200433, China
Yutian Zhou

Corresponding author

Correspondence to Di Wu.

Editor information

Editors and Affiliations

Shandong University, Jinan, China
Chunpeng Ge
Columbia University, New York, NY, USA
Moti Yung

About this paper

Cite this paper

Zhao, C. et al. (2024). TAElog: A Novel Transformer AutoEncoder-Based Log Anomaly Detection Method. In: Ge, C., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2023. Lecture Notes in Computer Science, vol 14527. Springer, Singapore. https://doi.org/10.1007/978-981-97-0945-8_3

DOI: https://doi.org/10.1007/978-981-97-0945-8_3
Published: 25 February 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0944-1
Online ISBN: 978-981-97-0945-8
eBook Packages: Computer Science, Computer Science (R0)
