Log message anomaly detection with fuzzy C-means and MLP

Farzad, Amir; Gulliver, T. Aaron

doi:10.1007/s10489-022-03300-1

Published: 04 April 2022

Volume 52, pages 17708–17717, (2022)

Applied Intelligence

Abstract

Log messages are one of the most valuable sources of information in cloud and other software systems. These logs can be used for audits and to help ensure system security. Many millions of log messages are produced each day, which makes anomaly detection challenging. Automating the detection of anomalies can save time and money as well as improve detection performance. In this paper, an anomaly detection method is proposed that combines radius-based fuzzy C-means, using more clusters than the number of data classes, with a multilayer perceptron (MLP) network. The cluster centers and a radius are used to select reliable positive and negative log messages. Moreover, class probabilities are used with an expert to correct the network output for suspect logs. The proposed model is evaluated with three well-known data sets, namely BGL, OpenStack and Thunderbird. The results obtained show that this model provides better results than existing methods.
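The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the full procedure, so what follows is textbook fuzzy C-means (cluster centers as membership-weighted means, memberships proportional to an inverse power of distance) plus a simple radius filter that keeps points close to their nearest center. The function names, the fuzzifier m = 2, the radius value, and the synthetic two-dimensional data are all assumptions for illustration and are not taken from the paper. The MLP training and expert correction steps are not shown.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            weights = [U[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[j] for w, p in zip(weights, points)) / total
                for j in range(dim)
            ])
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        shift = 0.0
        for i, p in enumerate(points):
            inv = [(math.dist(p, c) + 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if shift < tol:
            break
    return centers, U


def within_radius(points, centers, radius):
    """True for each point whose nearest center is at most `radius` away."""
    return [min(math.dist(p, c) for c in centers) <= radius for p in points]


# Synthetic stand-in for log message feature vectors: two classes, four clusters
# (more clusters than classes, as the abstract describes).
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)] for _ in range(50)]
data += [[data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)] for _ in range(50)]

centers, U = fuzzy_c_means(data, n_clusters=4)
reliable = within_radius(data, centers, radius=1.0)  # radius is an assumed value
```

In a pipeline of the kind the abstract outlines, the points flagged in `reliable` would presumably serve as the trusted training examples for the MLP, while points outside every radius would be treated as suspect. How the paper actually labels clusters and sets the radius is not stated in the abstract.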

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Victoria, PO Box 1700, STN CSC, Victoria, BC, V8W 2Y2, Canada
Amir Farzad & T. Aaron Gulliver

Authors

Amir Farzad
T. Aaron Gulliver

Corresponding author

Correspondence to Amir Farzad.

Ethics declarations

Conflict of Interests

The authors declare no conflict of interest with regard to this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Farzad, A., Gulliver, T.A. Log message anomaly detection with fuzzy C-means and MLP. Appl Intell 52, 17708–17717 (2022). https://doi.org/10.1007/s10489-022-03300-1

Accepted: 25 January 2022
Published: 04 April 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10489-022-03300-1
