Abstract
In large-scale distributed systems, administrators often cannot discover, locate, and fix anomalies in time when the system malfunctions. System logs are commonly used for anomaly detection, but manually inspecting them is infeasible given the increasing scale and complexity of distributed systems. As a result, various methods for automatically mining log patterns for anomaly detection have been developed. Existing log pattern mining methods, however, are either time-consuming or inaccurate. To address these problems, we propose Lopper, a hybrid clustering tree for online log pattern mining. Our method accelerates mining by clustering raw log data in a single pass, and it preserves accuracy by merging and combining similar patterns using different kernel functions in each step. We evaluate our method on massive sets of log data generated by different industrial applications. The experimental results show that Lopper achieves an average accuracy of 92.26%, which is much better than the compared methods, while remaining highly efficient. We also conduct experiments on a system anomaly detection task using the log patterns generated by Lopper; the results show an average F-measure of 91.97%, which further demonstrates the effectiveness of Lopper.
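To make the idea of one-pass log clustering concrete, the following is a minimal sketch and not Lopper's actual algorithm. It assumes whitespace tokenization, a simple position-wise token similarity in place of the paper's kernel functions, and a hypothetical threshold of 0.6; the names `similarity`, `merge`, and `mine_patterns` are illustrative only.

```python
from typing import List

WILDCARD = "<*>"  # placeholder for a variable field (assumed notation)


def similarity(pattern: List[str], tokens: List[str]) -> float:
    """Fraction of positions that agree; a wildcard agrees with any token.

    This stands in for a kernel function and is an assumption of the sketch.
    """
    if len(pattern) != len(tokens):
        return 0.0
    hits = sum(1 for p, t in zip(pattern, tokens) if p == WILDCARD or p == t)
    return hits / len(pattern)


def merge(pattern: List[str], tokens: List[str]) -> List[str]:
    """Keep tokens that agree and replace differing positions with a wildcard."""
    return [p if p == t else WILDCARD for p, t in zip(pattern, tokens)]


def mine_patterns(lines: List[str], threshold: float = 0.6) -> List[str]:
    """Read each log line once, joining it to the closest pattern or starting a new one."""
    patterns: List[List[str]] = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        best_idx, best_sim = -1, 0.0
        for i, pat in enumerate(patterns):
            s = similarity(pat, tokens)
            if s > best_sim:
                best_idx, best_sim = i, s
        if best_idx >= 0 and best_sim >= threshold:
            patterns[best_idx] = merge(patterns[best_idx], tokens)
        else:
            patterns.append(tokens)
    return [" ".join(p) for p in patterns]


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91 percent",
    "Connection from 10.0.0.3 closed",
]
result = mine_patterns(logs)
```

In this sketch the three connection messages collapse into a single pattern with the address replaced by a wildcard, while the disk message stays a separate pattern. The linear scan over existing patterns is the part that a tree-structured index would be expected to speed up.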
Acknowledgments
This work is supported by the UINNOVA Joint Innovation project.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, J., Hou, Z., Li, Y. (2019). Lopper: An Efficient Method for Online Log Pattern Mining Based on Hybrid Clustering Tree. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_5
DOI: https://doi.org/10.1007/978-3-030-27615-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27614-0
Online ISBN: 978-3-030-27615-7
eBook Packages: Computer Science, Computer Science (R0)