Lopper: An Efficient Method for Online Log Pattern Mining Based on Hybrid Clustering Tree

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Abstract

Large-scale distributed systems suffer from the problem that system managers cannot discover, locate, and fix anomalies in time when the system malfunctions. System logs are often used for anomaly detection. However, manually inspecting system logs to detect anomalies is infeasible due to the increasing scale and complexity of distributed systems. As a result, various methods for automatically mining log patterns for anomaly detection have been developed. Existing methods for log pattern mining are either time-consuming or inaccurate. To address these problems, we propose Lopper, a hybrid clustering tree for online log pattern mining. Our method accelerates the mining process by clustering raw log data in a one-pass manner and ensures accuracy by merging and combining similar patterns with different kernel functions in each step. We evaluate our method on massive sets of log data generated by different industrial applications. The experimental results show that Lopper achieves an average accuracy of 92.26%, which is much better than that of the comparative methods, while remaining highly efficient. We also conduct experiments on a system anomaly detection task using the log patterns generated by Lopper; the results show an average F-measure of 91.97%, which further demonstrates the effectiveness of Lopper.
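To make the idea of one-pass log pattern mining more concrete, the following is a minimal illustrative sketch in Python. It is not Lopper's algorithm: the abstract does not specify the hybrid clustering tree or its kernel functions, so this sketch assumes a flat list of patterns, whitespace tokenization, and a simple position-wise token similarity. The names `similarity`, `merge`, `mine_patterns`, the `<*>` wildcard, and the `threshold` value are all hypothetical choices made for illustration.

```python
from typing import List

WILDCARD = "<*>"  # hypothetical placeholder for a variable field


def similarity(pattern: List[str], tokens: List[str]) -> float:
    """Fraction of positions where the pattern matches the log line.

    A wildcard in the pattern counts as a match. Sequences of different
    length are treated as dissimilar (a simplifying assumption).
    """
    if len(pattern) != len(tokens):
        return 0.0
    if not pattern:
        return 1.0
    same = sum(1 for p, t in zip(pattern, tokens) if p == t or p == WILDCARD)
    return same / len(pattern)


def merge(pattern: List[str], tokens: List[str]) -> List[str]:
    """Keep tokens that agree; replace disagreeing positions by a wildcard."""
    return [p if p == t else WILDCARD for p, t in zip(pattern, tokens)]


def mine_patterns(lines: List[str], threshold: float = 0.6) -> List[str]:
    """Process each log line exactly once, updating the pattern list online."""
    patterns: List[List[str]] = []
    for line in lines:
        tokens = line.split()
        best_idx, best_sim = -1, 0.0
        for i, pat in enumerate(patterns):
            sim = similarity(pat, tokens)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= threshold:
            # Similar enough: generalize the existing pattern.
            patterns[best_idx] = merge(patterns[best_idx], tokens)
        else:
            # Otherwise the line starts a new pattern.
            patterns.append(tokens)
    return [" ".join(p) for p in patterns]


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91 percent",
]
result = mine_patterns(logs)
print(result)
```

In this sketch each incoming line is compared against every existing pattern, which is linear in the number of patterns; a tree-structured index such as the one the paper proposes would presumably reduce that cost, but its details are not given in the abstract.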

Acknowledgments

This work is supported by the UINNOVA Joint Innovation project.

Author information

Corresponding author

Correspondence to Ying Li.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, J., Hou, Z., Li, Y. (2019). Lopper: An Efficient Method for Online Log Pattern Mining Based on Hybrid Clustering Tree. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_5

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer Science, Computer Science (R0)
