Real-Time Anomaly Detection for Distributed Systems Logs Using Apache Kafka and H2O.ai

  • Conference paper
Information and Software Technologies (ICIST 2022)

Abstract

System monitoring is crucial for ensuring that a system is working correctly. Monitoring solutions range from simple static thresholds on hardware and software key performance indicators to anomaly detection algorithms applied to a stream of numerical data. System logs are another valuable source of information about system state, but they are often overlooked. Combining system logs with load metrics could potentially increase the accuracy of anomaly detection. We propose a robust pipeline and evaluate several of its variants for solving this task at scale and in real time. Experiments with proprietary logs from an enterprise Kafka cluster reveal that pre-processing with an autoencoder before applying the isolation forest method can significantly improve detection performance.
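To make the isolation forest step concrete, the following is a minimal sketch of the scoring idea from Liu et al., written with the Python standard library only. It is not the authors' implementation: the paper uses H2O.ai, the autoencoder stage is not shown, and the 2-D toy points merely stand in for whatever features that stage would produce. All names here (`c_factor`, `build_tree`, `fit_forest`, `anomaly_score`) and the parameter values are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # all values identical on this feature, so stop here
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for the points left in that leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, sample_size


def anomaly_score(forest, x):
    """Score in (0, 1]; points that are isolated after few splits score higher."""
    trees, sample_size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Toy data (an assumption for illustration): one dense cluster plus one far-away point.
data_rng = random.Random(42)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)

forest = fit_forest(normal + [outlier])
outlier_score = anomaly_score(forest, outlier)
center_score = anomaly_score(forest, (0.0, 0.0))
print(f"outlier: {outlier_score:.3f}, center: {center_score:.3f}")
```

The far-away point is separated from the rest after only a few random splits, so its average path is short and its score is higher than that of a point in the middle of the cluster. In a pipeline like the one the abstract describes, the same scoring would presumably be applied to autoencoder-derived features rather than raw coordinates.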

Author information

Corresponding author

Correspondence to Kęstutis Daugėla.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Daugėla, K., Vaičiukynas, E. (2022). Real-Time Anomaly Detection for Distributed Systems Logs Using Apache Kafka and H2O.ai. In: Lopata, A., Gudonienė, D., Butkienė, R. (eds) Information and Software Technologies. ICIST 2022. Communications in Computer and Information Science, vol 1665. Springer, Cham. https://doi.org/10.1007/978-3-031-16302-9_3

  • DOI: https://doi.org/10.1007/978-3-031-16302-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16301-2

  • Online ISBN: 978-3-031-16302-9

  • eBook Packages: Computer Science (R0)
