K-means Application for Anomaly Detection and Log Classification in HPC

  • Conference paper
Advances in Artificial Intelligence: From Theory to Practice (IEA/AIE 2017)

Abstract

Detecting anomalies in the stream of system logs from a high-performance computing (HPC) facility is a challenging task. Although previous research has sought to identify nominal and abnormal phases, practical ways to give system administrators a reduced set of the most useful messages for identifying abnormal behaviour remain an open problem. In this paper, we describe an extensive study of log classification and anomaly detection using K-means on real, unlabelled HPC data extracted from the Curie supercomputer. The method involves (1) classifying logs by format, which is itself valuable information for administrators, and then (2) building normal and abnormal classes for anomaly detection. Our methodology performs well at both clustering logs and detecting abnormal ones.
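
The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the masking rule, the bag-of-words features, the farthest-point initialisation, the sample log lines, and the distance threshold are all assumptions chosen for illustration. Step (1) groups lines by format using K-means; step (2) is approximated here by labelling a line abnormal when it lies far from its cluster centroid.

```python
import math
import re
from collections import Counter


def mask(line):
    """Replace every run of digits so lines sharing a format look identical."""
    return re.sub(r"\d+", "<NUM>", line.lower())


def vectorize(lines):
    """Bag-of-words counts over masked, whitespace-separated tokens."""
    docs = [Counter(mask(line).split()) for line in lines]
    vocab = sorted({tok for doc in docs for tok in doc})
    return [[float(doc[tok]) for tok in vocab] for doc in docs]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


def flag_abnormal(points, labels, centroids, threshold):
    """Mark a point abnormal if it is farther than `threshold` from its centroid."""
    return [dist(p, centroids[lab]) > threshold
            for p, lab in zip(points, labels)]


# Hypothetical log lines, invented for this example (not from Curie).
LOGS = [
    "sshd[101]: accepted password for user1 from 10.0.0.1",
    "sshd[102]: accepted password for user2 from 10.0.0.2",
    "sshd[103]: accepted password for user3 from 10.0.0.3",
    "kernel: eth0 link up 1000 mbps",
    "kernel: eth1 link up 1000 mbps",
    "kernel: eth0 link up 100 mbps",
    "kernel: machine check exception on cpu 3 bank 4",
]

points = vectorize(LOGS)
labels, centroids = kmeans(points, k=2)
# threshold=2.0 is an arbitrary value picked for this toy data set.
flags = flag_abnormal(points, labels, centroids, threshold=2.0)
for line, lab, bad in zip(LOGS, labels, flags):
    print(lab, "ABNORMAL" if bad else "normal", line)
```

On this toy input the three `sshd` lines form one cluster and the `kernel` lines another; only the machine-check line sits far enough from its centroid to be flagged. In practice the number of clusters and the threshold would have to be chosen from the data, for example with a model-selection criterion.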

Author information

Corresponding author

Correspondence to Henri Doreau.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Dani, M.C., Doreau, H., Alt, S. (2017). K-means Application for Anomaly Detection and Log Classification in HPC. In: Benferhat, S., Tabia, K., Ali, M. (eds) Advances in Artificial Intelligence: From Theory to Practice. IEA/AIE 2017. Lecture Notes in Computer Science, vol 10351. Springer, Cham. https://doi.org/10.1007/978-3-319-60045-1_23

  • DOI: https://doi.org/10.1007/978-3-319-60045-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60044-4

  • Online ISBN: 978-3-319-60045-1

  • eBook Packages: Computer Science, Computer Science (R0)
