Real-Time I/O-Monitoring of HPC Applications with SIOX, Elasticsearch, Grafana and FUSE

  • Conference paper
High Performance Computing (ISC High Performance 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10524)

Abstract

The starting point for our work was the demand for an overview of applications’ I/O behavior that provides information about the usage of our HPC system “Mistral”. We suspect that some applications run with inefficient I/O patterns and probably waste a significant amount of machine hours. To tackle the problem, we focus on the detection of poor I/O performance, the identification of the affected applications, and the description of their I/O behavior.

Instead of gathering I/O statistics from global system variables, as many other monitoring tools do, our approach takes statistics directly from the I/O interfaces POSIX, MPI, HDF5 and NetCDF. To intercept I/O calls, we use an instrumentation library that is dynamically linked via LD_PRELOAD at program startup.
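
As an illustration of the preloading step only, the following minimal sketch starts a program with a library listed in LD_PRELOAD. It is not taken from the paper: the function names `preload_env` and `run_instrumented` and the library path in the usage note are hypothetical, and the instrumentation library itself is assumed to exist already.

```python
import os
import subprocess


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list; keep earlier entries.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


def run_instrumented(command, library_path):
    """Run command with the (hypothetical) instrumentation library preloaded."""
    return subprocess.run(command, env=preload_env(library_path), check=False)
```

A call such as `run_instrumented(["./my_app"], "/path/to/libinstrumentation.so")` would then start the unmodified binary with the library loaded before libc, which is what allows the library's wrappers to see the application's I/O calls. Both the binary name and the path are placeholders.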

The HPC online monitoring framework is built on top of open source software: Grafana, SIOX, Elasticsearch and FUSE. The framework collects I/O statistics from applications and mount points. The latter are used for non-intrusive monitoring of virtual memory allocated with mmap(), i.e., no code adaptation is necessary. We evaluate the framework, show its effectiveness, and critically discuss it.
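
To make the data flow from collected statistics to Elasticsearch more concrete, here is a minimal sketch under stated assumptions. It aggregates per-call records by job and I/O interface and serializes the totals in the newline-delimited JSON layout that the Elasticsearch bulk API expects. The record layout, the field names (`job_id`, `interface`, `calls`, `bytes`) and the index name are hypothetical and not taken from the paper; older Elasticsearch versions may additionally require a `_type` in the action line.

```python
import json
from collections import defaultdict


def aggregate(records):
    """Sum call counts and bytes per (job_id, interface) pair."""
    totals = defaultdict(lambda: {"calls": 0, "bytes": 0})
    for job_id, interface, nbytes in records:
        entry = totals[(job_id, interface)]
        entry["calls"] += 1
        entry["bytes"] += nbytes
    return dict(totals)


def to_bulk_payload(totals, index_name):
    """Serialize totals as bulk-API NDJSON: one action line, then one document line."""
    lines = []
    for (job_id, interface), stats in sorted(totals.items()):
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({"job_id": job_id, "interface": interface, **stats}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

The resulting string could be sent to a cluster's `_bulk` endpoint with the content type `application/x-ndjson`; a dashboard tool such as Grafana would then query the index. How the actual framework batches and ships its data is not specified in the abstract.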

Author information

Corresponding authors

Correspondence to Eugen Betke or Julian Kunkel.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Betke, E., Kunkel, J. (2017). Real-Time I/O-Monitoring of HPC Applications with SIOX, Elasticsearch, Grafana and FUSE. In: Kunkel, J., Yokota, R., Taufer, M., Shalf, J. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10524. Springer, Cham. https://doi.org/10.1007/978-3-319-67630-2_15

  • DOI: https://doi.org/10.1007/978-3-319-67630-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67629-6

  • Online ISBN: 978-3-319-67630-2

  • eBook Packages: Computer Science, Computer Science (R0)
