Abstract
Understanding the computational requirements of scientific applications and how they relate to power consumption is fundamental to overcoming the current barriers to exascale computing. This understanding, however, poses several challenges: monitoring a wide range of parameters in heterogeneous environments, profiling performance and power consumption at fine granularity across different components, remaining language independent, and avoiding code instrumentation. To address these challenges, this work proposes SMCis, an application monitoring tool designed to collect these data effectively and accurately and to correlate them graphically in an analysis and visualization environment. In addition, SMCis integrates and facilitates the use of Machine Learning tools for developing predictive models of runtime and power consumption.
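As a rough illustration of what monitoring without code instrumentation can look like, the Python sketch below launches an unmodified command as a child process and samples a metric from outside at a fixed interval. It is not the SMCis implementation: the names `monitor`, `probe`, and `interval` are hypothetical, and the default probe is only a placeholder clock reading where a real tool would query a power meter or system counters.

```python
import subprocess
import sys
import time


def monitor(cmd, interval=0.05, probe=time.perf_counter):
    """Run `cmd` unmodified and sample `probe()` until it exits.

    `probe` is a hypothetical stand-in for a real sensor read
    (power meter, CPU or memory counters); it is an assumption of this sketch.
    """
    proc = subprocess.Popen(cmd)
    start = time.monotonic()
    samples = []
    while proc.poll() is None:
        # Timestamp each reading so series from different sensors can be aligned later.
        samples.append((time.monotonic() - start, probe()))
        time.sleep(interval)
    return proc.returncode, samples


if __name__ == "__main__":
    # Any executable works here, whatever language it was written in.
    code, samples = monitor([sys.executable, "-c", "import time; time.sleep(0.3)"])
    print(code, len(samples))
```

Because the monitored program is treated as an opaque process, the same loop applies regardless of the language the application is written in; the timestamps are what would allow performance and power series to be correlated afterwards.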
Notes
1. SMCis is available at https://github.com/ViniciusPrataKloh/SMCis.
Acknowledgments
This work received financial support from CNPq, the EU H2020 Programme, and MCTI/RNP-Brazil in the scope of the HPC4E project, under subsidy contract No. 689772, as well as from FAPERJ (process number 26/202.500/2018) and CAPES.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Silva, G., Klôh, V., Yokoyama, A., Gritz, M., Schulze, B., Ferro, M. (2020). SMCis: Scientific Applications Monitoring and Prediction for HPC Environments. In: Bianchini, C., Osthoff, C., Souza, P., Ferreira, R. (eds) High Performance Computing Systems. WSCAD 2018. Communications in Computer and Information Science, vol 1171. Springer, Cham. https://doi.org/10.1007/978-3-030-41050-6_5
DOI: https://doi.org/10.1007/978-3-030-41050-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41049-0
Online ISBN: 978-3-030-41050-6
eBook Packages: Computer Science (R0)