
Automatic detection of power bottlenecks in parallel scientific applications

  • Special Issue Paper
Computer Science - Research and Development

Abstract

In this paper we present an extension of the pmlib framework for power-performance analysis that enables rapid, automatic detection of power sinks during the execution of concurrent scientific workloads. The extension takes the form of a multithreaded Python module that offers high reliability and flexibility, and the resulting inspection process introduces low overhead. Additionally, we investigate the advantages and drawbacks of the RAPL power model, introduced in the Intel Xeon “Sandy Bridge” CPU, versus a data acquisition system from National Instruments.
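
As a rough illustration of what working with RAPL-style measurements involves, the sketch below converts readings of a cumulative energy counter into average power per sampling interval. It is not the pmlib API or the authors' module: the function name `average_power_watts`, the `(time_s, energy_uj)` sample layout, and the `counter_range_uj` wrap-around parameter are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def average_power_watts(
    samples: Sequence[Tuple[float, float]],
    counter_range_uj: float,
) -> List[float]:
    """Convert cumulative energy readings into average power per interval.

    samples: (time_s, energy_uj) pairs in chronological order, where
        energy_uj is a cumulative energy counter in microjoules
        (hypothetical layout, chosen for this sketch).
    counter_range_uj: value at which the counter is assumed to wrap to zero.
    """
    powers = []
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        delta_uj = e1 - e0
        if delta_uj < 0:
            # Assume the counter wrapped around exactly once in this interval.
            delta_uj += counter_range_uj
        # Microjoules -> joules, then joules / seconds = watts.
        powers.append(delta_uj / 1e6 / dt)
    return powers


# Example with made-up readings; the counter wraps in the second interval.
readings = [(0.0, 1_000_000), (1.0, 3_000_000), (2.0, 500_000)]
print(average_power_watts(readings, counter_range_uj=4_000_000))
```

The sampling interval has to be short enough that the counter cannot wrap more than once between two readings; otherwise the single-wrap correction underestimates the energy consumed.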


Notes

  1. http://ilupack.tu-bs.de.


Acknowledgements

This work was supported by the CICYT project TIN2011-23283 of the Ministerio de Economía y Competitividad and FEDER, and the EU Project FP7 318793 “EXA2GREEN”.

We thank Rosa M. Badia, from the Barcelona Supercomputing Center (BSC) and the Spanish National Research Council (CSIC), for providing several of the OmpSs applications used for the calibration of pmlib.

Author information

Corresponding author

Correspondence to María Barreda.

About this article

Cite this article

Barreda, M., Catalán, S., Dolz, M.F. et al. Automatic detection of power bottlenecks in parallel scientific applications. Comput Sci Res Dev 29, 221–229 (2014). https://doi.org/10.1007/s00450-013-0242-8

  • DOI: https://doi.org/10.1007/s00450-013-0242-8
