Case Studies of Multi-core Energy Efficiency in Task Based Programs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7453)

Abstract

In this paper, we present three performance and energy case studies of benchmark applications in the OmpSs environment for task-based programming. Different parallel and vectorized implementations are evaluated on an Intel® Core™ i7-2600 quad-core processor. Using FLOPS/W derived from on-chip model-specific registers (MSRs), we find AVX code to be clearly the most energy efficient in general. The peak on-chip GFLOPS/W rates are: Black-Scholes (BS) 0.89, FFTW 1.38, and Matrix Multiply (MM) 1.97. Experiments cover variable degrees of thread parallelism and different OmpSs task pool scheduling policies. We find that maximum energy efficiency for small and medium-sized problems is obtained by limiting the number of parallel threads. Comparison of AVX variants with non-vectorized code shows ≈ 6–7× (BS) and ≈ 3–5× (FFTW) improvements in on-chip energy efficiency, depending on the problem size and degree of multithreading.
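The FLOPS/W metric in the abstract can be illustrated with a small sketch. The abstract does not say which registers were read or how; the code below assumes a RAPL-style 32-bit energy counter whose raw value is scaled by 1/2^ESU joules, where ESU is an energy status unit read from a separate register. The function names, the counter samples, and the operation count are hypothetical and are not taken from the paper.

```python
def energy_joules(raw_before, raw_after, energy_status_unit):
    """Convert two raw energy-counter samples to joules.

    Assumption: the counter is 32 bits wide and wraps at most once
    between the two samples; one count equals 1 / 2**energy_status_unit J.
    """
    delta = (raw_after - raw_before) % (1 << 32)  # tolerate a single wraparound
    return delta * 0.5 ** energy_status_unit


def gflops_per_watt(flop_count, joules):
    """FLOPS/W = (flop/s) / (J/s) = flop/J, reported here in units of 1e9."""
    return flop_count / joules / 1e9


# Hypothetical run: 4e9 floating-point operations, counter advanced by 131072
# counts with an energy status unit of 16 (one count = 1/65536 J).
joules = energy_joules(1000, 1000 + 131072, 16)
efficiency = gflops_per_watt(4e9, joules)
```

Because execution time cancels out of the ratio, the metric reduces to operations per joule, so only the operation count and the energy consumed over the measured interval are needed.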

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lien, H., Natvig, L., Al Hasib, A., Meyer, J.C. (2012). Case Studies of Multi-core Energy Efficiency in Task Based Programs. In: Auweter, A., Kranzlmüller, D., Tahamtan, A., Tjoa, A.M. (eds) ICT as Key Technology against Global Warming. ICT-GLOW 2012. Lecture Notes in Computer Science, vol 7453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32606-6_4

  • DOI: https://doi.org/10.1007/978-3-642-32606-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32605-9

  • Online ISBN: 978-3-642-32606-6

  • eBook Packages: Computer Science, Computer Science (R0)
