Performance/Energy Aware Optimization of Parallel Applications on GPUs Under Power Capping

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2019)

Abstract

In this paper we present an approach to, and results from, applying the power capping mechanism available for modern NVIDIA GPUs to benchmarks widely used to assess the performance of high performance computing systems: NAS Parallel Benchmarks BT, SP and LU, as well as cublasgemm-benchmark. The power cap configuration that gives the best trade-off between performance and energy consumption depends on the benchmark. We present two kinds of results: energy savings and performance drops for the same power caps, and a normalized performance-energy consumption product. Importantly, the optimal configurations are often non-trivial, i.e. they are obtained for power caps smaller than the default and larger than the minimum allowed limit. Tests were performed on two modern GPUs of the Pascal and Turing generations, i.e. NVIDIA GTX 1070 and NVIDIA RTX 2080 respectively, so the results can be useful for many applications with profiles similar to the benchmarks when executed on modern GPU-based systems.
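The two kinds of results named in the abstract can be illustrated with a minimal sketch. It assumes the normalized product is the energy relative to the default cap multiplied by the execution time relative to the default cap, which is one plausible reading and is not confirmed by the abstract. The function names and the sample measurements are hypothetical and are not taken from the paper.

```python
from typing import Dict, NamedTuple, Tuple


class TradeOff(NamedTuple):
    energy_saving_pct: float     # energy saved relative to the default cap
    performance_drop_pct: float  # extra execution time relative to the default cap
    normalized_product: float    # (E / E_default) * (t / t_default); lower is better


def evaluate_power_caps(
    measurements: Dict[int, Tuple[float, float]], default_cap: int
) -> Dict[int, TradeOff]:
    """Map each power cap [W] to its trade-off metrics.

    `measurements` maps a power cap to (execution time [s], energy [J]).
    The definition of the product is an assumption, not the paper's formula.
    """
    base_time, base_energy = measurements[default_cap]
    results = {}
    for cap, (time_s, energy_j) in measurements.items():
        results[cap] = TradeOff(
            energy_saving_pct=100.0 * (1.0 - energy_j / base_energy),
            performance_drop_pct=100.0 * (time_s / base_time - 1.0),
            normalized_product=(energy_j / base_energy) * (time_s / base_time),
        )
    return results


def best_cap(results: Dict[int, TradeOff]) -> int:
    """Return the power cap with the smallest normalized product."""
    return min(results, key=lambda cap: results[cap].normalized_product)


# Hypothetical measurements for illustration only (not data from the paper):
# power cap [W] -> (execution time [s], energy [J])
SAMPLE = {
    200: (10.0, 1800.0),  # assumed default power limit
    150: (10.5, 1500.0),
    100: (14.0, 1350.0),  # assumed minimum allowed limit
}

if __name__ == "__main__":
    metrics = evaluate_power_caps(SAMPLE, default_cap=200)
    for cap in sorted(metrics, reverse=True):
        print(cap, metrics[cap])
    print("best cap:", best_cap(metrics))
```

With these made-up numbers the smallest product falls at the intermediate cap rather than at the default or the minimum limit, which is the kind of non-trivial optimum the abstract describes.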

Corresponding author

Correspondence to Paweł Czarnul.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Krzywaniak, A., Czarnul, P. (2020). Performance/Energy Aware Optimization of Parallel Applications on GPUs Under Power Capping. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12044. Springer, Cham. https://doi.org/10.1007/978-3-030-43222-5_11

  • DOI: https://doi.org/10.1007/978-3-030-43222-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43221-8

  • Online ISBN: 978-3-030-43222-5

  • eBook Packages: Computer Science, Computer Science (R0)
