Multi Objective Optimization of HPC Kernels for Performance, Power, and Energy

Balaprakash, Prasanna; Tiwari, Ananta; Wild, Stefan M.

doi:10.1007/978-3-319-10214-6_12

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8551)

Included in the following conference series:

International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems

Abstract

Code optimization in the high-performance computing realm has traditionally focused on reducing execution time. In mathematical terms, the problem has been expressed as a single-objective optimization problem. The expected concerns of next-generation systems, however, demand a more detailed analysis of the interplay between execution time and other metrics. Metrics such as power, performance, energy, and resiliency may all be targeted together and traded against one another. We present a multi-objective formulation of the code optimization problem. Our proposed framework helps one explore potential tradeoffs among multiple objectives and provides a significantly richer analysis than can be achieved by treating additional metrics as hard constraints. We empirically examine a variety of metrics, architectures, and code optimization decisions and provide evidence that such tradeoffs exist in practice.
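
To make the contrast between a multi-objective formulation and a hard-constraint formulation concrete, the following is a minimal sketch and not the authors' framework. It assumes two objectives to be minimized (execution time and average power), and the measurements, function names, and power cap are all hypothetical values chosen for illustration. The sketch computes the set of nondominated (Pareto-optimal) code variants and compares it with the single variant returned when power is treated as a hard constraint.

```python
from typing import List, Optional, Tuple

# (execution time in seconds, average power in watts); both are minimized.
# Units and values are illustrative assumptions, not data from the paper.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the nondominated points, sorted by the first objective."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


def fastest_under_power_cap(points: List[Point], cap: float) -> Optional[Point]:
    """Hard-constraint view: the fastest variant whose power does not exceed the cap."""
    feasible = [p for p in points if p[1] <= cap]
    return min(feasible) if feasible else None


# Hypothetical measurements for five code variants of one kernel.
variants = [(10.0, 95.0), (12.0, 80.0), (11.0, 100.0), (15.0, 78.0), (16.0, 85.0)]

front = pareto_front(variants)
capped = fastest_under_power_cap(variants, cap=90.0)

print("Pareto front:", front)
print("Fastest under 90 W cap:", capped)
```

The hard-constraint query returns one variant for one chosen cap, whereas the Pareto front lists every variant for which improving one metric requires giving up some of the other, which is the kind of tradeoff information the abstract refers to.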

Author information

Authors and Affiliations

Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, USA
Prasanna Balaprakash & Stefan M. Wild
Argonne National Laboratory, Leadership Computing Facility, Argonne, IL, USA
Prasanna Balaprakash
Performance Modeling and Characterization (PMaC) Lab, San Diego Supercomputer Center, La Jolla, CA, USA
Ananta Tiwari

Corresponding author

Correspondence to Stefan M. Wild.

Editor information

Editors and Affiliations

University of Warwick, Coventry, West Midlands, United Kingdom
Stephen A. Jarvis
University of Warwick, Coventry, West Midlands, United Kingdom
Steven A. Wright
Sandia National Laboratories CSRI, Albuquerque, New Mexico, USA
Simon D. Hammond

About this paper

Cite this paper

Balaprakash, P., Tiwari, A., Wild, S.M. (2014). Multi Objective Optimization of HPC Kernels for Performance, Power, and Energy. In: Jarvis, S., Wright, S., Hammond, S. (eds) High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation. PMBS 2013. Lecture Notes in Computer Science, vol 8551. Springer, Cham. https://doi.org/10.1007/978-3-319-10214-6_12

DOI: https://doi.org/10.1007/978-3-319-10214-6_12
Published: 01 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10213-9
Online ISBN: 978-3-319-10214-6
eBook Packages: Computer Science, Computer Science (R0)