
NuPow: Managing Power on NUMA Multiprocessors with Domain-Level Voltage and Frequency Control

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 12441)

Abstract

Power management and task placement pose two of the greatest challenges for future many-core processors in data centers. With hundreds of cores on a single die, cores experience varying memory latencies and cannot individually regulate voltage and frequency, therefore calling for new approaches to scheduling and power management. This work presents NuPow, a hierarchical scheduling and power management framework for architectures with multiple cores per voltage and frequency domain and non-uniform memory access (NUMA) properties. NuPow considers the conflicting goals of grouping virtual machines (VMs) with similar load patterns while also placing them as close as possible to the accessed data. Implemented and evaluated on existing hardware, NuPow achieves significantly better performance per watt compared to competing approaches.
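The trade-off named in the abstract, grouping VMs with similar load while keeping each VM close to its data, can be illustrated with a small sketch. This is not NuPow's algorithm, which is not specified on this page. The `VM` and `Domain` types, the `placement_cost` function, the weight `alpha`, the distance matrix, and the greedy strategy are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    name: str
    load: float      # average CPU utilization in [0, 1] (hypothetical metric)
    data_node: int   # index of the memory node holding most of the VM's data

@dataclass
class Domain:
    node: int        # memory node closest to this voltage/frequency domain
    capacity: int    # number of cores sharing one voltage/frequency setting
    vms: list = field(default_factory=list)

def placement_cost(vm, domain, distance, alpha=0.5):
    """Weighted sum of load mismatch and memory distance (illustrative only)."""
    if domain.vms:
        mean_load = sum(v.load for v in domain.vms) / len(domain.vms)
        mismatch = abs(vm.load - mean_load)
    else:
        mismatch = 0.0  # an empty domain forces no frequency compromise
    max_dist = max(max(row) for row in distance) or 1
    locality = distance[domain.node][vm.data_node] / max_dist
    return alpha * mismatch + (1 - alpha) * locality

def place(vms, domains, distance, alpha=0.5):
    """Greedy placement: heaviest VMs first, each into its cheapest free domain."""
    for vm in sorted(vms, key=lambda v: v.load, reverse=True):
        free = [d for d in domains if len(d.vms) < d.capacity]
        if not free:
            raise ValueError("not enough cores for all VMs")
        best = min(free, key=lambda d: placement_cost(vm, d, distance, alpha))
        best.vms.append(vm)
    return domains

if __name__ == "__main__":
    distance = [[0, 2], [2, 0]]  # assumed hop counts between two memory nodes
    domains = [Domain(node=0, capacity=2), Domain(node=1, capacity=2)]
    vms = [VM("a", 0.9, 0), VM("b", 0.85, 0), VM("c", 0.1, 1), VM("d", 0.15, 1)]
    for d in place(vms, domains, distance):
        print(d.node, sorted(v.name for v in d.vms))
```

With `alpha` near 0 the sketch favors memory locality, and with `alpha` near 1 it favors grouping VMs by load so that a shared domain frequency suits all of its cores. A real scheduler would also have to account for migration cost and changing load over time.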



Acknowledgements

This work was supported in part by the National Research Foundation of Korea (NRF), funded by the Korean government, under grants NRF-2015K1A3A1A14021288 and 2016R1A2B4009193; by the BK21 Plus for Pioneers in Innovative Computing (Dept. of Computer Science and Engineering, SNU, grant 21A20151113068); and by the Promising-Pioneering Researcher Program of Seoul National University in 2015. ICT at Seoul National University provided research facilities for this study.

Author information

Corresponding author

Correspondence to Bernhard Egger.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ahn, C., Lee, S., Kang, C., Egger, B. (2020). NuPow: Managing Power on NUMA Multiprocessors with Domain-Level Voltage and Frequency Control. In: Djemame, K., Altmann, J., Bañares, J.Á., Agmon Ben-Yehuda, O., Stankovski, V., Tuffin, B. (eds) Economics of Grids, Clouds, Systems, and Services. GECON 2020. Lecture Notes in Computer Science, vol 12441. Springer, Cham. https://doi.org/10.1007/978-3-030-63058-4_12

  • DOI: https://doi.org/10.1007/978-3-030-63058-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63057-7

  • Online ISBN: 978-3-030-63058-4

  • eBook Packages: Computer Science, Computer Science (R0)
