Energy-Efficient Thermal-Aware Autonomic Management of Virtualized HPC Cloud Infrastructure

Journal of Grid Computing

Abstract

Virtualized datacenters and clouds are increasingly being considered for traditional High-Performance Computing (HPC) workloads that have typically targeted Grids and conventional HPC platforms. However, maximizing the energy efficiency and utilization of datacenter resources and minimizing undesired thermal behavior, while ensuring application performance and other Quality of Service (QoS) guarantees for HPC applications, requires careful consideration of important and extremely challenging tradeoffs. Virtual Machine (VM) migration is one of the most common techniques used to alleviate thermal anomalies (i.e., hotspots) in cloud datacenter servers, as it reduces the load and, hence, the server utilization. In this article, the benefits of using other techniques for thermal management, such as voltage scaling and pinning (traditionally used for reducing energy consumption), over VM migrations are studied in detail. As no single technique is the most efficient at meeting temperature/performance optimization goals in all situations, an autonomic approach is proposed that performs energy-efficient thermal management while ensuring the QoS delivered to the users. To address the problem of VM allocation that arises during VM migrations, an innovative application-centric, energy-aware strategy for VM allocation is proposed. The proposed strategy ensures high resource utilization and energy efficiency through VM consolidation while satisfying application QoS by exploiting knowledge obtained through application profiling along multiple dimensions (CPU, memory, and network bandwidth utilization). To support our arguments, we present the results obtained from an experimental evaluation on real hardware using HPC workloads under different scenarios.
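
To make the idea of profile-driven consolidation along multiple dimensions more concrete, the following is a minimal sketch and not the strategy evaluated in the article. It assumes that each VM's profile is summarized as a percentage of one server's capacity for CPU, memory, and network bandwidth, and it substitutes a generic first-fit-decreasing heuristic for the authors' allocation algorithm. The names `VM`, `Server`, `consolidate`, and the `cap` parameter are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field

# Dimensions along which applications are profiled (as named in the abstract).
DIMENSIONS = ("cpu", "memory", "network")


@dataclass
class VM:
    name: str
    # Assumed profile format: percent of one server's capacity per dimension.
    demand: dict


@dataclass
class Server:
    name: str
    vms: list = field(default_factory=list)

    def load(self, dim):
        """Total demand currently placed on this server for one dimension."""
        return sum(vm.demand[dim] for vm in self.vms)

    def fits(self, vm, cap):
        """True if adding the VM keeps every dimension at or below the cap."""
        return all(self.load(d) + vm.demand[d] <= cap for d in DIMENSIONS)


def consolidate(vms, cap=100):
    """Pack VMs onto as few servers as this heuristic finds.

    Generic first-fit decreasing: VMs are ordered by their largest
    per-dimension demand, then each one is placed on the first server
    that still has room in all dimensions.
    """
    servers = []
    ordered = sorted(vms, key=lambda v: max(v.demand.values()), reverse=True)
    for vm in ordered:
        target = next((s for s in servers if s.fits(vm, cap)), None)
        if target is None:
            # No powered-on server has room, so bring up another one.
            target = Server(f"server-{len(servers)}")
            if not target.fits(vm, cap):
                raise ValueError(f"{vm.name} exceeds the capacity of one server")
            servers.append(target)
        target.vms.append(vm)
    return servers
```

Calling `consolidate` with a list of profiled VMs returns the servers that need to stay powered on, each with its assigned VMs. Lowering `cap` below 100 leaves headroom on every server, which is one simple way such a sketch could trade consolidation density against QoS.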


Corresponding author

Correspondence to Ivan Rodero.

About this article

Cite this article

Rodero, I., Viswanathan, H., Lee, E.K. et al. Energy-Efficient Thermal-Aware Autonomic Management of Virtualized HPC Cloud Infrastructure. J Grid Computing 10, 447–473 (2012). https://doi.org/10.1007/s10723-012-9219-2

  • DOI: https://doi.org/10.1007/s10723-012-9219-2
