Rack aware scheduling in HPC data centers: an energy conservation strategy

Patil, Vikas Ashok; Chaudhary, Vipin

doi:10.1007/s10586-012-0224-9

Published: 22 August 2012

Cluster Computing, volume 16, pages 559–573 (2013)

Abstract

Energy consumption in high performance computing data centers is a long-standing issue. As the cost of operating a data center rises, a range of techniques must be employed to reduce overall energy consumption. Among current approaches are techniques that guarantee reduced energy consumption by powering idle nodes on and off. However, most of them do not consider the energy consumed by the other components in a rack. Our study addresses this aspect of the data center. We show that considerable energy savings can be gained by reducing the energy consumed by these rack components. To this end, we propose a scheduling technique that places jobs with this goal in mind. We claim that the technique reduces energy consumption considerably without affecting a job's other performance metrics. We implement it as an enhancement to the well-known Maui scheduler and present our results. The technique comprises three algorithms, which evaluate the trade-offs that can be made with respect to overall cluster performance. We compare our technique with various currently available Maui scheduler configurations, simulating a wide variety of workloads from real cluster deployments using Maui's simulation mode. Our results consistently show energy savings of about 7 to 14% over the currently available Maui scheduler configurations. We also show that our technique can be applied in tandem with most existing energy-aware scheduling techniques to achieve further energy savings.

We also consider a side effect of deploying our technique: power losses due to the network switches. Using the results of Sharma and Ranganathan (Lecture Notes in Computer Science, vol. 5550, 2009), we compare our technique with existing techniques in terms of these switch power losses and account for them. We then provide a best-fit scheme that takes rack considerations into account.

We then propose an enhanced technique that merges the two extremes of node allocation based on rack information. It provides a way to configure the scheduler according to the kind of workload it schedules, and it reduces the effect of job splitting across multiple racks. We further discuss how the enhancement can be used to build a learning model that adaptively adjusts the scheduling parameters based on the workload experienced.
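The abstract does not spell out the three algorithms, so the following is only a minimal sketch of the general idea of rack-aware node allocation, not the authors' method or any part of the Maui code base. It assumes that a rack's shared components draw power only while the rack is active, and it therefore tries to serve a job from already-active racks (best fit in one rack, otherwise split over as few racks as possible) before waking an idle rack. The names `Rack`, `allocate`, and `_pack` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rack:
    name: str
    free_nodes: int
    active: bool  # assumption: shared rack components are powered only when True


def _pack(candidates: List[Rack], need: int) -> Optional[Dict[str, int]]:
    """Place `need` nodes on `candidates`, or return None if they cannot hold the job."""
    # Best fit: one rack that holds the whole job, active racks first,
    # then the smallest leftover capacity.
    fitting = [r for r in candidates if r.free_nodes >= need]
    if fitting:
        best = min(fitting, key=lambda r: (not r.active, r.free_nodes - need))
        return {best.name: need}
    # No single rack fits: split the job, active racks first and largest
    # free capacity first, so that few racks are touched.
    plan: Dict[str, int] = {}
    remaining = need
    for rack in sorted(candidates, key=lambda r: (not r.active, -r.free_nodes)):
        if remaining == 0:
            break
        take = min(rack.free_nodes, remaining)
        if take:
            plan[rack.name] = take
            remaining -= take
    return plan if remaining == 0 else None


def allocate(racks: List[Rack], need: int) -> Optional[Dict[str, int]]:
    """Return {rack name: node count} for a job, or None if the cluster cannot fit it."""
    if need <= 0:
        return None
    active = [r for r in racks if r.active]
    # Try the active racks alone before considering racks that would have to be woken.
    return _pack(active, need) or _pack(racks, need)


racks = [Rack("r1", 4, True), Rack("r2", 16, False), Rack("r3", 6, True)]
print(allocate(racks, 5))
print(allocate(racks, 8))
print(allocate(racks, 12))
```

In this sketch the 8-node job is split across the two active racks rather than waking the idle one, which illustrates the tension between rack power and job splitting that the enhanced technique is said to balance. A real scheduler would also have to weigh the switch power losses and communication cost of such splits.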

References

Report to Congress on Server and Data Center Energy Efficiency Public Law 109-431. U.S. Environmental Protection Agency ENERGY STAR Program, August, 2007
Koomey, J., Belady, C., Patterson, M., Santos, A., Lange, K.-D.: Assessing trends over time in performance, costs and energy use for servers. LLNL, Intel Corporation, Microsoft Corporation and Hewlett-Packard Corporation. Released on the web on August 17, 2009
Liu, Y., Zhu, H.: A survey of the research on power management techniques for high-performance systems. Softw. Pract. Exp. 40(11) (2010). doi:10.1002/spe.v40:11
Jackson, D., Snell, Q., Clement, M.: Core algorithms of the Maui scheduler. SpringerLink, January 01, 2001
LLNL, H.P., Bull: The simple Linux utility for resource management (SLURM). Available at http://www.llnl.gov/linux/slurm/. Revision 2.0.3, June 30, 2009
Pinheiro, E., Bianchini, R., Carrera, E., Heath, T.: Load balancing and unbalancing for power and performance in cluster-based systems. Technical report dcs-tr-440, Department of Computer Science, Rutgers University, May, 2001
Chase, J., Anderson, D., Thakar, P., Vahdat, A., Doyle, R.: Managing energy and server resources in hosting centers. In: Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP’01), Canada, October 2001
Chen, Y., Das, A., Qin, W., Sivasubramaniam, A., Wang, Q., Gautam, N.: Managing server energy and operational costs in hosting centers. In: Proceedings of the 2005 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS’05), Canada, June 2005
Verma, A., Ahuja, P., Neogi, A.: Power-aware dynamic placement of HPC applications. In: Proceedings of the 22nd International Conference on Supercomputing (ICS’08), Greece, June 2008
Dhiman, G., Marchetti, G., Rosing, T.: vGreen: A system for energy efficient computing in virtualized environments. In: ISLPED, California, USA, August 2009
Nathuji, R., Schwan, K.: VPM tokens: virtual machine-aware power budgeting in datacenters. In: High Performance Distributed Computing, June 2008
Gabrielyan, E., Hersch, R.D.: Network topology aware scheduling of collective communications. In: 10th International Conference of Telecommunications, March 2003
Heath, T., Centeno, A., George, P., Ramos, L., Jaluria, Y., Bianchini, R.: Mercury and Freon: temperature emulation and management for server systems. In: ASPLOS, October 2006
Moore, J., Chase, J., Ranganathan, P., Sharma, R.: Temperature-aware workload placement in data centers. In: USENIX (2005)
HP BladeSystem p-Class Infrastructure Specification: http://h18004.www1.hp.com/products/quickspecs/12330_div/12330_div.html
HP Systems Insight Manager: version 6.2
Product Description of APC Switched Rack Power Distribution Unit: http://www.apc.com/products/family/index.cfm?id=70
Maui Scheduler Administrative Guide: Version 3.2. http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml
Torque Admin Manual: Version 3.0. http://www.clusterresources.com/products/torque/docs/
HPC2N Log from Parallel Workloads Archive: HPC2N is a Linux cluster located in Sweden. http://www.cs.huji.ac.il/labs/parallel/workload/l_hpc2n/index.html
Parallel Workload Archive: http://www.cs.huji.ac.il/labs/parallel/workload/logs.html
SCD FY 2003: ASR. http://www.cisl.ucar.edu/docs/asr2003/mss.html
Hermenier, F., Lorca, X., Menaud, J., Muller, G., Lawall, J.: Entropy: a consolidation manager for clusters. In: VEE, Washington (2009)
Beral, G., Nou, J., Guitart, G.T.: Towards energy-aware scheduling in data centers using machine learning. In: e-Energy, Germany (2010)
Kandala, K., Subramoni, H., Panda, D., Vishnu, A.: Designing topology-aware collective communication algorithms for large scale InfiniBand clusters: case studies with scatter and gather. In: IPDPS, Atlanta (2010)
Etsion, Y., Tsafrir, D.: A short survey of commercial cluster batch schedulers. Technical Report 2005-13, Hebrew University, May 2005
Sharma, M., Ranganathan, B.: A power benchmarking framework for network devices. In: Lecture Notes in Computer Science, vol. 5550. Springer, Berlin (2009)

Author information

Authors and Affiliations

State University of New York, Buffalo, USA
Vikas Ashok Patil & Vipin Chaudhary

Corresponding author

Correspondence to Vikas Ashok Patil.

About this article

Cite this article

Patil, V.A., Chaudhary, V. Rack aware scheduling in HPC data centers: an energy conservation strategy. Cluster Comput 16, 559–573 (2013). https://doi.org/10.1007/s10586-012-0224-9

Received: 05 January 2012
Accepted: 27 June 2012
Published: 22 August 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10586-012-0224-9
