Abstract
In recent years, data centers have deployed many rapidly growing Internet-scale services. As data centers continue to expand, they consume enormous amounts of energy and cause environmental problems such as huge carbon emissions. For an Internet-scale data center, however, it is challenging to optimize power consumption while meeting the growing demands of Internet-scale applications. Moreover, current data centers usually contain heterogeneous servers (with different power consumption and computing capacity), which makes the dynamic task placement problem more complicated. In this work, we first observe two kinds of heterogeneity by analyzing a public trace collected from a Google data center. Considering the demands on both energy consumption and performance, we model the dynamic task placement problem in a heterogeneous Internet-scale data center and propose a heuristic cost-aware algorithm to solve it. Simulations comparing our heuristic with two other scheduling algorithms show that it achieves good energy savings while keeping application performance within acceptable limits.

Acknowledgements
This work is supported by the National Key Research and Development Program of China under Grant No. 2018YFB1003602. Many thanks to Qi Zhang, Mohamed Faten Zhani, and Prof. Raouf Boutaba for their kind help.
About this article
Cite this article
Zhang, S., Liu, Y., Hu, N. et al. A novel cost-aware algorithm for dynamic task placement problem in a heterogeneous Internet-scale data center. J Supercomput 76, 6579–6598 (2020). https://doi.org/10.1007/s11227-019-02892-9
DOI: https://doi.org/10.1007/s11227-019-02892-9