An Energy-Aware Heuristic Scheduling for Data-Intensive Workflows in Virtualized Datacenters

  • Regular Paper
Journal of Computer Science and Technology

Abstract

With the development of cloud computing, a growing number of data-intensive workflows are being deployed in virtualized datacenters. As a result, the energy spent on massive data access is growing rapidly. This paper proposes an energy-aware scheduling algorithm built on a novel heuristic, called Minimal Data-Accessing Energy Path, that schedules data-intensive workflows so as to reduce the energy consumed by intensive data access. Extensive experiments on both synthetic and real workloads are conducted to investigate the effectiveness and performance of the proposed scheduling approach. The experimental results show that the proposed heuristic scheduling can significantly reduce the energy consumed in storing and retrieving the intermediate data generated during the execution of data-intensive workflows. In addition, it exhibits better robustness than existing algorithms when cloud systems are subject to I/O-intensive workloads.

Author information

Corresponding author

Correspondence to Zhi-Gang Hu.

Additional information

Supported by the National Natural Science Foundation of China under Grant Nos. 60970038 and 61272148, the Science and Technology Plan Project of Hunan Province of China under Grant No. 2012GK3075, and the Scientific Research Fund of Hunan Provincial Education Department of China under Grant No. 13B015.

About this article

Cite this article

Xiao, P., Hu, ZG. & Zhang, YP. An Energy-Aware Heuristic Scheduling for Data-Intensive Workflows in Virtualized Datacenters. J. Comput. Sci. Technol. 28, 948–961 (2013). https://doi.org/10.1007/s11390-013-1390-9

  • DOI: https://doi.org/10.1007/s11390-013-1390-9
