An energy-aware scheduling of dynamic workflows using big data similarity statistical analysis in cloud computing

The Journal of Supercomputing

Abstract

Cloud computing is a suitable platform for workflows that process massive data and big data. Through virtualization, cloud computing converts physical infrastructure into virtual machines (VMs), which can meet fluctuating, dynamic requests with simpler management. Workflow scheduling matters in cloud computing because proper scheduling improves the efficiency of the cloud and can reduce energy consumption. Because energy efficiency is one of the most important issues in cloud computing, this paper proposes a new statistical analysis-based algorithm for identifying similarities among input workflows. The proposed algorithm, called the massive data similarity statistics analysis algorithm (MSSA), classifies virtual machines into virtual clusters and performs scheduling by reforming those clusters. Furthermore, MSSA examines the similarity of message passing in two different periods, makes its decision for the next period, and finally carries out load balancing using a new method for transferring the machines in virtual clusters. Makespan and energy consumption are the main comparison metrics. Simulation results with CloudSim show that the proposed algorithm is more energy efficient than traditional methods such as FIFO, heuristic methods such as BlindPick, and a relatively new method named eOO, and that it also reduces makespan significantly.
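As a rough illustration of the period-to-period similarity check described in the abstract, the following Python sketch compares per-cluster message loads from two periods and, when they differ enough, reassigns VMs across virtual clusters. It is not the MSSA algorithm itself. The abstract does not name the similarity statistic or the transfer rule, so cosine similarity, the 0.9 threshold, the proportional reallocation, and the names `cosine_similarity` and `plan_next_period` are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length load vectors (assumed statistic)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no load in one period: treat as dissimilar
    return dot / (norm_a * norm_b)


def plan_next_period(prev_load, curr_load, vm_counts, threshold=0.9):
    """Return the number of VMs per virtual cluster for the next period.

    prev_load, curr_load: message load per cluster in two periods.
    vm_counts: current number of VMs in each cluster.
    threshold: hypothetical similarity cutoff, not taken from the paper.
    """
    # Similar periods: keep the current cluster layout unchanged.
    if cosine_similarity(prev_load, curr_load) >= threshold:
        return list(vm_counts)

    total_vms = sum(vm_counts)
    total_load = sum(curr_load)
    if total_load == 0:
        return list(vm_counts)

    # Dissimilar periods: redistribute VMs in proportion to the latest load,
    # using largest-remainder rounding so the total VM count is preserved.
    shares = [total_vms * load / total_load for load in curr_load]
    counts = [int(s) for s in shares]
    leftovers = total_vms - sum(counts)
    by_remainder = sorted(
        range(len(shares)), key=lambda i: shares[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftovers]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Loads barely change: the layout is kept.
    print(plan_next_period([10, 20, 30], [11, 19, 31], [2, 4, 6]))
    # Loads reverse: VMs move toward the busier cluster.
    print(plan_next_period([10, 20, 30], [30, 20, 10], [2, 4, 6]))
```

Keeping the layout when consecutive periods look alike avoids unnecessary VM transfers, which is one plausible reading of why a similarity test would precede load balancing. The paper's actual criteria may differ.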

Author information

Corresponding author

Correspondence to Maziyar Grami.

Cite this article

Grami, M. An energy-aware scheduling of dynamic workflows using big data similarity statistical analysis in cloud computing. J Supercomput 78, 4261–4289 (2022). https://doi.org/10.1007/s11227-021-04016-8

  • DOI: https://doi.org/10.1007/s11227-021-04016-8
