Abstract
With the rapid development of information technology, the volume of data generated on the Internet has exploded in recent years. This growth has sharply increased the energy consumed by data processing, especially in real-time processing frameworks for big data. In this study, two energy-efficient scheduling algorithms are proposed to reduce the energy consumption of streaming applications in Storm. First, an energy consumption model is designed for the Storm framework. This model is then introduced into Storm through an energy consumption monitoring module. The first proposed algorithm minimizes the energy consumption of the processing tasks by consolidating the tasks onto low-energy-consumption nodes. The second proposed algorithm trades off load balance against the energy consumption of the Storm cluster by sorting the low-energy-consumption nodes by Slot utilization and assigning tasks preferentially to the nodes with low Slot utilization. Tested on the HiBench workload, the proposed algorithms reduce the total energy consumption of the Storm cluster by up to 32% compared with traditional scheduling algorithms. The results show that the proposed scheduling algorithms can effectively reduce the total energy consumption of the Storm cluster while satisfying the deadline constraints.
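The two placement ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Node` class, the per-slot `power` figure, the `power_cap` threshold used to decide which nodes count as low energy, and the tie-breaking rule are all assumptions made for illustration, and the energy model, monitoring module, and deadline handling are omitted.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical worker node; all fields are illustrative assumptions."""
    name: str
    power: float   # assumed energy cost per occupied slot
    slots: int     # total slots on the node (assumed > 0)
    used: int = 0  # slots currently occupied

    @property
    def utilization(self) -> float:
        return self.used / self.slots


def consolidate(nodes, n_tasks):
    """Algorithm-1-style idea: fill the lowest-power nodes first."""
    placement = []
    for node in sorted(nodes, key=lambda n: n.power):
        while node.used < node.slots and len(placement) < n_tasks:
            node.used += 1
            placement.append(node.name)
    if len(placement) < n_tasks:
        raise ValueError("not enough free slots")
    return placement


def balance_low_energy(nodes, n_tasks, power_cap):
    """Algorithm-2-style idea: among nodes at or below power_cap,
    give each task to the node with the lowest slot utilization."""
    low_energy = [n for n in nodes if n.power <= power_cap]
    placement = []
    for _ in range(n_tasks):
        free = [n for n in low_energy if n.used < n.slots]
        if not free:
            raise ValueError("not enough free slots on low-energy nodes")
        # Ties on utilization are broken by lower power (an assumption).
        target = min(free, key=lambda n: (n.utilization, n.power))
        target.used += 1
        placement.append(target.name)
    return placement


def demo_cluster():
    # Made-up cluster used only to exercise the two functions.
    return [Node("a", 1.0, 2), Node("b", 2.0, 2), Node("c", 5.0, 4)]
```

On the made-up `demo_cluster()`, the first function packs tasks onto node `a` before touching `b`, while the second alternates between `a` and `b` and never uses the high-power node `c` when `power_cap` is 2.0. This mirrors the trade-off described above: consolidation favors energy alone, whereas utilization-ordered assignment spreads load within the low-energy subset.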
Acknowledgements
This work was supported by the Chongqing Science and Technology Commission Project (Grant Nos. cstc2017jcyjAX0142 and cstc2018jcyjAX0525) and the Key Research and Development Projects of the Sichuan Science and Technology Department (Grant No. 2019YFG0107).
About this article
Cite this article
Li, H., Dai, H., Liu, Z. et al. Dynamic energy-efficient scheduling for streaming applications in storm. Computing 104, 413–432 (2022). https://doi.org/10.1007/s00607-021-00961-7
DOI: https://doi.org/10.1007/s00607-021-00961-7