Abstract
Load imbalance severely degrades cluster performance, and the polarization of resources caused by load skew further worsens system throughput and latency. The proliferation of tasks in the big data era makes load skew more severe, so coping with surges of skewed data streams has become a new challenge. In this paper, we propose a coordinated load balancing strategy over skewed data streams (referred to as St-Stream), a two-tier hierarchical system for handling data streams. At the task allocation stage, the strategy applies migration pairing to resources: tasks on high-load nodes are cut and moved out in a hierarchical manner, the moved-out operators are placed in a routing table, and the operators in the routing table are then moved sequentially to low-load nodes according to the tasks those nodes require. We further design a two-tier coordination scheme for the resource allocation problem, which first adjusts the skewed load within nodes and then dynamically restores the balance between nodes. We implemented St-Stream on Apache Storm; compared to the baseline design, it achieves 21% coordination in processing CPU utilization, a 17.6% reduction in latency, and a 0.3 improvement in load balance recovery. Our experimental results demonstrate that the proposed load balancing strategy better balances the cluster load and improves the performance of the stream processing system.
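The cut-and-place flow described in the abstract can be illustrated with a minimal sketch. This is not the St-Stream implementation: the `Node` class, the `rebalance` function, the `high` and `low` thresholds, and the heaviest-first and least-loaded-first choices are all assumptions made for illustration, since the abstract does not specify the pairing rule or the intra-node tier.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # task id -> load contributed by that task (hypothetical units in [0, 1])
    tasks: dict = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.tasks.values())


def rebalance(nodes, high=0.8, low=0.5):
    """Move tasks from high-load nodes to low-load nodes via a routing table.

    `high` and `low` are assumed thresholds, not values from the paper.
    Returns a list of (task_id, origin_name, target_name) migrations.
    """
    routing_table = deque()  # entries: (task_id, load, origin node)

    # Step 1: cut tasks out of each overloaded node, heaviest first,
    # until the node is at or below the high threshold.
    for node in nodes:
        for task_id in sorted(node.tasks, key=node.tasks.get, reverse=True):
            if node.load <= high:
                break
            routing_table.append((task_id, node.tasks.pop(task_id), node))

    # Step 2: drain the routing table in order, pairing each task with the
    # least-loaded node that is below `low` and can absorb it.
    moves = []
    while routing_table:
        task_id, load, origin = routing_table.popleft()
        candidates = [
            n for n in nodes
            if n is not origin and n.load < low and n.load + load <= high
        ]
        if not candidates:
            origin.tasks[task_id] = load  # nowhere to go: keep it in place
            continue
        target = min(candidates, key=lambda n: n.load)
        target.tasks[task_id] = load
        moves.append((task_id, origin.name, target.name))
    return moves


if __name__ == "__main__":
    cluster = [
        Node("A", {"t1": 0.4, "t2": 0.3, "t3": 0.25}),
        Node("B", {"t4": 0.2}),
        Node("C", {"t5": 0.1}),
    ]
    print(rebalance(cluster))
    print({n.name: round(n.load, 2) for n in cluster})
```

The sketch covers only the inter-node migration through a routing table. Returning an unplaceable task to its origin node is a simplification chosen here so that no task is dropped.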
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 61972364; the Fundamental Research Funds for the Central Universities under Grant No. 265QZ2021001; and the Melbourne-Chindia Cloud Computing (MC3) Research Network, Australia.
Author information
Contributions
DS contributed to methodology, validation, writing–original draft, investigation, and funding acquisition. MW contributed to conceptualization, methodology, validation, and writing–original draft. ZY contributed to validation, investigation, writing–review and editing. AS contributed to formal analysis, investigation, writing–review and editing. RB contributed to methodology, writing–review and editing, supervision, and funding acquisition.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Sun, D., Wu, M., Yang, Z. et al. A two-tier coordinated load balancing strategy over skewed data streams. J Supercomput 79, 21028–21056 (2023). https://doi.org/10.1007/s11227-023-05473-z
DOI: https://doi.org/10.1007/s11227-023-05473-z