A two-tier coordinated load balancing strategy over skewed data streams

The Journal of Supercomputing

Abstract

Load imbalance severely degrades cluster performance, and the polarization of resources caused by load skew further worsens system throughput and latency. The growing volume of tasks in the big data era makes load skew more severe, so coping with surging skewed data streams has become a new challenge. In this paper, we propose a coordinated load balancing strategy for skewed data streams (referred to as St-Stream), a two-tier hierarchical system for handling data streams. At the task allocation stage, the strategy applies a migration pairing scheme to resources: tasks on high-load nodes are cut and moved out hierarchically, the moved-out operators are placed in a routing table, and the operators in the routing table are then moved sequentially to low-load nodes according to the tasks those nodes require. We further design a two-tier coordination scheme for the resource allocation problem, which first adjusts the skewed load within nodes and then dynamically restores the balance between nodes. We implemented St-Stream on Apache Storm; compared with the baseline design, it achieves a 21% coordination in processing CPU utilization, a 17.6% reduction in latency, and a 0.3 improvement in load balance recovery. Our experimental results demonstrate that the proposed load balancing strategy better balances the cluster load and improves the performance of the stream processing system.
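The migration step described in the abstract (move tasks out of high-load nodes into a routing table, then hand them to low-load nodes) can be illustrated with a minimal sketch. This is not the St-Stream algorithm itself: the thresholds `high` and `low`, the per-task cost values, the largest-task-first cut order, and the function names `node_load` and `rebalance` are all assumptions made for illustration, and only the between-node tier is modeled.

```python
from collections import deque


def node_load(tasks):
    """Total load of a node, given its list of (task_id, cost) pairs."""
    return sum(cost for _, cost in tasks)


def rebalance(nodes, high, low):
    """Move tasks from nodes above `high` to nodes below `low`.

    nodes: dict mapping node name -> list of (task_id, cost) pairs.
    `high` and `low` are hypothetical load thresholds.
    Returns the tasks that could not be placed on any node.
    """
    routing_table = deque()  # stand-in for the routing table of moved-out tasks

    # Step 1: cut tasks out of each high-load node, most expensive first,
    # until the node is at or below the high threshold.
    for name in sorted(nodes):
        tasks = nodes[name]
        tasks.sort(key=lambda t: t[1])  # ascending cost
        while node_load(tasks) > high and len(tasks) > 1:
            routing_table.append(tasks.pop())

    # Step 2: place queued tasks, in order, on the least-loaded low-load
    # node that stays at or below the high threshold after the move.
    unplaced = []
    while routing_table:
        task = routing_table.popleft()
        candidates = [
            n for n in sorted(nodes)
            if node_load(nodes[n]) < low
            and node_load(nodes[n]) + task[1] <= high
        ]
        if candidates:
            target = min(candidates, key=lambda n: node_load(nodes[n]))
            nodes[target].append(task)
        else:
            unplaced.append(task)
    return unplaced


# Example with made-up costs: node "a" starts overloaded.
cluster = {
    "a": [("t1", 50), ("t2", 40), ("t3", 30)],
    "b": [("t4", 10)],
    "c": [("t5", 20)],
}
leftover = rebalance(cluster, high=80, low=40)
print({name: node_load(tasks) for name, tasks in cluster.items()}, leftover)
```

A queue is used for the routing table so that moved-out tasks are reassigned in the order they were cut, which matches the sequential hand-off the abstract describes. Tasks that fit nowhere are returned rather than forced onto a node.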

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61972364; the Fundamental Research Funds for the Central Universities under Grant No. 265QZ2021001; Melbourne-Chindia Cloud Computing (MC3) Research Network, Australia.

Author information

Contributions

DS contributed to methodology, validation, writing–original draft, investigation, and funding acquisition. MW contributed to conceptualization, methodology, validation, and writing–original draft. ZY contributed to validation, investigation, writing–review and editing. AS contributed to formal analysis, investigation, writing–review and editing. RB contributed to methodology, writing–review and editing, supervision, and funding acquisition.

Corresponding author

Correspondence to Dawei Sun.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, D., Wu, M., Yang, Z. et al. A two-tier coordinated load balancing strategy over skewed data streams. J Supercomput 79, 21028–21056 (2023). https://doi.org/10.1007/s11227-023-05473-z

  • DOI: https://doi.org/10.1007/s11227-023-05473-z
