An elastic and traffic-aware scheduler for distributed data stream processing in heterogeneous clusters

The Journal of Supercomputing

Abstract

Existing Data Stream Processing (DSP) systems perform poorly under heavy workloads, particularly on clusters of heterogeneous computers. Elasticity and changes to the application's degree of parallelism can limit the performance degradation caused by varying workloads, which negatively impact the overall application response time. Elasticity can be achieved by operator scaling, i.e., by replicating and relocating operators at runtime. However, making scaling decisions at runtime is challenging: scaling first increases the overall communication overhead between operators, and second alters the initial schedule, which can lead to a non-optimal scheduling plan. In this paper, we investigate the problem of elasticity and scaling decisions and propose a DSP system called ER-Storm. To curb communication overhead, we propose a new 3-step mechanism for replicating and relocating operators upon detecting a bottleneck operator that overutilizes a worker node. The other challenge is to select the proper worker nodes to host relocated operators. By discretizing the input workload, we model the relocation of operators between worker nodes at runtime as a scalable Markov Decision Process (MDP) and use a model-free reinforcement learning method (Q-Learning) to find optimal solutions. We have implemented our proposals on Apache Storm version 2.1.0. Our experimental results show that ER-Storm reduces the average topology response time by 20–60 percent, depending on the input workload rate (low or high), compared to the R-Storm scheduler and the Online-Scheduler of Storm.
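To make the Q-Learning idea in the abstract concrete, the following is a minimal sketch of tabular Q-learning over a discretized workload. It is not ER-Storm's implementation: the rate thresholds, the node names "fast" and "slow", the response-time table, the reward (negative response time), and the names `discretize`, `QPlacement`, and `train` are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def discretize(rate, thresholds=(100, 500)):
    """Map an input rate (tuples/s) to a workload level 0..len(thresholds).
    The thresholds are hypothetical."""
    return sum(rate >= t for t in thresholds)

class QPlacement:
    """Tabular Q-learning: state = workload level, action = worker node."""

    def __init__(self, nodes, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.nodes = list(nodes)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, node) -> estimated value
        self.rng = random.Random(seed)

    def greedy(self, state):
        """Node with the highest current Q-value for this state."""
        return max(self.nodes, key=lambda n: self.q[(state, n)])

    def choose(self, state):
        """Epsilon-greedy selection: explore occasionally, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.nodes)
        return self.greedy(state)

    def update(self, state, node, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max_a Q(s', a)."""
        best_next = max(self.q[(next_state, n)] for n in self.nodes)
        td_target = reward + self.gamma * best_next
        self.q[(state, node)] += self.alpha * (td_target - self.q[(state, node)])

# Hypothetical response times (ms) per workload level on two heterogeneous nodes.
RESPONSE_MS = {"fast": (5.0, 8.0, 12.0), "slow": (6.0, 20.0, 60.0)}

def train(episodes=2000, seed=0):
    """Train on a toy environment where the workload level changes at random."""
    agent = QPlacement(RESPONSE_MS, seed=seed)
    rng = random.Random(seed)
    state = discretize(rng.uniform(0, 1000))
    for _ in range(episodes):
        node = agent.choose(state)
        reward = -RESPONSE_MS[node][state]  # lower response time is better
        next_state = discretize(rng.uniform(0, 1000))
        agent.update(state, node, reward, next_state)
        state = next_state
    return agent
```

After `agent = train()`, `agent.greedy(level)` returns the node the learned table prefers for a given workload level. The approach is model-free in the sense that the agent never reads `RESPONSE_MS` as a model; it only observes the reward of each placement it tries.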

Author information

Corresponding author

Correspondence to Mohsen Sharifi.

Appendix: The process of scheduling in ER-Storm

See Appendix Fig. 19.

Fig. 19 The workflow of scheduling of ER-Storm

About this article

Cite this article

Hadian, H., Farrokh, M., Sharifi, M. et al. An elastic and traffic-aware scheduler for distributed data stream processing in heterogeneous clusters. J Supercomput 79, 461–498 (2023). https://doi.org/10.1007/s11227-022-04669-z

  • DOI: https://doi.org/10.1007/s11227-022-04669-z
