A Topology-Aware Scheduling Strategy for Distributed Stream Computing System

  • Conference paper
Broadband Communications, Networks, and Systems (BROADNETS 2021)

Abstract

Reducing latency has become a central concern of task scheduling research in distributed big data stream computing systems. Most current task schedulers in these systems focus on task assignment and ignore the task topology, which can have a significant impact on latency and energy efficiency. This paper proposes a topology-aware scheduling strategy to reduce the processing latency of stream processing systems. We model the data stream graph as a directed acyclic graph and then partition it using the graph Laplace algorithm. Tasks are then assigned on the partitioned graph with a low-latency scheduling strategy. We also provide a computing node selection strategy that enables the system to run the tasks of a topology on the smallest number of computing nodes. Based on this scheduling strategy, the tasks of the data stream graph can be redistributed and the scheduling mechanism optimized to minimize system latency. The experimental results demonstrate the efficiency and effectiveness of the proposed strategy.
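The abstract does not spell out the partitioning step, so the following is only a minimal sketch of one common graph Laplacian technique, spectral bisection by the sign of the Fiedler vector, and should not be read as the authors' algorithm. The function name `spectral_bisect`, the `traffic` dictionary of per-edge tuple rates, and the example weights are all hypothetical. The sketch assumes a connected task graph and uses only the Python standard library.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (upstream task, downstream task) in the DAG


def spectral_bisect(n: int, traffic: Dict[Edge, float],
                    iters: int = 300) -> Tuple[Set[int], Set[int]]:
    """Split tasks 0..n-1 into two groups by the sign of the Fiedler vector.

    Hypothetical helper for illustration; not the paper's algorithm.
    """
    if n < 2:
        return set(range(n)), set()

    # Edge direction does not matter when measuring cut traffic, so the
    # weighted adjacency matrix W is symmetrized.
    w: List[List[float]] = [[0.0] * n for _ in range(n)]
    for (u, v), rate in traffic.items():
        w[u][v] += rate
        w[v][u] += rate
    deg = [sum(row) for row in w]

    # Laplacian L = D - W has eigenvalues in [0, 2 * max(deg)]. After the
    # constant vector is projected out, power iteration on
    # M = shift * I - L converges to the eigenvector of the second-smallest
    # eigenvalue of L (the Fiedler vector).
    shift = 2.0 * max(deg)

    x = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant component
        # y = M x = (shift - deg) * x + W x
        y = [(shift - deg[i]) * x[i] + sum(w[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]

    left = {i for i in range(n) if x[i] < 0.0}
    right = set(range(n)) - left
    return left, right


# Hypothetical topology: two tightly coupled task groups joined by one
# low-traffic edge (2 -> 3).
demo_traffic: Dict[Edge, float] = {
    (0, 1): 10.0, (0, 2): 10.0, (1, 2): 10.0,
    (2, 3): 1.0,
    (3, 4): 10.0, (3, 5): 10.0, (4, 5): 10.0,
}
left, right = spectral_bisect(6, demo_traffic)
print(sorted(left), sorted(right))
```

On this example the split separates tasks {0, 1, 2} from {3, 4, 5}, so only the low-traffic edge crosses the cut. Placing each group on its own computing node would then keep the heavy communication local, which is the intuition behind a topology-aware assignment.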

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61972364, the Fundamental Research Funds for the Central Universities under Grant No. 2652021001, and the Melbourne-Chindia Cloud Computing (MC3) Research Network.

Author information

Corresponding author

Correspondence to Dawei Sun.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Li, B., Sun, D., Chau, V.L., Buyya, R. (2022). A Topology-Aware Scheduling Strategy for Distributed Stream Computing System. In: Xiang, W., Han, F., Phan, T.K. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 413. Springer, Cham. https://doi.org/10.1007/978-3-030-93479-8_8

  • DOI: https://doi.org/10.1007/978-3-030-93479-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93478-1

  • Online ISBN: 978-3-030-93479-8

  • eBook Packages: Computer Science, Computer Science (R0)
