Abstract
In this paper, we propose duality-based locality-aware stream partitioning (LSP) for distributed stream processing engines (DSPEs). Existing LSP generally reuses the locality concept of distributed batch processing engines (DBPEs) directly. This concept does not fully account for the characteristics of DSPEs and therefore does not maximize cluster resource utilization. To solve this problem, we first explain the limitations of existing LSP and then propose a duality relationship between DBPEs and DSPEs. Finally, we propose a simple but efficient ping-based mechanism that maximizes the locality of DSPEs based on this duality. The insights uncovered in this paper can maximize throughput and minimize latency in stream partitioning.
Notes
1. Apache Storm describes stream partitioning as a grouping method.
Acknowledgements
This research was supported by Korea Electric Power Corporation (Grant number: R18XA05).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Son, S., Moon, YS. (2020). Duality-Based Locality-Aware Stream Partitioning in Distributed Stream Processing Engines. In: Schwardmann, U., et al. Euro-Par 2019: Parallel Processing Workshops. Euro-Par 2019. Lecture Notes in Computer Science, vol 11997. Springer, Cham. https://doi.org/10.1007/978-3-030-48340-1_57
DOI: https://doi.org/10.1007/978-3-030-48340-1_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48339-5
Online ISBN: 978-3-030-48340-1
eBook Packages: Computer Science, Computer Science (R0)