Duality-Based Locality-Aware Stream Partitioning in Distributed Stream Processing Engines

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11997)

Abstract

In this paper, we propose duality-based locality-aware stream partitioning (LSP) in distributed stream processing engines (DSPEs). Existing LSP generally adopts the locality concept of distributed batch processing engines (DBPEs) as is. That concept does not fully account for the characteristics of DSPEs and therefore does not maximize cluster resource utilization. To solve this problem, we first explain the limitations of existing LSP and then propose a duality relationship between DBPEs and DSPEs. Finally, based on this duality, we propose a simple but efficient ping-based mechanism to maximize the locality of DSPEs. The insights uncovered in this paper can maximize the throughput and minimize the latency of stream partitioning.
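
The abstract names a ping-based mechanism but does not describe it, so the following is only a minimal sketch of one way such a mechanism could look, not the authors' method. It assumes that a sender probes each downstream task and routes tuples to the task with the lowest measured round-trip time. The class name `PingBasedPartitioner`, the injected `ping` callable, the task names, and the latency values are all hypothetical.

```python
from typing import Callable, Dict, List


class PingBasedPartitioner:
    """Hypothetical sketch: route tuples to the downstream task with the lowest ping."""

    def __init__(self, tasks: List[str], ping: Callable[[str], float]):
        if not tasks:
            raise ValueError("at least one downstream task is required")
        self.tasks = list(tasks)
        # Assumed probe: returns the round-trip time to a task in milliseconds.
        self.ping = ping
        self.rtt_ms: Dict[str, float] = {}
        self.refresh()

    def refresh(self) -> None:
        """Re-measure the round-trip time to every downstream task."""
        self.rtt_ms = {task: self.ping(task) for task in self.tasks}

    def choose(self) -> str:
        """Return the task with the smallest last-measured round-trip time."""
        return min(self.tasks, key=lambda task: self.rtt_ms[task])


# Made-up latencies standing in for real network probes.
fake_rtt = {"task-local": 0.05, "task-same-rack": 0.4, "task-remote": 1.8}
partitioner = PingBasedPartitioner(list(fake_rtt), fake_rtt.__getitem__)
chosen = partitioner.choose()
```

Calling `refresh()` periodically would let the choice follow changes in measured latency. A real partitioner would also need to balance load across tasks, which this sketch deliberately ignores.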

Notes

  1. Apache Storm describes stream partitioning as a grouping method.

References

  1. Apache Hadoop. https://hadoop.apache.org/

  2. Ibrahim, S., et al.: LEEN: locality/fairness-aware key partitioning for MapReduce in the cloud. In: Proceedings of the 2nd IEEE International Conference on Cloud Computing Technology and Science, Indianapolis, IN, pp. 17–24, November 2010

  3. Wang, W., et al.: MapTask scheduling in MapReduce with data locality: throughput and heavy-traffic optimality. IEEE/ACM Trans. Netw. 24(1), 190–203 (2016)

  4. Zaharia, M., et al.: Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. In: Proceedings of the 5th European Conference on Computer Systems, Paris, France, pp. 265–278, April 2010

  5. Bu, X., et al.: Interference and locality-aware task scheduling for MapReduce applications in virtual clusters. In: Proceedings of the 22nd International Symposium on High-performance Parallel and Distributed Computing, New York, NY, pp. 227–238, June 2013

  6. Apache Storm Concepts. http://storm.apache.org/releases/1.2.2/Concepts.html

  7. Caneill, M., et al.: Locality-aware routing in stateful streaming applications. In: Proceedings of the 17th International Middleware Conference, Trento, Italy, pp. 4:1–4:13, December 2016

  8. Son, S., et al.: Locality aware traffic distribution in Apache Storm for energy analytics platform. In: Proceedings of the 6th IEEE International Conference on Big Data and Smart Computing, Shanghai, China, pp. 721–724, January 2018

  9. Toshniwal, A., et al.: Storm @Twitter. In: Proceedings of the International Conference on Management of Data, Snowbird, UT, pp. 147–156, June 2014

  10. Nasir, M., et al.: The power of both choices: practical load balancing for distributed stream processing engines. In: Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE), Seoul, South Korea, pp. 137–148, 13–17 April 2015

Acknowledgements

This research was supported by Korea Electric Power Corporation (Grant number: R18XA05).

Author information

Corresponding author

Correspondence to Yang-Sae Moon.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Son, S., Moon, YS. (2020). Duality-Based Locality-Aware Stream Partitioning in Distributed Stream Processing Engines. In: Schwardmann, U., et al. Euro-Par 2019: Parallel Processing Workshops. Euro-Par 2019. Lecture Notes in Computer Science, vol 11997. Springer, Cham. https://doi.org/10.1007/978-3-030-48340-1_57

  • DOI: https://doi.org/10.1007/978-3-030-48340-1_57

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-48339-5

  • Online ISBN: 978-3-030-48340-1

  • eBook Packages: Computer Science, Computer Science (R0)
