Abstract
Stream join is a fundamental operation in many real-world applications. Because of the complexity of the join operation and the inherent characteristics of streaming data (e.g., skewed distribution and dynamics), adaptivity and load balancing remain pressing problems despite extensive research. In this paper, we present AdaptMX, an enhanced adaptive join-matrix system for stream theta-join that combines the key-based and tuple-based join approaches: (i) at the outer level, it modifies the well-known join-matrix model to allocate resources on demand, improving the adaptivity of the tuple-based partitioning scheme; (ii) at the inner level, it adopts a key-based routing policy among grouped processing tasks to maintain the join semantics, together with cost-effective load-balancing strategies to remove stragglers. For the demonstration, we present a transparent view of distributed stream theta-join processing and compare the performance of AdaptMX with other baselines, showing 3\(\times \) higher throughput.
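For readers unfamiliar with the join-matrix model mentioned above, the following is a minimal single-process sketch of the classic tuple-based scheme, not of AdaptMX itself. It assumes a fixed grid of tasks: each tuple of stream R is sent to one random row and replicated across that row, and each tuple of stream S is sent to one random column and replicated down that column, so every (r, s) pair meets in exactly one cell and any theta predicate can be evaluated there. The class name `JoinMatrix` and its methods are hypothetical, and the on-demand resizing and key-based inner routing described in the abstract are deliberately left out.

```python
import random
from itertools import product


class JoinMatrix:
    """Illustrative rows x cols grid of join tasks (hypothetical helper, not the AdaptMX API)."""

    def __init__(self, rows, cols, predicate, seed=0):
        self.rows, self.cols = rows, cols
        self.predicate = predicate          # theta condition: predicate(r, s) -> bool
        self.rng = random.Random(seed)      # seeded only to make the sketch repeatable
        # Each cell keeps the R tuples and S tuples it has stored so far.
        self.cells = {cell: ([], []) for cell in product(range(rows), range(cols))}

    def insert_r(self, r):
        """Route an R tuple to one random row; probe and store in every cell of that row."""
        i = self.rng.randrange(self.rows)
        results = []
        for j in range(self.cols):
            r_store, s_store = self.cells[(i, j)]
            results.extend((r, s) for s in s_store if self.predicate(r, s))
            r_store.append(r)
        return results

    def insert_s(self, s):
        """Route an S tuple to one random column; probe and store in every cell of that column."""
        j = self.rng.randrange(self.cols)
        results = []
        for i in range(self.rows):
            r_store, s_store = self.cells[(i, j)]
            results.extend((r, s) for r in r_store if self.predicate(r, s))
            s_store.append(s)
        return results
```

Because a row and a column intersect in exactly one cell, each matching pair is emitted once, by whichever tuple arrives later. The cost is replication: every R tuple is stored `cols` times and every S tuple `rows` times, which is why the shape of the matrix matters when input rates change.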
Notes
Web link of the demonstration: https://github.com/CJECNU/AdaptMX.
Acknowledgements
The work is partially supported by the Key Program of National Natural Science Foundation of China (Grant No. 61672233, No. 61572194 and No. 61702113).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Wang, X., Jiang, C., Fang, J., Wang, X., Zhang, R. (2018). AdaptMX: Flexible Join-Matrix Streaming System for Distributed Theta-Joins. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_52
DOI: https://doi.org/10.1007/978-3-319-91458-9_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science (R0)