Distributed top-k similarity query on big trajectory streams

Zhang, Zhigang; Qi, Xiaodong; Wang, Yilin; Jin, Cheqing; Mao, Jiali; Zhou, Aoying

doi:10.1007/s11704-018-7234-6

Research Article
Published: 06 November 2018

Volume 13, pages 647–664, (2019)

Frontiers of Computer Science

Zhigang Zhang¹,
Xiaodong Qi¹,
Yilin Wang¹,
Cheqing Jin¹,
Jiali Mao^1,2 &
…
Aoying Zhou¹

51 Accesses
5 Citations

Abstract

With the popularity of smartphones and other mobile devices, big trajectory data streams are now generated in distributed environments. The distributed top-k similarity query, which finds the k trajectories most similar to a given query trajectory across all remote sites, is critical in this field. The key challenge in such a query is reducing the communication cost, because network bandwidth is limited. The query can be solved by sending the query trajectory to all the remote sites, where the pairwise similarities are computed precisely. However, the overall cost, O(n · m), is huge when n or m is large, where n is the size of the query trajectory and m is the number of remote sites. Fortunately, there are cheap ways to estimate pairwise similarity, which allow some trajectories to be filtered out in advance without precise computation. To overcome this challenge, we devise two general frameworks into which concrete distance measures can be plugged. The former uses two bounds (an upper and a lower bound), while the latter uses only the lower bound. Moreover, we introduce detailed implementations for two representative distance measures, Euclidean and DTW distance, after deriving the lower and upper bounds for the former framework and the lower bound for the latter. Theoretical analysis and extensive experiments on real-world datasets evaluate the efficiency of the proposed methods.
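The idea of filtering trajectories with a cheap estimate before any precise computation can be illustrated with a minimal single-process sketch. It is not the frameworks described in the abstract: it swaps in a generic piecewise-average lower bound on Euclidean distance, and the names `paa`, `lower_bound`, and `top_k` are hypothetical. It assumes every trajectory is a list of (x, y) points of the same length, and that this length is divisible by `segments`.

```python
import heapq
import math


def paa(traj, segments):
    """Coarse summary: mean (x, y) of each equal-length piece (assumed helper)."""
    w = len(traj) // segments
    return [
        (sum(p[0] for p in traj[i * w:(i + 1) * w]) / w,
         sum(p[1] for p in traj[i * w:(i + 1) * w]) / w)
        for i in range(segments)
    ]


def euclidean(a, b):
    """Exact point-wise Euclidean distance between equal-length trajectories."""
    return math.sqrt(sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                         for p, q in zip(a, b)))


def lower_bound(sum_a, sum_b, w):
    """Distance between summaries, scaled by segment width w.

    It never exceeds the exact distance, because within each segment the
    sum of squared differences is at least w times the squared mean difference.
    """
    return math.sqrt(w * sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                             for p, q in zip(sum_a, sum_b)))


def top_k(query, candidates, k, segments):
    """Return ([(distance, index), ...], number of exact computations)."""
    w = len(query) // segments
    q_sum = paa(query, segments)
    # Cheap step: one lower bound per candidate, visited in ascending order.
    bounds = sorted((lower_bound(q_sum, paa(c, segments), w), i)
                    for i, c in enumerate(candidates))
    best = []   # max-heap of the k best so far, stored as (-distance, index)
    exact = 0
    for lb, i in bounds:
        # Once k results exist and the bound reaches the k-th best exact
        # distance, no remaining candidate can improve the answer.
        if len(best) == k and lb >= -best[0][0]:
            break
        d = euclidean(query, candidates[i])
        exact += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), exact
```

In a distributed setting, the summary of the query (here `segments` points instead of n) would be what travels to the remote sites first, so the second return value hints at how many full trajectories would still need to be compared precisely.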

Acknowledgements

Our research is supported by the National Key Research and Development Program of China (2016YFB1000905), NSFC (61370101, 61532021, U1501252, U1401256 and 61402180), and the Shanghai Knowledge Service Platform Project (ZF1213).

Author information

Authors and Affiliations

School of Data Science and Engineering, East China Normal University, Shanghai, 200062, China
Zhigang Zhang, Xiaodong Qi, Yilin Wang, Cheqing Jin, Jiali Mao & Aoying Zhou
Computer School, China West Normal University, Nanchong, 637009, China
Jiali Mao

Corresponding author

Correspondence to Jiali Mao.

Additional information

Zhigang Zhang is currently working toward the PhD degree at the School of Data Science and Engineering, East China Normal University, China. His research interests include location-based services, spatiotemporal data management and distributed computing.

Xiaodong Qi is currently working toward the PhD degree at the School of Data Science and Engineering, East China Normal University, China. His research interests include scientific data management and blockchain.

Yilin Wang received her Bachelor's degree in Computer Science and Technology from Northwestern Polytechnical University, China, in 2015. She is a graduate student in the School of Software Engineering, East China Normal University. Her current research interests include data mining and location-based services.

Cheqing Jin is a professor of computer science at East China Normal University, China. He received the Excellent Young Teacher Award from the Fok Ying Tung Education Foundation. His main research interests include streaming data management, location-based services, uncertain data management, data quality, and database benchmarking.

Jiali Mao is an associate professor at China West Normal University, China. She is currently working toward the PhD degree in the School of Data Science and Engineering, East China Normal University, China. Her current research interests include big data analysis and location-based services.

Aoying Zhou is a professor of computer science at East China Normal University, China, as well as the Dean of the School of Data Science and Engineering (DaSE). His research interests include Web data management, data management for data-intensive computing, in-memory cluster computing, benchmarking for big data and performance.

Electronic supplementary material

Supplementary material, approximately 297 KB.

About this article

Cite this article

Zhang, Z., Qi, X., Wang, Y. et al. Distributed top-k similarity query on big trajectory streams. Front. Comput. Sci. 13, 647–664 (2019). https://doi.org/10.1007/s11704-018-7234-6

Received: 03 July 2017
Accepted: 21 December 2017
Published: 06 November 2018
Issue Date: June 2019
DOI: https://doi.org/10.1007/s11704-018-7234-6