Indexable sub-trajectory matching using multi-segment approximation: a partition-and-stitch framework

Yoo, Jae-Jun; Loh, Woong-Kee; Whang, Kyu-Young

doi:10.1007/s11227-019-02813-w


Published: 15 March 2019

The Journal of Supercomputing, Volume 75, pages 6129–6157 (2019)

Abstract

With advances in base technologies for moving objects, many studies have been conducted on constructing databases of moving-object trajectories and on the diverse applications related to those trajectories. Most previous studies deal with whole trajectory matching, which finds the trajectories T in the database that are similar to a given query trajectory Q ‘as a whole.’ However, we often want to find those T that contain sub-trajectories $T_{\mathrm{sub}} \, (\subseteq T)$ similar to Q. This problem, known as sub-trajectory matching, is more complicated than whole trajectory matching because the query trajectory Q can be of any length and the matching sub-trajectories $T_{\mathrm{sub}}$ can occur at any position in the data trajectories T. In this paper, we present a novel indexing-based sub-trajectory matching algorithm using multi-segment approximation. Our algorithm partitions each data trajectory into multiple component segments and stores the individual segments in an index. The query trajectory is also partitioned into its component segments, and the search for segments similar to each query segment is performed efficiently using the index. The sub-trajectories similar to the query trajectory are then reconstructed by our ‘stitching’ algorithm from the individual segments retrieved from the index. Our stitching algorithm is novel in that it makes segment-wise partitioning and indexing of data trajectories possible. Without stitching, only trajectory-wise operations would be affordable, which would cause severe storage space overhead and degrade search performance. Our study is the first that uses indexing in sub-trajectory matching. We define a (multi-segment) trajectory similarity measure that uses the Hausdorff distance to extend a widely used single-segment similarity measure proposed by Lee et al. (in: Proceedings of ACM SIGMOD international conference on management of data (SIGMOD), 2007; in: Proceedings of IEEE international conference on data engineering (ICDE), 2008; Proc VLDB Endow (PVLDB) 1(1):1081–1094, 2008). We perform extensive experiments to compare our method with EDS (Xie, in: Proceedings of ACM SIGMOD international conference on management of data (SIGMOD), 2014), which has been shown to outperform all representative point-based measures in terms of accuracy and performance. The accuracy of our similarity measure is better than that of EDS by up to 52.0%, and our algorithm outperforms the one using EDS by a factor of up to 22,543. The performance of our algorithm scales linearly with the size of the database, which is an essential property for handling large-scale databases.
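
The partition, index, search, and stitch steps described in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the partitioning (one segment per pair of consecutive points), the flat list standing in for a spatial index, the one-to-one alignment used for stitching, and all function names are assumptions made for illustration. The per-segment distance is the Hausdorff distance between two line segments, which is attained at one of the four endpoints.

```python
import math
from collections import defaultdict

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection parameter so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def segment_hausdorff(s, t):
    """Hausdorff distance between two segments (maximum over the four endpoints)."""
    return max(point_segment_distance(s[0], *t), point_segment_distance(s[1], *t),
               point_segment_distance(t[0], *s), point_segment_distance(t[1], *s))

def partition(points):
    """Toy partitioning (assumption): consecutive points form one segment."""
    return [(points[k], points[k + 1]) for k in range(len(points) - 1)]

def build_index(trajectories):
    """Flat list of (trajectory id, segment position, segment).
    A linear scan over this list stands in for a real spatial index."""
    return [(tid, pos, seg)
            for tid, pts in trajectories.items()
            for pos, seg in enumerate(partition(pts))]

def search(index, query_points, eps):
    """Return (trajectory id, first segment, last segment) for every run of
    consecutive data segments that matches the query segment by segment."""
    query_segments = partition(query_points)
    # Step 1: for each query segment, retrieve the data segments within eps.
    hits = defaultdict(set)  # (tid, start position) -> matched query positions
    for qpos, qseg in enumerate(query_segments):
        for tid, pos, seg in index:
            if pos >= qpos and segment_hausdorff(qseg, seg) <= eps:
                hits[(tid, pos - qpos)].add(qpos)
    # Step 2 (toy stitching): keep alignments in which every query segment matched.
    n = len(query_segments)
    return sorted((tid, start, start + n - 1)
                  for (tid, start), matched in hits.items() if len(matched) == n)

trajectories = {"T1": [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)],
                "T2": [(0, 5), (1, 5), (2, 5)]}
query = [(1, 0.1), (2, 1.1), (3, 1.1)]
print(search(build_index(trajectories), query, eps=0.2))  # [('T1', 1, 2)]
```

In this example the query is a slightly shifted copy of the middle part of T1, so the two query segments match data segments 1 and 2 of T1 and are stitched into one sub-trajectory. A real system would replace the linear scan with an index lookup and would need a stitching rule that tolerates differing segment counts.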


Notes

Some arrows are omitted to avoid clutter.
http://research.microsoft.com/en-us/projects/geolife/.
http://www.nhc.noaa.gov/data/.
http://www.chorochronos.org/?q=node/5.

References

1. Alamri S, Taniar D, Safar M (2014) A taxonomy for moving object queries in spatial databases. Future Gener Comput Syst 37:232–242
2. Atev S, Miller G, Papanikolopoulos NP (2010) Clustering of vehicle trajectories. IEEE Trans Intell Transp Syst 11(3):647–657
3. Beckmann N, Seeger B (2009) A revised R*-tree in comparison with related index structures. In: Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD), pp 799–812
4. Buchin K, Buchin M, Kreveld MV, Luo J (2009) Finding long and similar parts of trajectories. In: Proceedings of ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (GIS), pp 296–305
5. Chen J, Leung MKH, Gao Y (2003) Noisy logo recognition using line segment Hausdorff distance. Pattern Recognit 36(4):943–955
6. Chen L, Ng R (2004) On the marriage of Lp-norms and edit distance. In: Proceedings of International Conference on Very Large Data Bases (VLDB), pp 792–803
7. Chen L, Ozsu MT, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD), pp 491–502
8. Ding X, Chen L, Gao Y, Jensen CS, Bao H (2018) UlTraMan: a unified platform for big trajectory data management and analytics. Proc VLDB Endow 11(7):787–799
9. Dong Y, Pi D (2018) Novel privacy-preserving algorithm based on frequent path for trajectory data publishing. Knowl Based Syst 148:55–65
10. Eberly DH (2006) 3D game engine design: a practical approach to real-time computer graphics, 2nd edn. Morgan Kaufmann, Burlington
11. Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 226–231
12. Frentzos E, Gratsias K, Theodoridis Y (2007) Index-based most similar trajectory search. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 816–825
13. Hung C-C, Peng W-C, Lee W-C (2015) Clustering and aggregating clues of trajectories for mining trajectory patterns and routes. VLDB J 24(2):169–192
14. Huttenlocher DP, Kedem K (1990) Computing the minimum Hausdorff distance for point sets under translation. In: Proceedings of ACM annual symposium on computational geometry (SCG), pp 340–349
15. Kaplan E, Gürsoy ME, Nergiz ME, Saygin Y (2018) Location disclosure risks of releasing trajectory distances. Data Knowl Eng 113:43–63
16. Lee J-G, Han J, Whang K-Y (2007) Trajectory clustering: a partition-and-group framework. In: Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD), pp 593–604
17. Lee J-G, Han J, Li X (2008) Trajectory outlier detection: a partition-and-detect framework. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 140–149
18. Lee J-G, Han J, Li X, Gonzalez H (2008) TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering. Proc VLDB Endow (PVLDB) 1(1):1081–1094
19. Mao J, Sun P, Jin C, Zhou A (2018) Outlier detection over distributed trajectory streams. In: Proceedings of SIAM International Conference on Data Mining (SDM), San Diego, pp 64–72
20. Mao Y, Zhong H, Xiao X, Li X (2017) A segment-based trajectory similarity measure in the urban transportation systems. Sensors 17(3):524
21. Nutanong S, Jacox EH, Samet H (2011) An incremental Hausdorff distance calculation algorithm. Proc VLDB Endow (PVLDB) 4(8):506–517
22. Pelekis N, Tampakis P, Vodas M, Doulkeridis C, Theodoridis Y (2017) On temporal-constrained sub-trajectory cluster analysis. Data Min Knowl Discov (DMKD) 31(5):1294–1330
23. Ranu S, Deepak P, Telang AD, Deshpande P, Raghavan S (2015) Indexing and matching trajectories under inconsistent sampling rates. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 999–1010
24. Shang Z, Li G, Bao Z (2018) DITA: distributed in-memory trajectory analytics. In: Proceedings of International Conference on Management of Data (SIGMOD), Houston, pp 725–740
25. Vlachos M, Kollios G, Gunopulos D (2002) Discovering similar multidimensional trajectories. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 673–684
26. Wolfson O, Xu B, Chamberlain S, Jiang L (1998) Moving objects databases: issues and solutions. In: Proceedings of IEEE International Conference on Scientific and Statistical Database Management, pp 111–122
27. Xie M (2014) EDS: a segment-based distance measure for sub-trajectory similarity search. In: Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD), pp 1609–1610
28. Xie D, Li F, Phillips JM (2017) Distributed trajectory similarity search. Proc VLDB Endow (PVLDB) 10(11):1478–1489
29. Yi B-K, Jagadish HV, Faloutsos C (1998) Efficient retrieval of similar time sequences under time warping. In: Proceedings of IEEE International Conference on Data Engineering (ICDE), pp 201–208
30. Yuan G, Sun P, Zhao J, Li D, Wang C (2017) A review of moving object trajectory clustering algorithms. Artif Intell Rev 47(1):123–144
31. Zheng Y, Zhang L, Xie X, Ma W-Y (2009) Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of International Conference on World Wide Web (WWW), pp 791–800
32. Zheng Y, Zhou X (eds) (2011) Computing with spatial trajectories. Springer, Berlin


Author information

Authors and Affiliations

School of Computing, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
Jae-Jun Yoo & Kyu-Young Whang
Department of Software, Gachon University, Seongnam, Republic of Korea
Woong-Kee Loh

Corresponding authors

Correspondence to Woong-Kee Loh or Kyu-Young Whang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT) (No. 2016R1A2B4015929).

Appendix

Lemma 5

For any two segments $L_i$ and $L_j$, the following always holds:

$$\begin{aligned} {\mathrm{dist}}(L_i, L_j) \ge d_{{\mathrm{seg}},0}(L_i, L_j), \end{aligned} \tag{11}$$

where ${\mathrm{dist}}(L_i, L_j) = w_\perp \cdot d_\perp (L_i, L_j) + w_\parallel \cdot d_\parallel (L_i, L_j) + w_\theta \cdot d_\theta (L_i, L_j)$ and $w_\perp = w_\parallel = w_\theta = 1$ [16].

Proof

Let ${\mathcal {L}}_i$ and ${\mathcal {L}}_j$ be the lines containing the segments $L_i$ and $L_j$, respectively. Let $p_s$ and $p_e$ be the projections of the endpoints $s_j$ and $e_j$ of $L_j$ onto ${\mathcal {L}}_i$, respectively. Without loss of generality, we assume $d(s_j, p_s) \le d(e_j, p_e)$. We also assume that $L_i$ is longer than $L_j$, as in [16]. We prove the lemma for the following three cases:

Case 1: $p_s$ is located on $L_i$.

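As shown in Fig. 17a, $d_{{\mathrm{seg}},0}(L_i, L_j) = l_{\perp 1}$. The second inequality below uses $l_{\perp 2} \ge l_{\perp 1}$, which corresponds to the assumption $d(s_j, p_s) \le d(e_j, p_e)$ above and gives $l_{\perp 2}^2 \ge l_{\perp 1} l_{\perp 2}$. Thus,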

$$\begin{aligned} {\mathrm{dist}}(L_i, L_j)&\ge {} d_\perp (L_i, L_j) = \frac{l_{\perp 1}^2 + l_{\perp 2}^2}{ l_{\perp 1} + l_{\perp 2}} \\&\ge {} \frac{l_{\perp 1}^2 + l_{\perp 1} l_{\perp 2}}{ l_{\perp 1} + l_{\perp 2}} = l_{\perp 1} \\&= {} d_{{\mathrm{seg}},0}(L_i, L_j). \end{aligned}$$

Case 2: $p_s$ is located behind $e_i$.

As shown in Fig. 17b, $d_{{\mathrm{seg}},0}(L_i, L_j) = d(e_i, s_j)$. Thus,

$$\begin{aligned} {\mathrm{dist}}(L_i, L_j)&\ge {} d_{\perp }(L_i, L_j) + d_\parallel (L_i, L_j) \\&\ge {} l_{\perp 1} + l_{\parallel 2} \ge d(e_i, s_j) \\&= {} d_{{\mathrm{seg}},0}(L_i, L_j). \end{aligned}$$

Case 3: $p_s$ is located in front of $s_i$.

Let $p_j$ be the projection of the endpoint $s_i$ of $L_i$ onto ${\mathcal {L}}_j$; then $d_{{\mathrm{seg}},0}(L_i, L_j) \le d(s_i, p_j)$. If $l_{\parallel 1}' \le l_{\parallel 1}''$, then $d_{\parallel }(L_i, L_j) = l_{\parallel 1}'$. Thus,

$$\begin{aligned} {\mathrm{dist}}(L_i, L_j)&\ge {} d_\perp (L_i, L_j) + d_\parallel (L_i, L_j) \ge l_{\perp 1} + l_{\parallel 1}' \\&\ge {} d(s_i, s_j) \ge d(s_i, p_j) \\&\ge {} d_{{\mathrm{seg}},0}(L_i, L_j). \end{aligned}$$

If it holds that $l_{\parallel 1}' \ge l_{\parallel 1}''$, then $d_\parallel (L_i, L_j) = l_{\parallel 1}''$. Thus,

$$\begin{aligned} {\mathrm{dist}}(L_i, L_j)&= {} d_\perp (L_i, L_j) + d_\parallel (L_i, L_j) + d_\theta (L_i, L_j) \\&\ge {} l_{\perp 1} + l_{\parallel 1}'' + d_\theta \ge d(s_i, e_j) \ge d(s_i, p_j) \\&\ge {} d_{{\mathrm{seg}},0}(L_i, L_j). \end{aligned}$$

Therefore, combining these three cases, it always holds that ${\mathrm{dist}}(L_i, L_j) \ge d_{{\mathrm{seg}},0}(L_i, L_j)$. $\square $
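
As an informal sanity check of Lemma 5, the inequality can be tested numerically on random 2-D segments. The sketch below rests on two assumptions that this excerpt does not confirm: that $d_{{\mathrm{seg}},0}$ denotes the minimum Euclidean distance between the two segments, and that $d_\perp$, $d_\parallel$, and $d_\theta$ follow the definitions recalled here from [16] (with $L_i$ the longer segment). All function names are hypothetical.

```python
import math
import random

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))

def cross(o, u, v):
    """z-component of the cross product (u - o) x (v - o)."""
    return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

def min_segment_distance(s, t):
    """Assumed meaning of d_seg,0: minimum distance between two segments."""
    (a, b), (c, d) = s, t
    if cross(a, b, c) * cross(a, b, d) < 0 and cross(c, d, a) * cross(c, d, b) < 0:
        return 0.0  # the segments properly cross
    return min(point_segment_distance(c, a, b), point_segment_distance(d, a, b),
               point_segment_distance(a, c, d), point_segment_distance(b, c, d))

def weighted_distance(s, t):
    """dist = d_perp + d_par + d_theta with unit weights (definitions assumed from [16])."""
    li, lj = (s, t) if math.dist(*s) >= math.dist(*t) else (t, s)  # L_i is the longer one
    (si, ei), (sj, ej) = li, lj
    vi = (ei[0] - si[0], ei[1] - si[1])
    vj = (ej[0] - sj[0], ej[1] - sj[1])
    len_i = math.dist(si, ei)

    def project(p):
        # Orthogonal projection of p onto the line through L_i.
        u = ((p[0] - si[0]) * vi[0] + (p[1] - si[1]) * vi[1]) / (len_i * len_i)
        return (si[0] + u * vi[0], si[1] + u * vi[1])

    ps, pe = project(sj), project(ej)
    l_perp1, l_perp2 = math.dist(sj, ps), math.dist(ej, pe)
    total = l_perp1 + l_perp2
    d_perp = 0.0 if total == 0.0 else (l_perp1 ** 2 + l_perp2 ** 2) / total
    l_par1 = min(math.dist(ps, si), math.dist(ps, ei))
    l_par2 = min(math.dist(pe, si), math.dist(pe, ei))
    d_par = min(l_par1, l_par2)
    # ||L_j|| * sin(theta) when theta < 90 degrees, ||L_j|| otherwise.
    dot_ij = vi[0] * vj[0] + vi[1] * vj[1]
    d_theta = abs(vi[0] * vj[1] - vi[1] * vj[0]) / len_i if dot_ij > 0 else math.dist(sj, ej)
    return d_perp + d_par + d_theta

def count_violations(trials=10000, seed=0, tol=1e-9):
    """Count random segment pairs for which dist < d_seg,0 (beyond a float tolerance)."""
    rng = random.Random(seed)
    def rand_seg():
        return ((rng.uniform(0, 10), rng.uniform(0, 10)),
                (rng.uniform(0, 10), rng.uniform(0, 10)))
    pairs = ((rand_seg(), rand_seg()) for _ in range(trials))
    return sum(weighted_distance(s, t) + tol < min_segment_distance(s, t) for s, t in pairs)

print(count_violations())
```

For instance, for the parallel segments from (0, 0) to (10, 0) and from (20, 1) to (22, 1), the weighted distance is about 11 (perpendicular part 1, parallel part 10, angular part 0), while the minimum segment distance is $\sqrt{101} \approx 10.05$, consistent with the lemma under the stated assumptions. A random test of this kind does not replace the proof; it only guards against misreading the definitions.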


About this article

Cite this article

Yoo, JJ., Loh, WK. & Whang, KY. Indexable sub-trajectory matching using multi-segment approximation: a partition-and-stitch framework. J Supercomput 75, 6129–6157 (2019). https://doi.org/10.1007/s11227-019-02813-w

Issue Date: September 2019
