Efficient and robust data augmentation for trajectory analytics: a similarity-based approach

World Wide Web

Abstract

Trajectories between the same origin and destination (OD) offer valuable information for us to better understand the diversity of moving behaviours and the intrinsic relationships between moving objects and specific locations. However, because of data sparsity, there are often too few trajectories to run mining algorithms, e.g., classification and clustering, that discover the intrinsic properties of OD mobility. In this work, we propose an efficient and robust trajectory augmentation approach that constructs a sizeable set of qualified trajectories from existing data to address the sparsity issue. The high-level idea is to concatenate existing trajectories to reconstruct enough trajectories to represent those that travel across the OD pair directly. To achieve this goal, we first propose a transition graph that supports efficient sub-trajectory concatenation. In addition, we develop a novel similarity metric that measures the similarity between two sets of trajectories, so as to validate whether the reconstructed trajectory set represents the original traces well. Empirical studies on a large real trajectory dataset show that our proposed solutions are efficient and robust.
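The concatenation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each trajectory has already been mapped to a sequence of discrete location IDs, and the function names `build_transition_graph` and `reconstruct`, as well as the `max_len` bound, are hypothetical choices made for this example only.

```python
from collections import defaultdict


def build_transition_graph(trajectories):
    """Map each location to the set of locations observed directly after it.

    Assumption: a trajectory is a list of hashable location IDs.
    """
    graph = defaultdict(set)
    for traj in trajectories:
        for current, following in zip(traj, traj[1:]):
            graph[current].add(following)
    return graph


def reconstruct(graph, origin, destination, max_len):
    """Enumerate loop-free location sequences from origin to destination.

    Every step follows a transition seen in the input data, so each result
    can be read as a concatenation of existing sub-trajectories.
    max_len caps the number of locations per result (hypothetical parameter).
    """
    results = []
    stack = [(origin, [origin])]
    while stack:
        node, path = stack.pop()
        if node == destination:
            results.append(path)
            continue
        if len(path) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # skip locations already visited
                stack.append((nxt, path + [nxt]))
    return sorted(results)


# No input trajectory travels from "A" to "E" directly,
# but two candidates can be stitched together from the pieces.
observed = [["A", "B", "C"], ["C", "D", "E"], ["B", "D"]]
graph = build_transition_graph(observed)
for candidate in reconstruct(graph, "A", "E", max_len=5):
    print(candidate)
```

Exhaustive enumeration is used here only to keep the sketch short. The abstract does not say how candidates are generated or filtered, and the similarity metric used to validate the reconstructed set is not modelled.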

Acknowledgments

Sibo Wang was supported by CUHK Direct Grant No. 4055114. He was also supported by the CUHK University Startup Grant No. 4930911 and No. 5501570.

Author information

Corresponding author

Correspondence to Sibo Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

He, D., Wang, S., Ruan, B. et al. Efficient and robust data augmentation for trajectory analytics: a similarity-based approach. World Wide Web 23, 361–387 (2020). https://doi.org/10.1007/s11280-019-00695-9

  • DOI: https://doi.org/10.1007/s11280-019-00695-9
