Lunatory: A Real-Time Distributed Trajectory Clustering Framework for Web Big Data

  • Conference paper
Web Engineering (ICWE 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13362))

Abstract

Web big data contains a wealth of valuable information that can be extracted through web mining and knowledge extraction. In particular, real-time location information from the web provides a richer computational basis for applications such as real-time monitoring systems and recommendation systems based on real-time trajectory clustering. However, because a trajectory is a time-ordered sequence of user positions, computing correlations between trajectories incurs a massive computational cost. In addition, such trajectory data is usually time-sensitive: once trajectory data is generated or changed, the corresponding clustering results must be output with low latency. Although offline trajectory clustering has been well studied, directly extending such work to an online environment tends to incur (1) expensive network cost, (2) high processing latency, and (3) low-accuracy results. To enable real-time clustering on trajectory streams, we propose a distributed cLustering framework for hexagonal-based streaming trajectory (Lunatory). Lunatory comprises three key components: (1) Simplifier: to reduce the extensive network transmission in a distributed trajectory streaming system, a pivot trajectory data structure is introduced that simplifies trajectories by reducing the number of samples and extracting key features; (2) Partitioner: to improve the local computational efficiency of subsequent clustering, a hexagonal-based indexing strategy is proposed to index the pivot trajectories; (3) Executor: to produce the clusters, DBSCAN is extended to pivot trajectories and real-time trajectory clustering is implemented on Flink. Empirical studies on real-world data validate the usefulness of our proposal and show a substantial advantage over existing solutions in the literature.
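The first two components can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not define how pivots are chosen, so the turn-angle rule in `simplify` is an assumption, as are the planar coordinates, the hand-rolled pointy-top hexagonal grid (standing in for a production hexagonal index), and all function and parameter names. The clustering step and the Flink runtime are left out.

```python
import math
from collections import defaultdict


def simplify(points, angle_threshold_deg=30.0):
    """Keep both endpoints plus every point where the heading turns sharply.

    Assumed pivot rule for illustration only; the paper may select pivots differently.
    """
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
        heading_out = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
        turn = abs(math.degrees(heading_out - heading_in))
        turn = min(turn, 360.0 - turn)  # fold into [0, 180]
        if turn > angle_threshold_deg:
            kept.append(cur)
    kept.append(points[-1])
    return kept


def _hex_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon (cube rounding)."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_cell(x, y, size):
    """Map a planar point to the axial (q, r) id of a pointy-top hexagon."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _hex_round(q, r)


def partition(trajectories, size):
    """Index simplified trajectories by the hexagonal cells their pivots fall in."""
    index = defaultdict(set)
    for traj_id, points in trajectories.items():
        for x, y in simplify(points):
            index[hex_cell(x, y, size)].add(traj_id)
    return index
```

Under these assumptions, trajectories that share a cell id become candidates for local comparison, so a density-based clustering step would only need to examine a cell and its six neighbours rather than every pair of trajectories.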

Acknowledgements

This work was supported by the National Natural Science Foundation of China under grants No. 61802273 and 62102277, the Postdoctoral Science Foundation of China (No. 2020M681529), the Natural Science Foundation of Jiangsu Province (BK20210703), the China Science and Technology Plan Project of Suzhou (No. SYG202139), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX2_11342).

Author information

Corresponding author

Correspondence to Junhua Fang.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, Y., Pan, Z., Chao, P., Fang, J., Chen, W., Zhao, L. (2022). Lunatory: A Real-Time Distributed Trajectory Clustering Framework for Web Big Data. In: Di Noia, T., Ko, IY., Schedl, M., Ardito, C. (eds) Web Engineering. ICWE 2022. Lecture Notes in Computer Science, vol 13362. Springer, Cham. https://doi.org/10.1007/978-3-031-09917-5_15

  • DOI: https://doi.org/10.1007/978-3-031-09917-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09916-8

  • Online ISBN: 978-3-031-09917-5

  • eBook Packages: Computer Science (R0)
