A framework for parallel map-matching at scale using Spark

Alves Peixoto, Douglas; Quoc Viet Nguyen, Hung; Zheng, Bolong; Zhou, Xiaofang

doi:10.1007/s10619-018-7254-0

Published: 10 November 2018

Volume 37, pages 697–720, (2019)

Distributed and Parallel Databases

Douglas Alves Peixoto¹,
Hung Quoc Viet Nguyen²,
Bolong Zheng¹ &
Xiaofang Zhou¹

774 Accesses
6 Citations
Abstract

Map-matching is the problem of matching recorded GPS trajectories to a digital representation of the road network. GPS data may be inaccurate and heterogeneous due to limitations or errors in electronic sensors, as well as legal restrictions. Accurately matching trajectories to the road map is an important preprocessing step for many real-world applications, such as trajectory data mining, traffic analysis, and route prediction. However, the high availability of GPS trajectories and map data challenges the scalability of current map-matching algorithms, which are limited to small datasets because they focus on matching accuracy rather than scalability. Therefore, we propose a distributed parallel framework for efficient and scalable offline map-matching on top of the Spark framework. Spark uses distributed in-memory data storage and the MapReduce paradigm to achieve horizontal scaling and fast computation over large datasets. Spark, however, is still limited for dynamic map-matching, and memory consumption in Spark can be an issue for very large datasets. We develop a framework that enables map-matching on top of Spark while achieving horizontal scalability and memory-efficient usage, and maintaining the accuracy of state-of-the-art matching algorithms, as follows: (1) we combine sampling-based Quadtree spatial partitioning with batch-based computation to achieve horizontal scalability of map-matching and to reduce cluster memory usage; (2) we employ a safe spatial-boundary approach to preserve the matching accuracy of boundary objects; and (3) we provide a cost function for the distributed map-matching workload in order to tune the framework parameters. Our extensive experiments demonstrate that our framework is efficient and scalable for map-matching on large-scale data, while preserving matching accuracy and keeping memory usage low.
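
The partitioning idea in points (1) and (2) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation and does not use Spark: the names `QuadNode`, `build`, `leaves`, and `assign`, the `capacity` and `margin` parameters, and the half-open cell convention are all assumptions made for illustration. The sketch builds a Quadtree from a sample of points only, then assigns each point to every leaf cell whose boundary, extended by a safety margin, contains it, so that objects near a cell border are also visible to the neighbouring partition.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class QuadNode:
    """Hypothetical quadtree cell covering [min_x, max_x) x [min_y, max_y)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    children: List["QuadNode"] = field(default_factory=list)

    def contains(self, p: Point, margin: float = 0.0) -> bool:
        # With margin > 0 the cell is extended on every side (the "safe" boundary).
        x, y = p
        return (self.min_x - margin <= x < self.max_x + margin
                and self.min_y - margin <= y < self.max_y + margin)


def build(node: QuadNode, sample: List[Point], capacity: int, max_depth: int = 8) -> None:
    """Split a cell into four while it holds more than `capacity` sampled points."""
    inside = [p for p in sample if node.contains(p)]
    if len(inside) <= capacity or max_depth == 0:
        return
    mid_x = (node.min_x + node.max_x) / 2
    mid_y = (node.min_y + node.max_y) / 2
    # Child order: south-west, south-east, north-west, north-east.
    node.children = [
        QuadNode(node.min_x, node.min_y, mid_x, mid_y),
        QuadNode(mid_x, node.min_y, node.max_x, mid_y),
        QuadNode(node.min_x, mid_y, mid_x, node.max_y),
        QuadNode(mid_x, mid_y, node.max_x, node.max_y),
    ]
    for child in node.children:
        build(child, inside, capacity, max_depth - 1)


def leaves(node: QuadNode) -> List[QuadNode]:
    """Return the leaf cells, which play the role of spatial partitions."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def assign(p: Point, cells: List[QuadNode], margin: float) -> List[int]:
    """Indices of all partitions whose margin-extended boundary contains p."""
    return [i for i, cell in enumerate(cells) if cell.contains(p, margin)]


if __name__ == "__main__":
    random.seed(7)
    # Pretend this is the full dataset; only a small sample drives the partitioning.
    data = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(10_000)]
    sample = random.sample(data, 200)
    root = QuadNode(0.0, 0.0, 100.0, 100.0)
    build(root, sample, capacity=20)
    cells = leaves(root)
    replicated = sum(len(assign(p, cells, margin=1.0)) for p in data)
    print(f"{len(cells)} partitions, {replicated - len(data)} boundary replicas")
```

In a distributed setting, the partition indices returned by `assign` would act as keys for grouping points and road segments before matching each group independently; how the paper batches the groups and chooses the margin is not described in the abstract, so those details are left out here.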

Notes

https://github.com/douglasapeixoto/map-matching-framework.

Acknowledgements

This research is partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq).

Author information

Authors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
Douglas Alves Peixoto, Bolong Zheng & Xiaofang Zhou
School of Information and Communication Technology, Griffith University, Gold Coast, Australia
Hung Quoc Viet Nguyen

Corresponding author

Correspondence to Douglas Alves Peixoto.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alves Peixoto, D., Quoc Viet Nguyen, H., Zheng, B. et al. A framework for parallel map-matching at scale using Spark. Distrib Parallel Databases 37, 697–720 (2019). https://doi.org/10.1007/s10619-018-7254-0

Published: 10 November 2018
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10619-018-7254-0
