DOI: 10.1145/2345316.2345346
research-article

Performance comparisons of spatial data processing techniques for a large scale mobile phone dataset

Published: 01 July 2012

Abstract

Mobile technology, especially the mobile phone, is very popular nowadays. The increasing number of mobile users and the availability of GPS-embedded mobile phones generate a large amount of GPS trajectories that can be used in research areas such as human mobility and transportation planning. However, handling such a large-scale dataset is a significant issue, particularly in the spatial analysis domain. In this paper, we explore a suitable way of extracting the geo-location of GPS coordinates that supports large-scale data, processes it quickly, and scales easily in both storage and calculation speed. Geo-locations are cities, zones, or any points of interest. Our dataset consists of the GPS trajectories of 1.5 million individual mobile phone users in Japan, accumulated over one year, for a total of approximately 9.2 billion records. We therefore conducted performance comparisons of various methods for processing spatial data, particularly for a huge dataset. First, we processed the data on PostgreSQL with PostGIS, a traditional way of processing spatial data. Second, we used a Java application with a spatial library called the Java Topology Suite (JTS). Third, we tried the Hadoop cloud computing platform, focusing on Hive on top of Hadoop to allow SQL-like support. However, Hadoop/Hive did not support spatial queries at the time, so we proposed a solution to enable spatial support on Hive. In our results, Hadoop/Hive with spatial support performed best in large-scale processing among the evaluated methods. In addition, we recommend techniques in Hadoop/Hive for processing different types of spatial data.
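To make the task of extracting the geo-location of a GPS coordinate concrete, the following is a minimal sketch of a point-in-polygon lookup in plain Python. It is not the authors' implementation and does not use PostGIS, JTS, or Hive. It swaps in a textbook ray-casting test with a bounding-box prefilter as a stand-in for the containment predicate those tools provide. The zone names, the coordinates, and the helper names (`point_in_polygon`, `bounding_box`, `locate`) are assumptions made purely for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(pt: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting test: cast a ray from pt in the +x direction and
    toggle the result each time the ray crosses a polygon edge."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Only edges that straddle the horizontal line through pt can cross the ray.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounding_box(ring: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a polygon ring."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return min(xs), min(ys), max(xs), max(ys)


def locate(pt: Point, zones: Dict[str, Sequence[Point]]) -> Optional[str]:
    """Return the name of the first zone containing pt, or None."""
    x, y = pt
    for name, ring in zones.items():
        min_x, min_y, max_x, max_y = bounding_box(ring)
        # Cheap rejection before the more expensive polygon test.
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue
        if point_in_polygon(pt, ring):
            return name
    return None


# Hypothetical rectangular zones; real zone boundaries would come from GIS data.
ZONES: Dict[str, Sequence[Point]] = {
    "zone_a": [(139.0, 35.0), (140.0, 35.0), (140.0, 36.0), (139.0, 36.0)],
    "zone_b": [(135.0, 34.0), (136.0, 34.0), (136.0, 35.0), (135.0, 35.0)],
}
```

Calling `locate((139.5, 35.5), ZONES)` looks up one GPS point against the hypothetical zones. At the scale described in the abstract, the same per-record test would need to be run in parallel across the dataset, which is the role the evaluated platforms play.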



Published In

COM.Geo '12: Proceedings of the 3rd International Conference on Computing for Geospatial Research and Applications
July 2012
212 pages
ISBN:9781450311137
DOI:10.1145/2345316
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. GPS
  2. Hadoop
  3. cloud computing
  4. mobile phone
  5. spatial query

Qualifiers

  • Research-article

Conference

COM.Geo '12


Cited By

  • (2022) Development of Big Data-Analysis Pipeline for Mobile Phone Data with Mobipack and Spatial Enhancement. ISPRS International Journal of Geo-Information 11:3 (196). DOI: 10.3390/ijgi11030196. Online publication date: 15-Mar-2022
  • (2021) Correct and stable sorting for overflow streaming data with a limited storage size and a uniprocessor. PeerJ Computer Science 7 (e355). DOI: 10.7717/peerj-cs.355. Online publication date: 12-Feb-2021
  • (2020) The Data Visualization of Large Scale AIS Trajectories Data on Hadoop. 2020 6th International Conference on Control, Automation and Robotics (ICCAR) (758-761). DOI: 10.1109/ICCAR49639.2020.9108055. Online publication date: Apr-2020
  • (2019) New Model for Geospatial Coverages in JSON. Emerging Technologies and Applications in Data Processing and Management (316-357). DOI: 10.4018/978-1-5225-8446-9.ch015. Online publication date: 2019
  • (2019) A Hadoop-Based Spatial Computation Framework for Large-Scale AIS Data. 2019 IEEE 2nd International Conference on Electronics Technology (ICET) (599-602). DOI: 10.1109/ELTECH.2019.8839429. Online publication date: May-2019
  • (2018) A Spatial Big Data Framework for Maritime Traffic Data. 2018 3rd International Conference on Computational Intelligence and Applications (ICCIA) (244-248). DOI: 10.1109/ICCIA.2018.00054. Online publication date: Jul-2018
  • (2017) Large Scale Mobility Analysis: Extracting Significant Places Using Hadoop/Hive and Spatial Processing. Recent Advances and Future Prospects in Knowledge, Information and Creativity Support Systems (205-219). DOI: 10.1007/978-3-319-70019-9_17. Online publication date: 2-Dec-2017
  • (2017) Understanding Job-Housing Relationship from Cell Phone Data Based on Hadoop. Big Data Support of Urban Planning and Management (359-387). DOI: 10.1007/978-3-319-51929-6_19. Online publication date: 28-Sep-2017
  • (2016) Large scale spatial temporal data visualization based on spark and 3D volume rendering. 2016 International Joint Conference on Neural Networks (IJCNN) (1879-1882). DOI: 10.1109/IJCNN.2016.7727428. Online publication date: Jul-2016
  • (2013) Anomalous event detection on large-scale GPS data from mobile phones using hidden markov model and cloud platform. Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication (1219-1228). DOI: 10.1145/2494091.2497352. Online publication date: 8-Sep-2013
