ABSTRACT
With advances in location-acquisition techniques, such as GPS-embedded phones, people, vehicles, and animals generate an enormous volume of trajectory data. This trajectory data is one of the most important data sources in many urban computing applications, e.g., traffic modeling, user profiling analysis, air quality inference, and resource allocation.
Cloud computing platforms, e.g., Microsoft Azure, are the most convenient and economical way to utilize large-scale trajectory data efficiently and effectively. However, traditional cloud computing platforms are not designed to deal with spatio-temporal data, such as trajectories. To this end, we design and implement a holistic cloud-based trajectory data management system on Microsoft Azure to bridge the gap between trajectory data and urban applications. Our system can efficiently store, index, and query large trajectory data with three functions: 1) trajectory ID-temporal query, 2) trajectory spatio-temporal query, and 3) trajectory map matching. The efficiency of the system is tested and tuned based on real-time trajectory data feeds. The system is currently used in many internal urban applications, as we illustrate using case studies.
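To make the first two query types concrete, the following is a minimal in-memory sketch of what an ID-temporal query and a spatio-temporal query could return. It is not the system's implementation on Microsoft Azure: the names `GpsPoint`, `id_temporal_query`, and `spatio_temporal_query`, the rectangular query region, and the sample data are all assumptions made for illustration, and the linear scans stand in for whatever indexes the actual system uses. Map matching is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class GpsPoint:
    """One GPS sample (hypothetical schema, not taken from the paper)."""
    traj_id: str  # identifier of the trajectory / moving object
    t: int        # timestamp in seconds
    lat: float
    lng: float


def id_temporal_query(points: Sequence[GpsPoint], traj_id: str,
                      t_start: int, t_end: int) -> List[GpsPoint]:
    """Points of one trajectory with t_start <= t <= t_end, in time order."""
    hits = [p for p in points
            if p.traj_id == traj_id and t_start <= p.t <= t_end]
    return sorted(hits, key=lambda p: p.t)


def spatio_temporal_query(points: Sequence[GpsPoint],
                          bbox: Tuple[float, float, float, float],
                          t_start: int, t_end: int) -> List[str]:
    """IDs of trajectories with at least one point inside the rectangle
    bbox = (min_lat, min_lng, max_lat, max_lng) during [t_start, t_end].
    The rectangular region is an assumption of this sketch."""
    min_lat, min_lng, max_lat, max_lng = bbox
    ids = {p.traj_id for p in points
           if t_start <= p.t <= t_end
           and min_lat <= p.lat <= max_lat
           and min_lng <= p.lng <= max_lng}
    return sorted(ids)


# Made-up sample data for illustration only.
SAMPLE = [
    GpsPoint("taxi-1", 100, 39.90, 116.40),
    GpsPoint("taxi-1", 200, 39.91, 116.41),
    GpsPoint("taxi-1", 300, 39.95, 116.50),
    GpsPoint("taxi-2", 150, 39.905, 116.405),
    GpsPoint("taxi-2", 400, 40.10, 116.60),
]

if __name__ == "__main__":
    print(id_temporal_query(SAMPLE, "taxi-1", 150, 300))
    print(spatio_temporal_query(SAMPLE, (39.89, 116.39, 39.92, 116.42), 0, 250))
```

Both functions scan every point, which is only workable for small inputs. A cloud deployment would instead need storage layouts and indexes keyed on trajectory ID, time, and space, which is the gap the abstract describes.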
Index Terms
- Managing massive trajectories on the cloud