On Splitting Raw Trajectories

Published: 22 November 2024

Abstract

With the surge of data-driven solutions for trajectory analysis, the need for accurate trip data has spiked. However, the available datasets consist of raw trajectories that span hours to years and do not represent the actual trips that downstream applications need. Pre-processing steps, such as basic rules for extracting trips, are therefore required before the datasets can be used. This paper demonstrates that the current pre-processing steps are insufficient: they yield low accuracy, which negatively affects downstream applications. It presents an overview of an accurate and scalable algorithm that splits raw trajectories to extract trips. We go beyond the basic rules by introducing a realistic definition of a trip, and we offer two scalable heuristics that replace the algorithm's exhaustive brute-force search while achieving similar accuracy. Experimental results show that the proposed algorithm is (a) far more accurate than the basic rules and (b) scalable when it employs either heuristic.
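The abstract does not spell out the "basic rules" it refers to. For orientation only, the sketch below shows one rule commonly used in trajectory pre-processing: start a new segment whenever the time gap between consecutive GPS samples exceeds a threshold. The `Point` type, the `split_by_time_gap` function, and the 300-second default are hypothetical names and values chosen for illustration; they are assumptions, not the paper's definition of a trip or its proposed algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Point:
    """One GPS sample (hypothetical type for this sketch)."""
    t: float    # timestamp in seconds
    lat: float
    lon: float


def split_by_time_gap(points: List[Point], max_gap_s: float = 300.0) -> List[List[Point]]:
    """Split a time-ordered raw trajectory wherever consecutive samples
    are more than max_gap_s seconds apart (assumed baseline rule)."""
    trips: List[List[Point]] = []
    current: List[Point] = []
    for p in points:
        # A gap larger than the threshold closes the current segment.
        if current and p.t - current[-1].t > max_gap_s:
            trips.append(current)
            current = []
        current.append(p)
    if current:
        trips.append(current)
    return trips


# Four samples with a 940-second gap between the second and the third.
raw = [
    Point(0, 41.90, 12.49),
    Point(60, 41.91, 12.50),
    Point(1000, 41.95, 12.55),
    Point(1060, 41.96, 12.56),
]
trips = split_by_time_gap(raw)
```

A rule like this looks only at sampling gaps, so a vehicle that keeps reporting its position while parked is never split, and a short signal loss in the middle of a trip can cut it in two. Limitations of this kind are what the abstract means when it says basic rules yield low accuracy.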


Information

Published In

SIGSPATIAL '24: Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems
October 2024
743 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 November 2024
Accepted: 23 August 2024
Revised: 26 July 2024
Received: 07 June 2024

Author Tags

  1. Segmentation
  2. Spatial-temporal
  3. Trajectory
  4. Trips

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

SIGSPATIAL '24

Acceptance Rates

SIGSPATIAL '24 Paper Acceptance Rate 37 of 122 submissions, 30%;
Overall Acceptance Rate 257 of 1,238 submissions, 21%
