research-article

Ratel: Interactive Analytics for Large Scale Trajectories

Authors:
Haoda Li, Tsinghua University, Beijing, China
Guoliang Li, Tsinghua University, Beijing, China
Jiayang Liu, Tsinghua University, Beijing, China
Haitao Yuan, Tsinghua University, Beijing, China
Haiquan Wang, Tsinghua University, Beijing, China

SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, June 2019, Pages 1949–1952, https://doi.org/10.1145/3299869.3320222

Published: 25 June 2019

ABSTRACT

Trajectory data analytics plays an important role in many applications, such as transportation optimization, urban planning, and taxi scheduling. However, it faces a major challenge: the time cost of processing queries on big datasets is too high. In this paper, we demonstrate Ratel, a distributed in-memory framework based on Spark for analyzing large-scale trajectories. Ratel groups trajectories into partitions by considering data locality and load balance. We build R-Tree based global indexes to prune partitions when applying trajectory search and join. Within each partition, Ratel uses a filter-refinement method to efficiently find similar trajectories. We show three scenarios: bus station planning, route recommendation, and transportation analytics. Demo attendees can interact with a web UI, pose different queries on the dataset, and navigate the query results.
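
The two pruning levels described in the abstract (partition-level pruning, then filter-refinement inside a partition) can be illustrated with a small single-machine sketch. This is not Ratel's implementation. The abstract does not name the distance function or the bounds used, so the sketch assumes discrete Fréchet distance with a threshold `tau`, uses a bounding-box lower bound as the filter, and replaces the R-Tree global index with a plain list of partition bounding boxes. All function names are hypothetical.

```python
from math import hypot

def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def point_to_mbr(p, box):
    """Euclidean distance from point p to a rectangle (0 if p is inside)."""
    dx = max(box[0] - p[0], 0.0, p[0] - box[2])
    dy = max(box[1] - p[1], 0.0, p[1] - box[3])
    return hypot(dx, dy)

def lower_bound(query, box):
    """Lower bound on the discrete Frechet distance between `query` and any
    trajectory lying inside `box`: every query point must be matched to
    some point in the box, so the farthest query point bounds the distance."""
    return max(point_to_mbr(p, box) for p in query)

def discrete_frechet(a, b):
    """Exact discrete Frechet distance by dynamic programming, O(len(a)*len(b))."""
    prev = []
    for i, p in enumerate(a):
        cur = []
        for j, q in enumerate(b):
            d = hypot(p[0] - q[0], p[1] - q[1])
            if i == 0 and j == 0:
                reach = 0.0
            elif i == 0:
                reach = cur[j - 1]
            elif j == 0:
                reach = prev[0]
            else:
                reach = min(prev[j], prev[j - 1], cur[j - 1])
            cur.append(max(d, reach))
        prev = cur
    return prev[-1]

def make_partition(trajectories):
    """Bundle (id, points) pairs with the bounding box of all their points."""
    all_points = [p for _, traj in trajectories for p in traj]
    return {"mbr": mbr(all_points), "trajectories": trajectories}

def similarity_search(query, partitions, tau):
    """Return ids of trajectories within distance tau of the query."""
    results = []
    for part in partitions:
        # Global pruning: skip the whole partition (stand-in for the R-Tree).
        if lower_bound(query, part["mbr"]) > tau:
            continue
        for tid, traj in part["trajectories"]:
            # Filter: cheap bound discards candidates without the full DP.
            if lower_bound(query, mbr(traj)) > tau:
                continue
            # Refinement: verify the survivors with the exact distance.
            if discrete_frechet(query, traj) <= tau:
                results.append(tid)
    return results

# Assumed toy data: two partitions, one far away from the query.
query = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
near = make_partition([
    ("t1", [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]),
    ("t2", [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]),
])
far = make_partition([("t3", [(10.0, 10.0), (11.0, 10.0)])])
print(similarity_search(query, [near, far], tau=0.5))  # prints ['t1']
```

In this toy run the far partition is skipped without inspecting its trajectories, `t2` is rejected by the bounding-box filter, and only `t1` reaches the exact distance computation. Because the bound never exceeds the true distance, pruning cannot discard a correct answer.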

Index Terms

Ratel: Interactive Analytics for Large Scale Trajectories
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms

Published in
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data
June 2019
2106 pages
ISBN: 9781450356435
DOI: 10.1145/3299869
General Chairs:
Peter Boncz, CWI & Vrije Universiteit Amsterdam, The Netherlands
Stefan Manegold, CWI & Universiteit Leiden, The Netherlands
Program Chairs:
Anastasia Ailamaki, EPFL, Switzerland
Amol Deshpande, University of Maryland, USA
Tim Kraska, MIT, USA
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
distributed
interactive
trajectory analytics
Qualifiers
- research-article
Conference

Acceptance Rates
SIGMOD '19 Paper Acceptance Rate: 88 of 430 submissions, 20%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%