TrajViz: A Tool for Visualizing Patterns and Anomalies in Trajectory

Gao, Yifeng; Li, Qingzhe; Li, Xiaosheng; Lin, Jessica; Rangwala, Huzefa

doi:10.1007/978-3-319-71273-4_45


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10536)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Visualizing frequently occurring patterns and potentially unusual behaviors in trajectory data can provide valuable insights into the activities behind the data. In this paper, we introduce TrajViz, a visualization tool based on motifs (frequently repeated subsequences) that detects patterns and anomalies by inducing “grammars” from discretized spatial trajectories. We consider a pattern to be a set of sub-trajectories of unknown length that are spatially similar to each other. We demonstrate that TrajViz helps users visualize anomalies and patterns effectively.

1 Introduction

With the rapid growth of tracking technology, large amounts of trajectory data are generated from users’ daily activities. Discovering frequently occurring patterns (motifs) and potentially unusual behaviors helps summarize this overwhelming volume of trajectory data and extract meaningful knowledge from it. In this paper, we present TrajViz, a software tool that visualizes patterns and anomalies in trajectory datasets. TrajViz extends our previous work on time series motif discovery [1] to sub-trajectory pattern visualization. We consider a pattern to be a set of sub-trajectories of unknown length that are spatially similar to each other. We use a grid-based discretization approach to remove speed information and adapt a grammar-based motif discovery algorithm, Iterative Sequitur (ItrSequitur), to discover the patterns. We design a user-friendly interface that allows visualization of both repeated and unusual sub-trajectories within the datasets.

2 Related Work and Overview of TrajViz

Previously, we introduced a grammar-based motif discovery framework [7], which uses Sequitur [4], a grammar induction algorithm, to find approximate motifs of variable lengths in time series. However, the unique characteristics of spatial trajectory data make it difficult to apply these algorithms to trajectories directly. In [5], the authors introduced STAVIS, a trajectory analytics system that uses grammar induction to infer variable-length patterns. However, its definition of “pattern” is based on time series motifs, so speed variation significantly affects the quality of the patterns discovered. Other work such as [2, 9] focuses on either sequential pattern mining based on important locations or trajectory clustering, both of which differ from the goal of our software.

A screenshot of TrajViz is shown in Fig. 1. TrajViz follows the Visual Information-Seeking Mantra [8]. After the data are processed, an overview heat map of pattern density is displayed. Users can zoom in to see the detailed map and use domain knowledge to filter out unwanted patterns by setting the minimum frequency, the minimum number of continuous blocks (Minimal Motif Length), and the maximum frequency for anomaly detection (Anomaly Frequency). Adjusting these thresholds does not require re-running the discretization and grammar induction steps (introduced in the next section). Further details on TrajViz can be found at goo.gl/cKCeDt.
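The threshold-based filtering described above can be pictured as a pass over already-discovered patterns. The sketch below is only an illustration and not the TrajViz implementation: the `Motif` record and the `filter_motifs` function are hypothetical names, and the parameters mirror the minimum frequency and Minimal Motif Length settings.

```python
from dataclasses import dataclass

@dataclass
class Motif:
    blocks: tuple     # block ID sequence of the pattern (hypothetical layout)
    frequency: int    # number of sub-trajectories that follow the pattern

def filter_motifs(motifs, min_frequency, min_length):
    """Keep motifs that occur often enough and span enough blocks."""
    return [
        m for m in motifs
        if m.frequency >= min_frequency and len(m.blocks) >= min_length
    ]

# Filtering only reads the stored motifs, so changing a threshold
# does not require rediscovering the patterns.
motifs = [Motif((1, 2, 3), 5), Motif((4, 5), 9), Motif((6, 7, 8, 9), 1)]
frequent = filter_motifs(motifs, min_frequency=2, min_length=3)
```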

3 Our Approach

3.1 Discretization

Before we can induce grammars on trajectory data, the data must be pre-processed. After removing noise from the trajectory dataset, we convert the trajectories to speed-insensitive symbolic sequences. To prepare for discretization, we divide the entire region into an \(\alpha \times \alpha \) equal-frequency grid, where \(\alpha \) is the grid size. We assign each grid cell a block ID sequentially from left to right and from top to bottom.
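A minimal sketch of the grid construction and block ID assignment follows. It is an illustration under stated assumptions, not the TrajViz code: “equal-frequency” is read here as quantile-based cut points along each axis, block IDs start at 0, and the function names `equal_frequency_edges` and `block_id` are hypothetical.

```python
from bisect import bisect_right

def equal_frequency_edges(values, alpha):
    """Return alpha - 1 interior cut points so that each of the
    alpha bins holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // alpha] for k in range(1, alpha)]

def block_id(x, y, x_edges, y_edges):
    """Map a point to a block ID, numbering cells left to right
    and top to bottom (top row = largest y)."""
    alpha = len(x_edges) + 1
    col = bisect_right(x_edges, x)              # 0 = leftmost column
    row_from_bottom = bisect_right(y_edges, y)  # 0 = bottom row
    row = (alpha - 1) - row_from_bottom         # 0 = top row
    return row * alpha + col

points = [(0, 0), (1, 1), (2, 2), (3, 3)]
x_edges = equal_frequency_edges([p[0] for p in points], alpha=2)
y_edges = equal_frequency_edges([p[1] for p in points], alpha=2)
```

With this toy data the single cut point on each axis is 2, so the top-left cell is block 0 and the bottom-right cell is block 3.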

After block IDs are assigned, we use a four-step procedure to convert a raw trajectory to a block ID sequence \(S_{block}\). First, we up-sample the raw trajectory using linear interpolation to ensure that consecutive blocks in \(S_{block}\) are spatially adjacent. Second, the trajectory is converted into a block ID sequence following the order of traversal. Third, we perform further noise removal by removing blocks that are barely covered by the trajectory. Finally, numerosity reduction [3] is applied to compress the sequence by recording only the first occurrence in each run of consecutively repeating symbols. The resulting \(S_{block}\) is insensitive to speed variation, an important property that allows us to detect spatially similar sub-trajectories.
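The four steps can be sketched as follows. This is an illustration, not the authors' implementation: the interpolation step size, the `min_samples` rule used to decide that a block is “barely covered”, and the uniform 10-column grid in `block_of` are all assumptions made for the example.

```python
import math
from itertools import groupby

def upsample(points, step):
    """Step 1: linearly interpolate so consecutive samples are at most `step` apart."""
    out = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        n = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / step))
        for k in range(1, n + 1):
            t = k / n
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def to_block_sequence(points, block_of, step, min_samples=2):
    dense = upsample(points, step)
    # Step 2: map every sample to its block ID in traversal order.
    ids = [block_of(x, y) for x, y in dense]
    # Step 3 (assumed rule): drop visits that contain fewer than min_samples samples.
    runs = [(b, len(list(g))) for b, g in groupby(ids)]
    kept = [b for b, n in runs if n >= min_samples]
    # Step 4: numerosity reduction, keeping one symbol per run of repeats.
    return [b for b, _ in groupby(kept)]

def block_of(x, y):
    """Hypothetical uniform grid with unit cells and 10 columns."""
    return int(y) * 10 + int(x)

slow = [(0.5, 0.5), (1.0, 0.5), (1.2, 0.5), (2.5, 0.5)]
fast = [(0.5, 0.5), (2.5, 0.5)]
seq_slow = to_block_sequence(slow, block_of, step=0.25)
seq_fast = to_block_sequence(fast, block_of, step=0.25)
```

Both trajectories follow the same route with different sampling, and both reduce to the same block ID sequence, which is the speed insensitivity the text describes.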

3.2 Grammar Induction with ItrSequitur

As demonstrated in previous work [7], a context-free grammar summarizes the structure of an input sequence. Intuitively, repeated substrings in \(S_{block}\) represent a set of similar sub-trajectories. Therefore, learning a set of grammar rules that identify repeating substrings in \(S_{block}\) can reveal frequently occurring patterns (sub-trajectories) in trajectory data. Previous work [5] uses Sequitur [4], a linear-time grammar induction approach, to learn the grammar rules. However, Sequitur can only detect patterns that have identical symbolic representations. In TrajViz, we adapt an iterative version of Sequitur, called ItrSequitur [1], for more robust grammar induction. ItrSequitur iteratively rewrites the input sequence based on the output of Sequitur and re-induces the grammar on the revised sequence until no new grammar rules can be found. Unlike Sequitur, ItrSequitur allows small variations between matching substrings and is therefore robust to noise in the dataset.
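To make the link between grammar rules and repeated substrings concrete, the sketch below uses a much simpler stand-in: a naive Re-Pair-style procedure that repeatedly replaces the most frequent adjacent pair of symbols with a new rule. It is neither Sequitur nor ItrSequitur, it only captures exact repeats, and it does not reproduce ItrSequitur's rewriting step or its tolerance to small variations. The names `induce_grammar` and `expand` are hypothetical.

```python
from collections import Counter

def induce_grammar(seq, min_count=2):
    """Replace the most frequent adjacent pair with a new rule
    until no pair occurs at least min_count times."""
    seq = list(seq)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            # Left-to-right, non-overlapping replacement of the chosen pair.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def expand(symbol, rules):
    """Expand a rule name back into the block IDs it stands for."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)

s_block = [1, 2, 3, 9, 1, 2, 3, 8, 1, 2, 3]
compressed, rules = induce_grammar(s_block)
```

Here the route through blocks 1, 2, 3 occurs three times, so it ends up behind a single rule whose expansion is that shared sub-trajectory.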

3.3 Patterns/Anomalies Discovery and Motif Heatmap

TrajViz consolidates the detected patterns by merging those that have similar symbolic representations. Top-ranked frequent patterns that satisfy the user-defined filtering conditions are listed in the motifs/anomalies table. Users can navigate the patterns by clicking through the items in the table; a zoomed-in view of the selected pattern is then shown in the right panel. Figure 2 shows screenshots of a detected motif and a detected anomaly. To show the direction of the trajectories, start points are marked by black circles and end points by black squares.

For each point in a motif, we compute the point density by counting the number of points from other motifs within some distance threshold, and use these densities to create a motif heatmap. A five-color gradient (blue-cyan-green-yellow-red) linearly maps the densities to colors: the densest points are shown in red and the least dense in blue.
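The density count and the linear color mapping can be sketched as below. This is an illustration rather than the TrajViz code: the specific RGB values chosen for the five gradient stops, the Euclidean distance, and the function names are assumptions.

```python
import math

# Assumed RGB stops for blue, cyan, green, yellow, red.
GRADIENT = [(0, 0, 255), (0, 255, 255), (0, 255, 0), (255, 255, 0), (255, 0, 0)]

def point_density(point, other_points, radius):
    """Count points from other motifs within `radius` of `point`."""
    px, py = point
    return sum(1 for qx, qy in other_points
               if math.hypot(qx - px, qy - py) <= radius)

def density_to_rgb(density, d_min, d_max):
    """Linearly map a density in [d_min, d_max] onto the five-color gradient."""
    if d_max == d_min:
        return GRADIENT[0]
    t = (density - d_min) / (d_max - d_min)
    t = min(max(t, 0.0), 1.0)
    pos = t * (len(GRADIENT) - 1)           # position along the 4 segments
    i = min(int(pos), len(GRADIENT) - 2)    # index of the segment's left stop
    frac = pos - i
    a, b = GRADIENT[i], GRADIENT[i + 1]
    return tuple(round(a[k] + frac * (b[k] - a[k])) for k in range(3))

density = point_density((0, 0), [(1, 0), (0, 2), (5, 5)], radius=2.0)
color = density_to_rgb(density, d_min=0, d_max=8)
```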

To find anomalies, we create a trajectory rule-density curve by counting the number of grammar rules covering each consecutive pair of block IDs (we consider one pair at a time in order to preserve the direction of the trajectory). The intuition is that an anomalous subsequence has zero or very few repetitions and hence a low rule density. TrajViz finds low-density subsequences within a trajectory and marks them as unusual routes (Fig. 2(c)).
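A rule-density curve of this kind can be sketched as follows. The example is an illustration under assumptions, not the TrajViz implementation: each rule occurrence is given as a hypothetical `(start, end)` span of positions in the block ID sequence (end inclusive), and `max_density` plays the role of the Anomaly Frequency threshold.

```python
def rule_density(seq_len, rule_spans):
    """density[i] = number of rule occurrences covering the
    consecutive pair of positions (i, i + 1)."""
    density = [0] * max(seq_len - 1, 0)
    for start, end in rule_spans:
        for i in range(start, end):
            density[i] += 1
    return density

def low_density_runs(density, max_density=0):
    """Return (first, last) block positions of maximal runs whose
    pair density never exceeds max_density."""
    runs, start = [], None
    for i, d in enumerate(density):
        if d <= max_density:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))   # pairs start..i-1 span blocks start..i
            start = None
    if start is not None:
        runs.append((start, len(density)))
    return runs

# A sequence of 8 blocks with three rule occurrences.
density = rule_density(8, [(0, 3), (1, 3), (5, 7)])
unusual = low_density_runs(density, max_density=0)
```

In this toy case no rule covers the moves between positions 3, 4, and 5, so that stretch is reported as the candidate unusual route.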

4 Target Audience

TrajViz provides an efficient, interpretable, and user-interactive mechanism to understand functional activities behind massive trajectory data. TrajViz targets a diverse audience including researchers, practitioners, and scientists who are interested in discovering patterns in trajectory data.

References

1. Gao, Y., Lin, J., Rangwala, H.: Iterative grammar-based framework for discovering variable-length time series motifs. In: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 7–12. IEEE (2016)
2. Lee, J.-G., Han, J., Li, X., Gonzalez, H.: TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering. Proc. VLDB Endow. 1(1), 1081–1094 (2008)
3. Lin, J., Keogh, E., Wei, L., Lonardi, S.: Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Disc. 15(2), 107–144 (2007)
4. Nevill-Manning, C.G., Witten, I.H.: Identifying hierarchical structure in sequences: a linear-time algorithm. J. Artif. Intell. Res. (JAIR) 7, 67–82 (1997)
5. Oates, T., Boedihardjo, A.P., Lin, J., Chen, C., Frankenstein, S., Gandhi, S.: Motif discovery in spatial trajectories using grammar inference. In: Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management, pp. 1465–1468. ACM (2013)
6. Piorkowski, M., Sarafijanovic-Djukic, N., Grossglauser, M.: A parsimonious model of mobile partitioned networks with clustering. In: 2009 First International Communication Systems and Networks and Workshops, pp. 1–10. IEEE (2009)
7. Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A.P., Chen, C., Frankenstein, S., Lerner, M.: GrammarViz 2.0: a tool for grammar-based pattern discovery in time series. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS (LNAI), vol. 8726, pp. 468–472. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44845-8_37
8. Shneiderman, B.: The eyes have it: a task by data type taxonomy for information visualizations. In: IEEE Symposium on Visual Languages, 1996. Proceedings, pp. 336–343. IEEE (1996)
9. Zheng, Y., Zhang, L., Xie, X., Ma, W.-Y.: Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th International Conference on World Wide Web, pp. 791–800. ACM (2009)

Acknowledgements

We would like to thank Ranjeev Mittu at the Naval Research Lab (NRL) for the support and valuable suggestions on our work.

Author information

Authors and Affiliations

George Mason University, Fairfax, USA
Yifeng Gao, Qingzhe Li, Xiaosheng Li, Jessica Lin & Huzefa Rangwala

Corresponding author

Correspondence to Yifeng Gao.

Editor information

Editors and Affiliations

Google Research, Google Inc., Zurich, Switzerland
Yasemin Altun
NASA Ames Research Center, Mountain View, USA
Kamalika Das
Oath, Sunnyvale, USA
Taneli Mielikäinen
Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
Donato Malerba
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
Jerzy Stefanowski
Laboratoire d’ Informatique (LIX), École Polytechnique, Palaiseau, France
Jesse Read
Department of Computer Science, Stanford University, Stanford, USA
Marinka Žitnik
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Jožef Stefan Institute, Ljubljana, Slovenia
Sašo Džeroski

About this paper

Cite this paper

Gao, Y., Li, Q., Li, X., Lin, J., Rangwala, H. (2017). TrajViz: A Tool for Visualizing Patterns and Anomalies in Trajectory. In: Altun, Y., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. Lecture Notes in Computer Science, vol 10536. Springer, Cham. https://doi.org/10.1007/978-3-319-71273-4_45

DOI: https://doi.org/10.1007/978-3-319-71273-4_45
Published: 30 December 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71272-7
Online ISBN: 978-3-319-71273-4
eBook Packages: Computer Science, Computer Science (R0)
