Dynamic scene understanding using temporal association rules

doi:10.1016/j.imavis.2014.08.010

Image and Vision Computing

Volume 32, Issue 12, December 2014, Pages 1102-1116

https://doi.org/10.1016/j.imavis.2014.08.010

Highlights

• Uses a temporal mining technique for event recognition in dynamic scenes.
• Temporal association rules are generated from frequent patterns; these rules help model the sequence cycle.
• Spatio-temporal anomalies are identified and detected in a hierarchical manner.

Abstract

The basic goal of scene understanding is to organize the video into sets of events and to find the associated temporal dependencies. Such systems aim to automatically interpret activities in the scene, as well as detect unusual events that could be of particular interest, such as traffic violations and unauthorized entry. The objective of this work, therefore, is to learn behaviors of multi-agent actions and interactions in a semi-supervised manner. Using tracked object trajectories, we organize similar motion trajectories into clusters using the spectral clustering technique. This set of clusters depicts the different paths/routes, i.e., the distinct events taking place at various locations in the scene. A temporal mining algorithm is used to mine interval-based frequent temporal patterns occurring in the scene. A temporal pattern indicates a set of events that are linked based on their relationship with other events in the set, and we use Allen's interval-based temporal logic to describe these relations. The resulting frequent patterns are used to generate temporal association rules, which convey the semantic information contained in the scene. Our overall aim is to generate rules that govern the dynamics of the scene and to perform anomaly detection. We apply the proposed approach to two publicly available complex traffic datasets and demonstrate considerable improvements over existing techniques.

Introduction

In visual surveillance, there has been increasing interest in recognizing object behaviors by interpreting the high-level semantics of scene dynamics. However, computing relationships between different actions in the scene, or detecting rare events in an ocean of video data, is a daunting task. Analyzing event interactions manually depends solely on human operators and is practically impossible. In addition, as the scene gets crowded, the complexity of the relationships between the agents increases as well. Even though this has become an active research area, it remains a complex problem with many constraints, and an unsupervised method is required to make the task easier. An elegant solution to this problem can open doors to a wide spectrum of applications, such as video surveillance [1], anomaly detection [2], and crowd analysis [3].

Typically, the input to a dynamic scene analysis system is a video, and the first task is to detect moving objects and record their motion characteristics in the form of object trajectories (or optical flows). Each trajectory denotes an individual event in the scene during a time interval. This step is generally followed by behavior or activity segmentation, which identifies semantically meaningful components and groupings to reveal different events. Traditionally, algorithms such as K-means and fuzzy clustering have been used extensively, while many recent works have explored spectral clustering and normalized cuts [4]. The resulting clusters model the various events, indicating the spatial layout of the scene. The last step is to learn the temporal scene behavior. Behavior in our context describes the way an object acts in relation to the other objects in the scene. It can be defined as a sequence of events with spatial and temporal constraints. Recently, probabilistic methods such as Dynamic Bayesian Networks (DBN) [5], Hidden Markov Models (HMM) [6], and Probabilistic Topic Models (PTM) [2] have been used extensively by the computer vision community to learn the scene dynamics.

The dynamic scene understanding problem can be expressed as: obtain the motion patterns in the scene, build the scene structure and lastly, interpret the high-level semantics of the scene. A dynamic scene may also involve multiple agents interacting with one another, and the actions may occur in parallel with one another or recur over time. Thus, we are interested in answering questions such as: what is happening in the scene, where the objects are located and how they interact within their environment. In this work, we aim at developing a robust system that can learn the scene model with minimal human intervention. In this regard, video mining can help extract salient information from a video without such supervision [7]. In order to analyze and discover the temporal interdependencies and relationships between various events occurring in a scene, we make use of temporal mining algorithms. These relationships between events are modeled as temporal patterns, discovered using a frequent temporal pattern mining algorithm. A frequent temporal pattern can be defined as a set of composite events that occur repetitively in the video, and are expressed using temporal relations in Allen's taxonomy [8], such as before, after, and meet. Once these frequent patterns are obtained, forward temporal association rules are generated. These rules capture the correlations between the frequent temporal patterns present in the video.
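To make the interval relations mentioned above concrete, the following is a minimal sketch of how the Allen relation between two event intervals could be determined. The function name `allen_relation`, the relation labels, and the representation of an event as a `(start, end)` pair are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Tuple

# Assumption: an event interval is a (start, end) pair with start < end.
Interval = Tuple[float, float]


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have positive duration")

    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"

    # Intervals that share at least one endpoint.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"

    # Strict containment.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"

    # Remaining case: partial overlap with all endpoints distinct.
    return "overlaps" if a_start < b_start else "overlapped-by"
```

For example, an event occupying frames 0 to 5 and another occupying frames 5 to 9 would be related by "meets", while intervals (0, 5) and (3, 9) would be related by "overlaps".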

We define an anomaly as an atypical behavioral pattern based entirely on the model in context, thus every scene can have a different set of anomalies. In this work, anomaly detection is performed in a hierarchical manner. First, we identify unusual events within a spatial context. These spatial anomalies can be found once unique event clusters are identified. The second type of anomalous behavior can be found by using frequent temporal patterns (and their time duration) to discriminate between the usual and the unusual complex composite events.
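The two levels described above can be sketched roughly as follows. This is an illustrative simplification, not the paper's algorithm: the distance measure (mean point-wise Euclidean distance between equal-length trajectories), the threshold `max_distance`, the `(event_a, relation, event_b)` pattern encoding, and the per-pattern duration range are all assumptions.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]
# Assumption: a temporal pattern is encoded as (event_a, relation, event_b).
Pattern = Tuple[str, str, str]


def mean_point_distance(t1: Trajectory, t2: Trajectory) -> float:
    """Mean Euclidean distance between corresponding points (assumed measure)."""
    if len(t1) != len(t2) or not t1:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(t1, t2)) / len(t1)


def spatial_check(
    trajectory: Trajectory,
    centroids: Dict[str, Trajectory],
    max_distance: float,
) -> Tuple[Optional[str], float]:
    """Level 1: assign the nearest event, or None if it is a spatial anomaly."""
    label, dist = min(
        ((name, mean_point_distance(trajectory, c)) for name, c in centroids.items()),
        key=lambda item: item[1],
    )
    return (label if dist <= max_distance else None), dist


def is_temporal_anomaly(
    observed: Pattern,
    duration: float,
    frequent: Dict[Pattern, Tuple[float, float]],
) -> bool:
    """Level 2: flag patterns that are not frequent or have an unusual duration."""
    if observed not in frequent:
        return True
    shortest, longest = frequent[observed]
    return not (shortest <= duration <= longest)
```

In this sketch a trajectory that passes the spatial check is labeled with an event, and only then are the composite events it takes part in compared against the learned frequent patterns.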

Our goal is to extract complex activity patterns in a multi-agent environment. This is not trivial, as in most real-world scenarios the underlying dynamic scene behavior is very complex and perhaps ambiguous, making high-level activity interpretation a challenge. Most of the existing techniques employ various probabilistic models; however, learning and inference in such methods are computationally prohibitive. Moreover, as the scene gets crowded, the complexity of the relationships increases, and this necessitates a huge amount of training data for accurate analysis. Therefore, in this work we propose to learn the scene dynamics using temporal mining techniques. The frequent pattern discovery algorithm utilized in this work is exploratory in nature. In addition, pattern matching allows for accurate and efficient anomaly detection.

• To the best of our knowledge, temporal mining techniques have not been used for event recognition in dynamic scenes. We discover frequent temporal patterns using [9] to learn the scene behavior.
• We indicate exactly how two events are related (overlaps, equals, starts, etc.) using Allen's relations [8]. Moreover, we include the duration of composite events in each pattern.
• To eliminate the spurious frequent temporal patterns discovered, we suggest a few steps in Section 5.2 in order to prune the pattern space.
• Once these patterns are obtained, we generate temporal rules. These temporal association rules help model the traffic cycle sequence, which is the main test domain for our work.
• Using a hierarchical anomaly detection algorithm, spatial anomalies are detected based on object trajectories, and spatio-temporal anomalies are identified using a frequent pattern matching approach.

• We track objects to obtain events that unfold over time. As with any trajectory-based approach, a good tracking algorithm is needed to overcome its inherent issues. In this paper, we focus only on vehicle motion in complex traffic scenes. Pedestrian activity is disregarded as complete trajectories are hard to obtain in crowded scenes.
• For the temporal mining algorithms, user-defined parameters have to be determined by domain experts. Even though mining techniques do not require the definition of events or rules in advance, the temporal support and the confidence thresholds (cf. Table 1) have to be specified.
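As a rough illustration of what the support and confidence thresholds act on, the sketch below uses the standard association-rule definitions: support is the fraction of sequences containing a pattern, and the confidence of a rule is the support of antecedent and consequent together divided by the support of the antecedent. The paper's temporal variants may be defined differently; the `Relation` encoding and the function names are assumptions for illustration only.

```python
from typing import List, Set, Tuple

# Assumption: one temporal relation is encoded as (event_a, relation, event_b),
# and each video clip is summarized by the set of relations observed in it.
Relation = Tuple[str, str, str]


def support(pattern: Set[Relation], sequences: List[Set[Relation]]) -> float:
    """Fraction of sequences that contain every relation in the pattern."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if pattern <= seq)
    return hits / len(sequences)


def confidence(
    antecedent: Set[Relation],
    consequent: Set[Relation],
    sequences: List[Set[Relation]],
) -> float:
    """Confidence of the rule antecedent => consequent."""
    base = support(antecedent, sequences)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, sequences) / base


def passes_thresholds(
    antecedent: Set[Relation],
    consequent: Set[Relation],
    sequences: List[Set[Relation]],
    min_support: float,
    min_confidence: float,
) -> bool:
    """Keep a rule only if it clears both user-defined thresholds."""
    joint = support(antecedent | consequent, sequences)
    return joint >= min_support and confidence(
        antecedent, consequent, sequences
    ) >= min_confidence
```

Raising `min_support` discards patterns that occur in few clips, while raising `min_confidence` discards rules whose consequent does not reliably follow the antecedent.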

The paper is organized as follows: Section 2 presents some existing works on the topic. Section 3 briefly describes the proposed methodology. Section 4 focuses on feature extraction and segmentation, while Section 5 presents the second phase, i.e., using the video mining techniques to learn the dynamic scene model. The anomaly detection methodology is discussed at length in Section 6. Experiments are conducted on two datasets, and the results with evaluation measures are illustrated and explained in Section 7, followed by conclusions in Section 8.

Section snippets

Related work

Existing approaches in the literature generally start with motion feature extraction, such as object trajectories or optical flow. Event modeling is done by clustering these features using similarity based distance measures. Trajectory-based approaches [3], [10], [11], [12] primarily rely on how well a tracker performs. The results may be compromised in crowded scenarios due to the presence of multiple objects, inter-object occlusions and low resolution videos [3], [13]. In their seminal work,

Overview

The proposed approach, illustrated in Fig. 1, comprises the following steps:

• Feature extraction: We employ a semi-automatic mean-shift tracker [35] to obtain object trajectories.
• Motion segmentation: Spectral clustering is used to cluster trajectories into different event classes. The number of clusters is determined iteratively.
• Learning frequent temporal patterns: Relationships are discovered between events based on their time duration characteristics. Temporal patterns, often represented

Feature extraction and segmentation

Objects tend to follow common pathways in a traffic scenario, and two key points are of particular interest: the entry point, where an object appears in the scene, and the exit point where it disappears from the scene. Since we focus solely on traffic scenarios in this work, we use [35] to perform the object tracking, and pedestrian trajectories, if any, are subsequently removed (as in [28], [36]). Moving average low-pass filters are used to remove noise from the trajectories.
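A moving average low-pass filter of the kind mentioned above can be sketched as follows. The window size, the centered window, and the handling of the trajectory ends (the window simply shrinks near the entry and exit points) are assumptions; the paper's exact filter settings are not given in this excerpt.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def smooth_trajectory(points: Sequence[Point], window: int = 3) -> List[Point]:
    """Smooth a trajectory with a centered moving average.

    Assumption: `window` is an odd number of samples; near the ends of the
    trajectory the window shrinks to the samples that are available.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed: List[Point] = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        segment = points[lo:hi]
        mean_x = sum(x for x, _ in segment) / len(segment)
        mean_y = sum(y for _, y in segment) / len(segment)
        smoothed.append((mean_x, mean_y))
    return smoothed
```

A larger window suppresses more tracker jitter but also rounds off genuine turns, so the choice would normally depend on the frame rate and the scene.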

The extracted

Video association mining

Events recur over time, which means that each event corresponds to multiple time intervals. We start by forming event sequences and then extract the frequent temporal patterns from them. Allen's First Order Interval Logic is used to describe relationships between event pairs in sequences. Next, temporal association rules are generated from the obtained frequent patterns (Section 5.3). Association rules are used to predict future events or the expected behavior between various

Spatial level

Each trajectory cluster defines a single event and each event is represented by its cluster centroid. That is, the centroid models the general appearance of trajectories for any given event [37]. Having obtained the individual events in the scene, trajectories in test clips are classified into their respective event categories. The nearest-neighbor classification scheme is utilized for this purpose, where the distance from each test trajectory to all centroid trajectories is computed using

Datasets

We test our system on two public datasets [45]. These datasets feature complex activities between numerous agents in the scene, governed by traffic lights.

Conclusions

In this work, we have proposed a method that analyzes traffic patterns and detects irregular events. To the best of our knowledge, temporal mining techniques have not been used for event recognition in dynamic scenes. We first discover frequent temporal patterns and use Allen's temporal relations [8] for representation. The time duration of composite events is included in the pattern as well. Temporal association rules are then generated from these frequent patterns. These association rules help

References (45)

Y. Zhang et al., Modeling Temporal Interactions with Interval Temporal Bayesian Networks for Complex Activity Recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
R. Hamid et al., A novel sequence representation for unsupervised analysis of human activities, Artif. Intell. (2009)
I. Junejo et al., Euclidean path modeling for video surveillance, Image Vis. Comput. (2008)
D. Kuettel et al., What's going on? Discovering spatio-temporal dependencies in dynamic scenes
V. Mahadevan et al., Anomaly detection in crowded scenes
B. Zhou et al., Random field topic model for semantic region analysis in crowded scenes from tracklets
U. Von Luxburg, A tutorial on spectral clustering, Stat. Comput. (2007)
D. Damen et al., Recognizing linked events: searching the space of feasible explanations
L. Kratz et al., Spatio-temporal motion pattern modeling of extremely crowded scenes
N. Harikrishna et al., Temporal classification of events in cricket videos
J. Allen et al., Actions and events in interval temporal logic, J. Log. Comput. (1994)
D. Patel et al., Mining relationships among interval-based events for classification
E. Jouneau et al., Particle-based Tracking Model for Automatic Anomaly Detection (2011)
V. Morariu et al., Multi-agent event recognition in structured scenarios
Z. Zhang et al., Trajectory series analysis based event rule induction for visual surveillance
T. Hospedales et al., Identifying rare and subtle behaviours: a weakly supervised joint topic model, IEEE Trans. Pattern Anal. Mach. Intell. (2011)
C. Stauffer et al., Learning patterns of activity using real-time tracking, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
R. Emonet et al., Extracting and locating temporal motifs in video scenes using a hierarchical nonparametric Bayesian model
J. Li et al., Discovering multi-camera behaviour correlations for on-the-fly global activity prediction and anomaly detection
J. Varadarajan et al., Topic models for scene analysis and abnormality detection
C. Loy et al., Stream-based active unusual event detection, ACCV (2010)
L. Song et al., Understanding dynamic scenes by hierarchical motion pattern mining

Cited by (12)

Dual-scale point cloud completion network based on high-frequency feature fusion
2023, Image and Vision Computing
For many vision tasks and intelligent robotics applications, it is common that the scanned 3D point cloud is not complete, so inferring from the residual defect shape to the intact shape becomes an essential task. Previous 3D completion neural network models generally use voxel-based or point-based methods to learn and process 3D data. For the voxel-based models, the computational cost and memory increase exponentially with the improvement of input resolution, and fine-grained features cannot be guaranteed in the completed point cloud due to limited computational resources. Point-based models suffer from the lack of precision in feature acquisition and crude reconstruction of complicated structures, making it extremely hard to accomplish elaborated semantic shapes. Combining advantages of voxel-based and point-based feature extraction through the high-frequency feature fusion module, this paper proposes a dual-scale point cloud completion network called DSNet, which performs global feature analysis at the voxel scale, and local feature analysis at the point cloud scale. The fused features are then integrated into the decoding and generation process, so as to complete the point cloud completion task from coarse to fine. Experimental results, at both quantitative and qualitative perspectives, in several prevailing datasets demonstrate that our approach surpasses state-of-the-art point cloud completion networks and has a good generalization performance. Code is available at https://github.com/engqing/DSNet.
Domino effect in marine accidents: Evidence from temporal association rules
2021, Transport Policy
Marine accidents cause not only significant economic losses, but also severe environmental pollution and inestimable human casualties, which have become a worldwide concern. To better cope with this concern, this paper adopts temporal association rules (TARs) to mine and discover the domino effect in marine accidents. Using the dataset of 5754 marine domino accidents (MDAs) collected from the International Maritime Organization and IHS Markit Company, the main findings of this paper are as follows. First, ‘hull damage’ was found to be the most frequent accident in MDAs, and ‘collision’ was more likely to cause the damage in the whole hull. Second, ‘oil spill’ was most often observed as a final marine accident. Meanwhile, ‘foundered’ was more likely to cause ‘oil spill’ in both oil tanker and general cargo ship MDAs. Third, it is pointed out that most probable scenarios involved ‘hull damage’ as the basic accident which ended with ‘foundered’ and ‘oil spill’ as top accidents. These findings not only advance our knowledge of marine accidents from the perspective of the domino effect, but also provide insights into improving marine safety.
Mining temporal association rules with frequent itemsets tree
2018, Applied Soft Computing Journal
Citation Excerpt :
However, both of these methods only involve the temporal pattern mining, but did not obtain the temporal association rules. In [26], a temporal pattern indicates a set of events that are linked based on their relationship with other events in the set, and the resulting frequent patterns are used to generate temporal association rules. But the temporal constraints of the antecedent and consequence of the resulting rules are the same time period.
A novel framework for mining temporal association rules by discovering itemsets with frequent itemsets tree is introduced. In order to solve the problem of handling time series by including temporal relation between the multi items into association rules, a frequent itemsets tree is constructed in parallel with mining frequent itemsets to improve the efficiency and interpretability of rule mining without generating candidate itemsets. Experimental results show that our algorithm can provide better efficiency and interpretability in mining temporal association rules in comparison with other algorithms and has good application prospects.
Crowd Modeling using Temporal Association Rules
2021, Proceedings of the 2021 IEEE International Conference on Human-Machine Systems, ICHMS 2021
Cooperative heterogeneous multi-robot systems: A survey
2019, ACM Computing Surveys
Specific temporal association rules and temporal correlations to enlarge and detect inconsistencies in a large growing knowledge base
2018, ICNC-FSKD 2017 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery

☆ This paper has been recommended for acceptance by Ivan Laptev.
