Surveillance scene representation and trajectory abnormality detection using aggregation of multiple concepts

https://doi.org/10.1016/j.eswa.2018.02.013

Highlights

  • An expert system for monitoring surveillance environments is proposed.

  • A graph-based knowledge acquisition tool is used to learn multiple features of a target.

  • Fuzzy aggregation methods are applied to combine multiple features of a target.

  • A suitable reweighting method is proposed to reduce missed alarms.

Abstract

Use of CCTV is growing rapidly in surveillance applications. Rapid advances in machine learning and camera hardware have opened up scope for building the next generation of expert systems, which aim to understand surveillance environments automatically by detecting trajectory abnormalities through analysis of object behavior. Such intelligent surveillance systems should be able to learn and combine multiple concepts of abnormality in real-life scenarios and classify events of interest as normal or abnormal. The primary challenges for such systems are representing and learning patterns in surveillance scenes and combining multiple concepts of abnormality to activate the alarm system.

This paper presents a graph-based representation of a given surveillance scene and the learning of relevant features, including origin, destination, path, speed, and size. These features are combined and correlated with target behaviors to detect abnormalities in moving object trajectories. We also propose an aggregation method that reduces the number of missed alarms during aggregation. Several cases using publicly available surveillance video datasets are presented, and the results indicate that the proposed method can be useful for designing intelligent and expert surveillance systems.

Introduction

In recent years, research on video abnormality detection for security (Vishwakarma & Agrawal, 2013) has received increasing attention from the computer vision research community. Manual analysis of large amounts of CCTV video data in search of abnormal situations may not always be feasible. Therefore, researchers look for automatic or semi-automatic intelligent methodologies to analyze the data and find abnormalities. However, defining abnormality or anomaly can be subjective and context dependent. It is challenging for several reasons. Firstly, abnormalities may be rare in appearance, and sometimes subtle. Secondly, understanding the behavior of moving objects in a real-life environment is complex. Lastly, abnormal situations depend on the type of environment (Popoola, Wang, 2012, Song, Shao, Zhang, Shibasaki, Zhao, Cui, Zha, 2013, Xiao, Zhang, Zha, 2015). A few possible solutions to the problem are available in the literature, for example, supervised abnormality detection using training data (Zhu, Liu, Wang, Li, & Lu, 2014), unsupervised methods using low-level visual features like texture or motion (Chen & Huang, 2011), tracking of moving objects (Walia & Kapoor, 2016), and topic-based models (Li, Chang, Wang, Ni, Hong, Yan, 2015). Recent progress in multi-object tracking (MOT) (Walia & Kapoor, 2016) has brought substantial improvement in tracking moving objects in complex environments. Thus, researchers have started focusing more on abnormality detection (Dogra, Ahmed, & Bhaskar, 2016). For example, researchers are working to improve the algorithms and models for intelligent surveillance systems (Abdullah, Adawiyah, 2014, Cancela, Ortega, Fernández, Penedo, 2013, Castro, Delgado, Medina, Ruiz-Lozano, 2011, Chacon-Murguia, Gonzalez-Duarte, 2012, Chan, Liu, 2009, Fernández-Caballero, Castillo, Rodríguez-Sánchez, 2012).
The majority of these existing systems are aimed at application-specific visual surveillance, which benefits public safety. Intelligent surveillance applications (Gómez, García, Martín, de la Escalera, Armingol, 2015, Lim, Tang, Chan, 2014) that can deal with multiple events have also been proposed. In order to build an unsupervised, multi-concept guided system that is applicable to generic surveillance video, we have considered a graph-based representation of the scene that is further used in fuzzy aggregation to combine multiple concepts. In the next subsection, we discuss existing research work and the key contributions of this paper.

The primary idea behind most existing research in this domain is to define a normal model and then analyze real-time trajectories collected by MOT. Here, abnormality is defined by some distance from the normal model (Albusac, Vallejo, Jimenez-Linares, Castro-Schez, & Rodriguez-Benitez, 2009). An intelligent surveillance system based on a similar concept has been reported in (Albusac, Vallejo, Castro-Schez, Remagnino, Gonzalez, Jimenez, 2010). These methods can produce promising results when the normality model is defined by human experts (Gómez, García, Martín, de la Escalera, Armingol, 2015, Lim, Tang, Chan, 2014); however, monitoring complex scenes such as railway stations, shopping malls, and parking lots is difficult because the definition of normality may change with time and experience. This may lead to failures in abnormality detection or to false alarms. Thus, we require structured and scalable solutions that can learn, analyze, and merge multiple concepts to make decisions about abnormalities and trigger alarms. Such intelligent surveillance systems (Albusac, Vallejo, Castro-Schez, Glez-Morcillo, & Jiménez, 2014) can combine knowledge, expert opinion, and machine learning in sequence. More recent work has adopted this concept in the surveillance domain, such as the service-oriented architecture (SOA) approach applied in independent surveillance applications (Valls, López, & Villar, 2013). Some relevant research has modeled and monitored specific activities (Albusac, Vallejo, Castro-Schez, Remagnino, Gonzalez, Jimenez, 2010, Fan, Wang, Huang, 2017) in a supervised way using expert knowledge. However, these techniques usually fail to learn new events of interest from past observations. Intelligent surveillance systems such as the agent-based method (Zhou, Tang, & Wang, 2015) have also been applied in multi-sensor environments.
The main goal is to produce meaningful alarms for abnormality when different sensors act separately to detect known events of interest. The aim is to design flexible systems that provide a scalable solution for learning a concept from previous experience and detecting abnormality based on the knowledge base. In this work, we have paid special attention to designing an unsupervised, scalable, and generic framework to learn and detect abnormalities or anomalies in video object trajectories. We assume the following conditions on the motion and the scene:

  • Objects are allowed to move freely within the surveillance scene, and their movements can only be restricted by the scene boundary.

  • A typical surveillance scene consists of entry and exit regions and moving objects follow entry-to-exit paths.

  • An abnormality or anomaly can be defined as an event of interest that rarely/never happens or diverges from the concept of normality.
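The assumptions above can be illustrated with a minimal sketch, which is not the authors' implementation: entry and exit regions are treated as graph nodes, each observed entry-to-exit movement adds weight to a directed edge, and a rarely or never observed edge receives a low normality score. The region labels, the function names `learn_path_counts` and `normality_score`, and the scaling by the most frequent edge are all assumptions made for illustration.

```python
from collections import Counter


def learn_path_counts(trajectories):
    """Count how often each (entry region, exit region) edge was observed.

    Each trajectory is a sequence of region labels; only its first and
    last labels are used here (a simplifying assumption).
    """
    return Counter((t[0], t[-1]) for t in trajectories)


def normality_score(counts, entry, exit_):
    """Edge frequency scaled so that the most common edge scores 1.0.

    Unseen edges score 0.0, matching the idea that an event that
    rarely or never happens is treated as abnormal.
    """
    if not counts:
        return 0.0
    return counts.get((entry, exit_), 0) / max(counts.values())


# Hypothetical training trajectories for a small station scene.
training = [
    ["gate", "hall", "platform"],
    ["gate", "hall", "platform"],
    ["gate", "hall", "platform"],
    ["gate", "shop"],
    ["platform", "hall", "gate"],
    ["platform", "hall", "gate"],
]

counts = learn_path_counts(training)
common = normality_score(counts, "gate", "platform")  # frequently observed edge
rare = normality_score(counts, "gate", "shop")        # observed only once
unseen = normality_score(counts, "shop", "platform")  # never observed
```

A threshold on such a score would separate normal from abnormal movements; how that threshold is chosen is left open here.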

This paper presents an expert system that aims to understand surveillance environments automatically by detecting trajectory abnormalities through analysis of object behavior. To accomplish this, the following research contributions have been made:

(i) In the proposed method, we construct a model of an intelligent surveillance system for monitoring abnormal events. Targets are tracked and represented by a set of features such as size, speed, origin, destination, path, deviation or duration.

(ii) We define a generic graph-based knowledge acquisition tool (GKAT) to define and learn multiple concepts of normality in the visual surveillance context, and targets are analyzed based on this knowledge. Targets are represented by the normality scores of all of the features.

(iii) Finally, fuzzy aggregation methods such as ordered weighted averaging (OWA), the Sugeno integral, and the Choquet integral have been applied for aggregating multiple concepts. A dynamic weighting method has been proposed to aggregate different abnormality scores to trigger an alarm.

(iv) In our experiments, we have used four publicly available datasets to evaluate the proposed method. Results reveal that the proposed method is capable of learning the concepts in a scene-independent manner and that it can reduce the rate of missed alarms.
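To make the aggregation operators named in contribution (iii) concrete, the following sketch gives the textbook discrete forms of OWA, the Choquet integral, and the Sugeno integral applied to per-feature abnormality scores in [0, 1]. It does not reproduce the dynamic weighting method proposed in the paper; the feature names, the scores, the OWA weights, and the additive fuzzy measure are hypothetical values chosen only for illustration.

```python
def owa(scores, weights):
    """Ordered weighted averaging: weights apply to scores sorted descending."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def choquet(scores, measure):
    """Discrete Choquet integral of a dict {feature: score}.

    `measure` maps a frozenset of features to [0, 1]; it is assumed to be
    monotone, with measure(empty set) = 0 and measure(all features) = 1.
    """
    ordered = sorted(scores, key=scores.get)  # feature names, ascending score
    total, previous = 0.0, 0.0
    for i, name in enumerate(ordered):
        coalition = frozenset(ordered[i:])  # features scoring at least this much
        total += (scores[name] - previous) * measure(coalition)
        previous = scores[name]
    return total


def sugeno(scores, measure):
    """Discrete Sugeno integral: max over i of min(score_i, measure(coalition_i))."""
    ordered = sorted(scores, key=scores.get)
    return max(
        min(scores[name], measure(frozenset(ordered[i:])))
        for i, name in enumerate(ordered)
    )


# Hypothetical abnormality scores for one target.
scores = {"speed": 0.9, "path": 0.2, "size": 0.7}

# Hypothetical additive fuzzy measure built from per-feature importances.
importance = {"speed": 0.5, "path": 0.3, "size": 0.2}


def additive_measure(coalition):
    return sum(importance[name] for name in coalition)


# OWA weights that emphasise the largest score (an illustrative choice).
owa_value = owa(list(scores.values()), [0.6, 0.3, 0.1])
choquet_value = choquet(scores, additive_measure)
sugeno_value = sugeno(scores, additive_measure)
```

With an additive measure, the Choquet integral reduces to an ordinary weighted mean, which is a convenient sanity check; a non-additive measure is what lets it model interaction between features. OWA weights concentrated on the largest scores push the aggregate toward the maximum, which tends to favour raising an alarm over missing one.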

The rest of the paper is organized as follows. Section 2 describes the proposed method. Section 3 discusses surveillance scene representation and abnormality detection. In Section 4, we present multi-concept aggregation. Experimental results are discussed in Section 5. Finally, Section 6 concludes the paper with future directions.

Section snippets

Proposed method

Our aim is to build an intelligent surveillance system for monitoring abnormal events. First, we introduce a general framework to learn movement patterns in a typical surveillance scene. Next, the scene is represented using a non-linear data structure (graph). The graph-based scene representation allows systematic trajectory analysis, which not only helps to find abnormal activities but also provides a scalable framework for generic detection of irregular patterns of movement. Further, the

Scene representation using GKAT

This section presents the knowledge acquisition process based on surveillance concepts. We assume every surveillance scene consists of a few key regions such as origin and destination, and that moving objects typically follow a path from origin to destination. A target can be classified by size and speed. A moving target is assumed to be normal if the appearance and the behavior of the target are normal according to the scene model. In contrast to article (Albusac, Vallejo, Castro-Schez,

Multi-criteria aggregation

In this section, we present the multi-criteria aggregation problem in detail, and introduce fuzzy aggregation as a new tool in surveillance applications. A more detailed treatment of the representation is given in (Grabisch, 1995, Grabisch, 1996). The main aim of multi-attribute aggregation is to find a suitable function that can combine all criteria into a single global value. To begin with, let us assume that a set of alternatives/criteria $\{a_1, a_2, \ldots, a_n\}$ be already present to the decision

Experiment results and discussion

In this section, we present the results of the proposed abnormality detection technique. First, we discuss the datasets that are used for experiments. Next, we demonstrate the results of scene understanding. As the features origin, destination, and deviation can easily be understood from the path feature, we only present the paths of the surveillance scenes and highlight the change in path with respect to time. Finally, we discuss the results of the proposed fusion-based

Conclusion

Understanding and monitoring complex and dynamic surveillance environments is a challenging task. Within this context, it is essential to design an expert and intelligent system that deals with the dynamic nature of the environment by fusing multiple sources of information. An expert surveillance system based on the normality concept is proposed in this paper. Normality concept specifies the position of the object in a surveillance aspect or events of interest such as speed. The surveillance scene is

Funding

This study received no funding.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors. Informed consent: Informed consent was obtained from all individual participants included in the study.

References (41)

  • F. Petitjean et al. A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition (2011)
  • D. Xu et al. Video anomaly detection based on a hierarchical activity discovery within spatio-temporal contexts. Neurocomputing (2014)
  • L.A. Zadeh. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems (1978)
  • X. Zhu et al. Sparse representation for robust abnormality detection in crowded scenes. Pattern Recognition (2014)
  • L. Abdullah et al. Simple additive weighting methods of multi criteria decision making and applications: A decade review. International Journal of Information Processing and Management (2014)
  • J. Albusac et al. Monitoring complex environments using a knowledge-driven approach based on intelligent agents. IEEE Intelligent Systems (2010)
  • J. Albusac et al. Intelligent surveillance based on normality analysis to detect abnormal behaviors. International Journal of Pattern Recognition and Artificial Intelligence (2009)
  • S. Ali et al. A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2007)
  • M.I. Chacon-Murguia et al. An adaptive neural-fuzzy approach for object detection in dynamic backgrounds for surveillance systems. IEEE Transactions on Industrial Electronics (2012)
  • C.S. Chan et al. Fuzzy qualitative human motion analysis. IEEE Transactions on Fuzzy Systems (2009)