Surveillance scene representation and trajectory abnormality detection using aggregation of multiple concepts
Introduction
In recent years, interest in abnormality detection in surveillance video has increased significantly. Research in video abnormality detection for security (Vishwakarma & Agrawal, 2013) has received increasing attention from the computer vision research community. Manual analysis of large amounts of CCTV video data in search of abnormal situations may not always be feasible. Therefore, researchers look for automatic or semi-automatic intelligent methodologies to analyze the data and find abnormalities. However, defining an abnormality or anomaly can be subjective and context dependent, and detecting one is challenging for several reasons. Firstly, abnormalities may be rare in appearance, and sometimes subtle. Secondly, understanding the behavior of moving objects in a real-life environment is complex. Lastly, abnormal situations depend on the type of environment (Popoola, Wang, 2012, Song, Shao, Zhang, Shibasaki, Zhao, Cui, Zha, 2013, Xiao, Zhang, Zha, 2015). There are a few possible solutions to the problem. For example, supervised abnormality detection using training data (Zhu, Liu, Wang, Li, & Lu, 2014), unsupervised methods using low-level visual features such as texture or motion (Chen & Huang, 2011), tracking of moving objects (Walia & Kapoor, 2016), and topic-based models (Li, Chang, Wang, Ni, Hong, Yan, 2015) are available in the literature. Recent progress in multi-object tracking (MOT) (Walia & Kapoor, 2016) has substantially improved the tracking of moving objects in complex environments. Thus, researchers have started focusing more on abnormality detection (Dogra, Ahmed, & Bhaskar, 2016). For example, researchers are working to improve the algorithms and models for intelligent surveillance systems (Abdullah, Adawiyah, 2014, Cancela, Ortega, Fernández, Penedo, 2013, Castro, Delgado, Medina, Ruiz-Lozano, 2011, Chacon-Murguia, Gonzalez-Duarte, 2012, Chan, Liu, 2009, Fernández-Caballero, Castillo, Rodríguez-Sánchez, 2012).
The majority of these existing systems are aimed at application-specific visual surveillance, which benefits public safety. Intelligent surveillance applications (Gómez, García, Martín, de la Escalera, Armingol, 2015, Lim, Tang, Chan, 2014) that can deal with multiple events have also been proposed. In order to build an unsupervised, multi-concept guided system that is applicable to generic surveillance video, we consider a graph-based representation of the scene that is further used in fuzzy aggregation to combine multiple concepts. In the next subsection, we discuss existing research work and the key contributions of this paper.
The primary idea behind the majority of existing research in this domain is to define a normal model and then analyze real-time trajectories collected by MOT. Here, abnormality is defined by some distance from the normal model (Albusac, Vallejo, Jimenez-Linares, Castro-Schez, & Rodriguez-Benitez, 2009). An intelligent surveillance system built on a similar concept has been reported in (Albusac, Vallejo, Castro-Schez, Remagnino, Gonzalez, Jimenez, 2010). These methods can produce promising results when the normality model is defined by human experts (Gómez, García, Martín, de la Escalera, Armingol, 2015, Lim, Tang, Chan, 2014); however, monitoring complex scenes such as railway stations, shopping malls, and parking lots is difficult because the definition of the normality concept may change with time or experience. This may lead to failure in abnormality detection or may produce false alarms. Thus, we require structured and scalable solutions that can learn, analyze, and merge multiple concepts to take decisions about abnormalities and trigger alarms. Such intelligent surveillance systems (Albusac, Vallejo, Castro-Schez, Glez-Morcillo, & Jiménez, 2014) can combine knowledge, expert opinion, and machine learning in a sequence. More recent work has adopted this concept in the surveillance domain, such as the service-oriented architecture (SOA) approach applied in independent surveillance applications (Valls, López, & Villar, 2013). Some relevant research has been developed to model and monitor specific activities (Albusac, Vallejo, Castro-Schez, Remagnino, Gonzalez, Jimenez, 2010, Fan, Wang, Huang, 2017) in a supervised way using expert knowledge. However, these techniques usually fail to learn new events of interest from past observations. Intelligent surveillance systems such as the agent-based method (Zhou, Tang, & Wang, 2015) have also been applied in multi-sensor environments.
The main goal is to produce meaningful alarms for abnormality when different sensors act separately to detect known events of interest. The aim is to design flexible systems that provide a scalable solution to learn a concept from previous experiences and detect abnormality based on the knowledge base. In this work, we have paid special attention to designing an unsupervised, scalable, and generic framework to learn and detect abnormalities or anomalies in video object trajectories. We assume the following conditions on the motion and the scene:
- Objects are allowed to move freely within the surveillance scene, and their movements can only be restricted by the scene boundary.
- A typical surveillance scene consists of entry and exit regions, and moving objects follow entry-to-exit paths.
- An abnormality or anomaly can be defined as an event of interest that rarely or never happens, or that diverges from the concept of normality.
This paper presents an expert system that aims to understand surveillance environments automatically by detecting trajectory abnormalities through the analysis of object behavior. In accomplishing this, the following research contributions have been made:
(i) In the proposed method, we construct a model of an intelligent surveillance system for monitoring abnormal events. Targets are tracked and represented by a set of features such as size, speed, origin, destination, path, deviation, and duration.
(ii) We define a generic graph-based knowledge acquisition tool (GKAT) to define and learn multiple concepts of normality in the visual surveillance context, and targets are analyzed based on this knowledge. Targets are represented by the normality scores of all of the features.
(iii) Finally, fuzzy aggregation methods such as ordered weighted averaging (OWA), the Sugeno integral, and the Choquet integral have been applied for aggregating multiple concepts. A dynamic weighting method has been proposed to aggregate different abnormality scores to trigger an alarm.
(iv) In our experiments, we have used four publicly available datasets to evaluate the proposed method. Results reveal that the proposed method is capable of learning the concepts in a scene-independent manner and that it can reduce the rate of missed alarms.
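The aggregation step named in contribution (iii) can be illustrated with a minimal sketch of ordered weighted averaging (OWA) over per-feature normality scores. The feature scores, the weight vector, and the alarm threshold below are assumptions chosen for illustration; the dynamic weighting method proposed in the paper is not reproduced here.

```python
from typing import Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted averaging: weights apply to ranks, not to features."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(scores, reverse=True)  # largest score first
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical normality scores in [0, 1] for one tracked target.
normality = {"size": 0.9, "speed": 0.8, "path": 0.2, "duration": 0.7}

# Hypothetical weights skewed toward the lowest-ranked scores, so that a
# single abnormal feature pulls the aggregate down (an "and-like" operator).
weights = [0.1, 0.1, 0.2, 0.6]

aggregate = owa(list(normality.values()), weights)
is_abnormal = aggregate < 0.5  # illustrative alarm threshold, not from the paper
```

Because the weights attach to rank positions, the same operator ranges from the maximum (all weight on the first rank) through the arithmetic mean (uniform weights) to the minimum (all weight on the last rank).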
The rest of the paper is organized as follows. Section 2 describes the proposed method. Section 3 discusses surveillance scene representation and abnormality detection. In Section 4, we present multi-concept aggregation. Experimental results are discussed in Section 5. Finally, Section 6 concludes the paper with future directions.
Section snippets
Proposed method
Our aim is to build an intelligent surveillance system for monitoring abnormal events. First, we introduce a general framework to learn movement patterns in a typical surveillance scene. Next, the scene is represented using a non-linear data structure (a graph). The graph-based scene representation allows systematic trajectory analysis, which not only helps to find abnormal activities but also provides a scalable framework for generic detection of irregular patterns of movement. Further, the
Scene representation using GKAT
This section presents the knowledge acquisition process based on surveillance concepts. We assume that every surveillance scene consists of a few key regions such as origin and destination, and that moving objects typically follow a path from origin to destination. A target can be classified by size and speed. A moving target is assumed to be normal if its appearance and behavior are normal according to the scene model. In contrast to article (Albusac, Vallejo, Castro-Schez,
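The idea of a graph whose nodes are key regions and whose edges are origin-to-destination paths can be sketched as follows. This is a toy illustration only, not the GKAT construction: the class name, the region labels, and the frequency-based normality score are all assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Tuple


class SceneGraph:
    """Toy scene graph: nodes are entry/exit regions, edges are observed paths."""

    def __init__(self) -> None:
        # (origin, destination) -> number of trajectories seen on that path
        self.edge_counts: Dict[Tuple[str, str], int] = defaultdict(int)

    def observe(self, origin: str, destination: str) -> None:
        """Record one trajectory that travelled from origin to destination."""
        self.edge_counts[(origin, destination)] += 1

    def path_normality(self, origin: str, destination: str) -> float:
        """Frequency of this path relative to the most common path, in [0, 1].

        This scoring rule is an assumption made for the sketch.
        """
        if not self.edge_counts:
            return 0.0
        most_common = max(self.edge_counts.values())
        return self.edge_counts.get((origin, destination), 0) / most_common


# Hypothetical region labels and trajectory counts.
graph = SceneGraph()
for _ in range(8):
    graph.observe("gate_A", "gate_B")
for _ in range(2):
    graph.observe("gate_A", "gate_C")

common = graph.path_normality("gate_A", "gate_B")  # frequently used path
rare = graph.path_normality("gate_A", "gate_C")    # infrequently used path
unseen = graph.path_normality("gate_C", "gate_A")  # never observed
```

Under this assumed scoring rule, a path that has never been observed receives a normality score of zero, and new observations update the graph without any labeled training data.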
Multi-criteria aggregation
In this section, we present the multi-criteria aggregation problem in detail, and introduce fuzzy aggregation as a new tool in surveillance applications. Grabisch (1995, 1996) discusses the representation in more detail. The main aim of multi-attribute aggregation is to find a suitable function that can combine all criteria into a single global value. To begin with, let us assume that a set of alternatives/criteria is already present to the decision
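To make the notion of combining all criteria into a single global value concrete, the sketch below implements the standard discrete Choquet and Sugeno integrals with respect to a fuzzy measure. The criterion names, the scores, and the measure values are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet

Measure = Dict[FrozenSet[str], float]


def choquet(scores: Dict[str, float], mu: Measure) -> float:
    """Discrete Choquet integral of scores with respect to fuzzy measure mu."""
    remaining = set(scores)
    total, previous = 0.0, 0.0
    # Visit criteria in ascending order of score; `remaining` holds the
    # criteria whose score is at least the current one.
    for name in sorted(scores, key=lambda k: scores[k]):
        total += (scores[name] - previous) * mu[frozenset(remaining)]
        previous = scores[name]
        remaining.remove(name)
    return total


def sugeno(scores: Dict[str, float], mu: Measure) -> float:
    """Discrete Sugeno integral: max over ranks of min(score, measure)."""
    remaining = set(scores)
    best = 0.0
    for name in sorted(scores, key=lambda k: scores[k]):
        best = max(best, min(scores[name], mu[frozenset(remaining)]))
        remaining.remove(name)
    return best


# Hypothetical fuzzy measure on two criteria; it is non-additive because
# mu({speed}) + mu({path}) != mu({speed, path}).
mu: Measure = {
    frozenset(): 0.0,
    frozenset({"speed"}): 0.3,
    frozenset({"path"}): 0.5,
    frozenset({"speed", "path"}): 1.0,
}
scores = {"speed": 0.8, "path": 0.4}  # hypothetical normality scores

choquet_value = choquet(scores, mu)
sugeno_value = sugeno(scores, mu)
```

Unlike a weighted sum, a fuzzy measure assigns importance to every subset of criteria, so interaction between criteria can be expressed; the cost is that the measure has one value per subset, which grows exponentially with the number of criteria.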
Experiment results and discussion
In this section, we present the results of the proposed abnormality detection technique. First, we discuss the datasets that are used for the experiments. Next, we demonstrate the results of scene understanding. As the features origin, destination, and deviation can easily be understood from the path feature, we only present the paths of the surveillance scenes and highlight the change in path with respect to time. Finally, we discuss the results of the proposed fusion-based
Conclusion
Understanding and monitoring complex and dynamic surveillance environments is a challenging task. Within this context, it is essential to design an expert and intelligent system that deals with the dynamic nature of the environment by fusing multiple sources of information. An expert surveillance system based on the normality concept is proposed in this paper. The normality concept specifies the position of the object in a surveillance aspect or events of interest such as speed. The surveillance scene is
Funding
This study received no funding.
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors. Informed consent: Informed consent was obtained from all individual participants included in the study.
References (41)
- et al. Dynamic weighted aggregation for normality analysis in intelligent surveillance systems. Expert Systems with Applications (2014)
- et al. Hierarchical framework for robust and fast multiple-target tracking in surveillance scenarios. Expert Systems with Applications (2013)
- et al. Intelligent surveillance system with integration of heterogeneous information for intrusion detection. Expert Systems with Applications (2011)
- et al. Motion-based unusual event detection in human crowds. Journal of Visual Communication and Image Representation (2011)
- K-means iterative Fisher (KIF) unsupervised clustering algorithm applied to image texture segmentation. Pattern Recognition (2002)
- et al. Human activity monitoring by local and global finite state machines. Expert Systems with Applications (2012)
- et al. Intelligent surveillance of indoor environments based on computer vision and 3D point cloud fusion. Expert Systems with Applications (2015)
- Fuzzy integral in multicriteria decision making. Fuzzy Sets and Systems (1995)
- The application of fuzzy integrals in multicriteria decision making. European Journal of Operational Research (1996)
- et al. iSurveillance: Intelligent framework for multiple events detection in surveillance videos. Expert Systems with Applications (2014)
- A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition
- Video anomaly detection based on a hierarchical activity discovery within spatio-temporal contexts. Neurocomputing
- Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems
- Sparse representation for robust abnormality detection in crowded scenes. Pattern Recognition
- Simple additive weighting methods of multi criteria decision making and applications: A decade review. International Journal of Information Processing and Management
- Monitoring complex environments using a knowledge-driven approach based on intelligent agents. IEEE Intelligent Systems
- Intelligent surveillance based on normality analysis to detect abnormal behaviors. International Journal of Pattern Recognition and Artificial Intelligence
- A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
- An adaptive neural-fuzzy approach for object detection in dynamic backgrounds for surveillance systems. IEEE Transactions on Industrial Electronics
- Fuzzy qualitative human motion analysis. IEEE Transactions on Fuzzy Systems