A novel video based system for detecting and counting vehicles at user-defined virtual loops☆
Introduction
Traffic management can bring many benefits to drivers, pedestrians, governments, and the environment. Information about traffic conditions can be used in several ways, such as to synchronize traffic lights, assist drivers in selecting routes, and help governments plan the expansion of the traffic system and the construction of new roads. Drivers benefit from less time spent in urban and road traffic, resulting in savings and a better quality of life. Governments acquire data for designing better solutions for urban and road traffic, and the environment benefits from the reduction in pollutant emissions that results from an optimized flow of vehicles.
Conventional techniques for measuring traffic flow, such as inductive loops and sonar or microwave detectors, have disadvantages such as high installation cost, traffic disruption during installation or maintenance, and an inability to detect slow or stationary vehicles (Mandellos, Keramitsoglou, & Kiranoudis, 2011).
Recent improvements in sensor and communication technologies allow local transport authorities to closely monitor the conditions of urban transport systems, promoting the development of a wide variety of techniques for monitoring traffic flow and for collecting data on its characteristics (Cho, Quek, Seah, & Chong, 2009).
The use of image-based sensors and computer vision techniques for acquiring vehicle traffic data has been intensely investigated in recent years, since traffic videos provide more information about the traffic of vehicles than other classes of sensors (e.g. inductive loops, sonar or microwave detectors), and such video-based systems can sometimes expand their monitoring capabilities by taking advantage of video cameras already installed on site (Tian, Yao, Gu, Wang, & Li, 2011). Moreover, video-based systems are easy to install and easy to upgrade, since the system and its functionalities can be redesigned by updating the installed algorithms. Among their several possible applications, video-based systems can be used for counting and classifying vehicles, for measuring vehicle speeds, and for identifying traffic incidents (Mandellos et al., 2011).
Therefore, the current technological trend in traffic monitoring is oriented towards video-based systems, since video sensors have relatively low maintenance costs and allow vehicles to be detected and counted non-intrusively. Besides, several applications demand traffic video surveillance nowadays, such as: providing essential traffic and travel information to drivers so that road safety and traffic efficiency can be improved (Cheng, Gau, Huang, & Hwang, 2012); detecting pedestrians in intelligent transportation systems (ITS); providing traffic data to safety driving assistance systems (SDASs) (Guo, Ge, Zhang, Li, & Zhao, 2012); assisting vehicle overtaking (Milanés et al., 2012); detecting and extracting vehicles in traffic surveillance scenarios (Mandellos et al., 2011); and counting vehicles and/or detecting traffic incidents (Cho et al., 2009).
Currently, there are several methods for detecting, tracking and counting vehicles in traffic videos. Generally, these methods start by separating the static part of the scene (background) from the non-static part of the scene (foreground), where the moving objects of interest (i.e. moving vehicles) are usually found (Tian et al., 2011). Various techniques can be used to segment the background and the foreground. The subtraction of a static background model from each video frame is often used. This background model can be obtained by simple methods, such as averaging pixel intensities over a set of frames (Lai & Yung, 1998), or by more elaborate methods, such as building Gaussian Mixture Models for each background pixel (Stauffer & Grimson, 1999), by background reconstruction (Mandellos et al., 2011), or by determining the optimal threshold for foreground–background segmentation and object detection (Karasulu & Korukoglu, 2012). However, it is often challenging for background subtraction methods to deal with noise, illumination changes, occlusions, and the splitting of multiple objects that have been incorrectly merged by the foreground segmentation process. Other approaches, such as pixel-by-pixel differences between two or more adjacent frames, have also been used to detect the objects of interest, since frame differencing is more robust to illumination variations than background subtraction; however, with this approach only objects moving against a static background can be detected (Cucchiara, Piccardi, & Mello, 2000). To avoid incorrectly merging spatially close vehicles (e.g. in cast shadow situations), shadow removal has been investigated as a way to improve vehicle identification when cast shadows are present (Zhong & Junping, 2008).
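As a minimal illustration of the background-subtraction idea described above (a sketch, not the implementation of any of the cited works), the following code builds a background model as the per-pixel temporal median of a frame buffer and thresholds the absolute deviation of each new frame; the threshold value is an arbitrary assumption.

```python
import numpy as np

def median_background(frames):
    """Estimate a static background as the per-pixel temporal median
    of a list of grayscale frames (2-D uint8 arrays of equal shape)."""
    return np.median(np.stack(frames), axis=0)

def foreground_mask(frame, background, threshold=25):
    """Mark as foreground the pixels whose absolute deviation from the
    background model exceeds a (hand-picked) threshold."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold
```

A more elaborate model (e.g. a per-pixel Gaussian mixture) replaces the median and the fixed threshold with statistics that adapt over time, which is what makes it more robust to noise and gradual illumination changes.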
There are several methods for detecting and tracking moving vehicles (Tian et al., 2011). The approaches used for detecting targets (i.e. vehicles) are often model-based methods that use prior knowledge to detect the desired targets (Lai et al., 2010, Shen, 2008), deformable templates that match known vehicle models against the video frames (Takeuchi, Mita, & McAllester, 2010), or methods that rely on simpler features such as corners and edges (Tu, Xu, & Zhou, 2008). The identified targets (vehicles) are often tracked using approaches such as mean-shift (Bouttefroy, Bouzerdoum, Phung, & Beghdadi, 2008), Kalman filtering (Xie, Zhu, Wang, Xu, & Zhang, 2005), or particle filtering (Scharcanski, de Oliveira, Cavalcanti, & Yari, 2011). Different schemes have been proposed for vehicle counting, such as incrementing a vehicle counter when new vehicles are detected in a video scene (Sánchez, Suarez, Conci, & de Oliveira Nunes, 2011), incrementing a vehicle counter only when the tracked vehicles are on pre-defined virtual loops (Tseng, Lin, & Smith, 2002), or counting new vehicles passing over user-defined virtual loops without previously tracking these vehicles (Purnama, Zaini, Putra, & Hariadi, 2009).
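The virtual-loop counting scheme mentioned above can be sketched as follows. This is an illustrative toy (the class and its interface are hypothetical, not taken from any cited work): each loop is an axis-aligned region, and a tracked vehicle id is counted once, the first time its centroid enters the region.

```python
class VirtualLoop:
    """A user-defined axis-aligned virtual loop that counts each tracked
    vehicle id once, on first entry into the loop region (toy sketch)."""

    def __init__(self, x0, y0, x1, y1):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.count = 0
        self._seen = set()  # track ids already counted at this loop

    def update(self, tracks):
        """tracks: mapping of track id -> (x, y) centroid for the current frame."""
        for tid, (x, y) in tracks.items():
            inside = self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1
            if inside and tid not in self._seen:
                self._seen.add(tid)
                self.count += 1
```

Remembering the already-counted ids is what prevents double counting while a slow vehicle stays on the loop across many frames; counting without prior tracking (as in Purnama et al., 2009) must handle that case differently.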
Despite the recent advances, there are still challenging issues in vehicle detection and tracking, such as: (a) detecting the foreground accurately, especially when there are rapid changes in background lighting or imaging artifacts; (b) identifying the vehicles to be tracked when there are multiple vehicles in the scene; and (c) tracking vehicles in occlusion situations, especially when a vehicle being tracked is partially (or completely) occluded by other vehicles or obstacles. In the present work, we address the first two challenges, and restrict ourselves to cases where the camera positioning minimizes vehicle occlusions.
The proposed method improves on the scheme presented by Bouvie, Scharcanski, Barcellos, and Escouto (2013) by providing a new segmentation of the moving vehicles against the background (road or street), which tends to be robust to artifacts in traffic videos, leading to fewer vehicle counting errors. The approach used to estimate the background in Bouvie et al. (2013) uses a simple temporal median, which has limitations when the scene illumination changes abruptly. In the present work, we use a background model based on Mixtures of Gaussians, which is more robust to scene illumination changes, improving background and vehicle detection even in adverse conditions. Vehicle tracking is performed by a particle filtering method that is significantly more robust than the approach proposed in Bouvie et al. (2013), as the comparative experimental results indicate. Vehicle counting is performed by detecting the intersection of the tracked particle groups (i.e. moving vehicles) with a set of user-defined virtual loops. The experimental comparisons with methods representative of the state of the art (Bouvie et al., 2013, Kim, 2008, Sánchez et al., 2011, Yuan et al., 2013) suggest that the proposed approach can achieve more accurate results in terms of vehicle detection and counting, while better handling challenging vehicle tracking issues, such as tracking long vehicles, which other methods tend to divide into smaller moving objects, leading to inaccuracies in vehicle counting.
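To see why a Gaussian-based background model handles illumination changes better than a fixed median, consider the following simplified sketch. It keeps only one Gaussian mode per pixel (the Mixture-of-Gaussians model used in this work keeps several); the learning rate `alpha` and the threshold factor `k` are illustrative values, not the paper's parameters.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model (a simplification of the
    Mixture-of-Gaussians approach; MoG keeps K such modes per pixel)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(self.mean.shape, 100.0)  # arbitrary initial variance
        self.alpha, self.k = alpha, k

    def apply(self, frame):
        """Return a boolean foreground mask and update the model."""
        d = frame.astype(np.float64) - self.mean
        fg = d ** 2 > (self.k ** 2) * self.var
        # Update only background-matching pixels: slow illumination drift is
        # absorbed into the model, while moving vehicles are left out of it.
        a = np.where(fg, 0.0, self.alpha)
        self.mean += a * d
        self.var = (1.0 - a) * self.var + a * d ** 2
        return fg
```

Because the mean and variance adapt frame by frame, gradual lighting changes raise no foreground response, whereas a vehicle's sudden intensity change exceeds the per-pixel variance test and is flagged.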
This paper is organized as follows: Section 2 presents our proposed vehicle tracking method, Section 3 presents and discusses the obtained experimental results, and Section 4 presents our concluding remarks.
Our proposed vehicle detection and counting method
In order to reduce the number of pixels that must be processed, we sub-sample video frames using particles (see Section 2.1). Therefore, particles belonging to the same vehicle are assumed to be: (a) spatially coherent, i.e. particles associated with the same vehicle must be spatially close to each other, and groups of particles must be distant from each other if associated to different vehicles; (b) temporally coherent, meaning that particles associated to a vehicle appearing in a given frame
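The spatial-coherence criterion above can be illustrated with a toy grouping routine (a sketch under assumed parameters, not the paper's clustering algorithm): particles closer than a hypothetical coherence radius to any member of a group are assigned to that group, so distant groups of particles end up associated with different vehicles.

```python
import numpy as np

def group_particles(positions, radius=15.0):
    """Greedily group particle positions: a particle joins the first existing
    group containing a member closer than `radius`, else it starts a new group.
    Illustrative only; `radius` is a hypothetical spatial-coherence threshold,
    and the greedy pass does not merge groups that later become adjacent."""
    groups = []
    for p in positions:
        for g in groups:
            if any(np.hypot(p[0] - q[0], p[1] - q[1]) < radius for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups
```

Temporal coherence would additionally require that the particles of a group move consistently across frames, which is what the particle filtering stage enforces.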
Experimental results
Our goal is to count tracked vehicles on user-defined virtual loops. In order to validate the proposed method, we compared the obtained results in terms of vehicle counts with other approaches representative of the state of the art, such as the methods proposed by Kim, 2008, Bouvie et al., 2013 that also use particle filtering, the method proposed by Sánchez et al. (2011) that does not use particle filtering to detect moving vehicles, and the method proposed by Yuan et al. (2013) that relies on
Conclusion
This paper presents a new method for detecting and counting vehicles on urban traffic video sequences. The proposed method uses a particle filtering approach to measure the sampling particles motion coherence and spatial adjacency, and associates groups of sampling particles to moving vehicles locations in urban video sequences. Moving vehicles are detected when the groups of sampling particles have convex shapes, and the group members (i.e. moving particles) are persistent and show similar
Acknowledgment
This work was supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil.
References (34)
- et al. (2012). Advanced formation and delivery of traffic information in intelligent transportation systems. Expert Systems with Applications.
- et al. (2009). Hebbr2-taffic: A novel application of neuro-fuzzy network for visual based traffic monitoring system. Expert Systems with Applications.
- et al. (2012). Pedestrian detection for intelligent transportation systems combining AdaBoost algorithm and support vector machine. Expert Systems with Applications.
- et al. (2012). Moving object detection and tracking by using annealed background subtraction method in videos: Performance optimization. Expert Systems with Applications.
- et al. (2011). A background subtraction algorithm for detecting and tracking vehicles. Expert Systems with Applications.
- et al. (2012). Intelligent automatic overtaking system using vision for vehicle detection. Expert Systems with Applications.
- (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics.
- Barjatya, A. (2004). Block matching algorithms for motion estimation. Final project paper for spring 2004 digital image...
- et al. Vehicle tracking by non-drifting mean-shift using projective Kalman filter.
- et al. Tracking and counting vehicles in traffic video sequences using particle filtering.
- Real-time tracking of non-rigid objects using mean shift.
- Image analysis and rule-based reasoning for a traffic monitoring system. IEEE Transactions on Intelligent Transportation Systems.
- The representation and recognition of human movement using temporal templates.
- Digital image processing.
- An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- Real time object tracking based on dynamic feature grouping with background subtraction.
☆ The authors thank CNPq – Conselho Nacional de Desenvolvimento Científico e Tecnológico, CAPES – Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brazil, and DIGICON for funding this project.