A novel video based system for detecting and counting vehicles at user-defined virtual loops

https://doi.org/10.1016/j.eswa.2014.09.045

Highlights

  • We present a system for detecting and counting vehicles in urban traffic videos.

  • The detected and tracked vehicles are counted at user-defined virtual loops.

  • A background model is defined using Mixtures of Gaussians and Motion Energy Images.

  • Vehicle detection and tracking relies on a flexible particle clustering scheme.

  • Experiments and comparisons with existing methods suggest that the proposed method is potentially more reliable.

Abstract

This paper presents a new system for detecting and counting vehicles at user-defined virtual loops in urban traffic videos. The proposed method uses motion coherence and spatial adjacency to group sampling particles in urban video sequences. A foreground mask is created using Gaussian Mixture Models and Motion Energy Images to determine the preferred locations for particle sampling, and the convex particle groups are then analyzed to detect the vehicles. After a vehicle is detected, it is tracked using the similarity of its colors in adjacent frames. The vehicles are counted at user-defined virtual loops by detecting the intersections of the tracked vehicles with these loops. The experimental results, based on different traffic videos with a total of 80,000 video frames, suggest that our approach can potentially be more reliable than comparable methods available in the literature.

Introduction

Traffic management can bring many benefits to drivers, pedestrians, governments, and the environment. Information about traffic conditions can be used in several ways, such as to synchronize traffic lights, assist drivers in selecting routes, and assist governments in planning the expansion of the traffic system and the construction of new roads. Drivers benefit from less time spent in urban and road traffic, resulting in savings and a better quality of life. Governments acquire data for designing better solutions for urban and road traffic, and the environment benefits from the reduction in pollutant emissions that results from an optimized flow of vehicles.

Conventional techniques for measuring traffic flow, such as inductive loops and sonar or microwave detectors, have disadvantages such as high installation cost, traffic disruption during installation and maintenance, and, usually, an inability to detect slow or stationary vehicles (Mandellos, Keramitsoglou, & Kiranoudis, 2011).

The recent improvements in sensor and communication technologies allow local transport authorities to closely monitor the conditions of urban transport systems, promoting the development of a wide variety of techniques for monitoring traffic flow and for collecting data on traffic flow characteristics (Cho, Quek, Seah, & Chong, 2009).

The use of image-based sensors and computer vision techniques to acquire data on vehicle traffic has been intensely investigated in recent years, since traffic videos provide more information about the traffic of vehicles than other classes of sensors (e.g. inductive loops and sonar or microwave detectors), and such video-based systems can sometimes expand their monitoring capabilities by taking advantage of the video cameras already installed on site (Tian, Yao, Gu, Wang, & Li, 2011). Moreover, video-based systems are easy to install and easy to upgrade, since the system and its functionalities can be redesigned by updating the installed algorithms. Among their several possible applications, such video-based systems can be used for counting and classifying vehicles, measuring vehicle speeds, and identifying traffic incidents (Mandellos et al., 2011).

Therefore, the current technological trend in traffic monitoring is oriented towards video-based systems, since video sensors have relatively low maintenance costs and allow vehicles to be detected and counted in a non-intrusive way. Moreover, several applications demand traffic video surveillance nowadays, such as: providing essential traffic and travel information to drivers, so that road safety and traffic efficiency can be improved (Cheng, Gau, Huang, & Hwang, 2012); detecting pedestrians in intelligent transportation systems (ITS); providing traffic data to safety driving assistance systems (SDASs) (Guo, Ge, Zhang, Li, & Zhao, 2012); assisting vehicle overtaking (Milanés et al., 2012); detecting and extracting vehicles in traffic surveillance scenarios (Mandellos et al., 2011); and counting vehicles and/or detecting traffic incidents (Cho et al., 2009).

Currently, there are several methods for detecting, tracking and counting vehicles in traffic videos. Generally, these methods start by separating the static part of the scene (the background) from the non-static part (the foreground), where the moving objects of interest (i.e. moving vehicles) are usually found (Tian et al., 2011). Various techniques can be used to segment the background and the foreground. Subtracting a static background model from each video frame is often used. This background model can be obtained with simple methods, such as averaging the pixel intensities over a set of frames (Lai & Yung, 1998), or with more elaborate methods, such as building Gaussian Mixture Models for each background pixel (Stauffer & Grimson, 1999), reconstructing the background (Mandellos et al., 2011), or determining the optimal threshold for foreground–background segmentation and object detection (Karasulu & Korukoglu, 2012). However, it is often challenging for background subtraction methods to deal with noise, illumination changes, occlusions, and the splitting of multiple objects that have been incorrectly merged by the foreground segmentation process. Other approaches, such as computing pixel-by-pixel differences between two or more adjacent frames, have also been used to detect the objects of interest; frame differencing is more robust to illumination variations than background subtraction, but it can only detect objects moving against a static background (Cucchiara, Piccardi, & Mello, 2000). To avoid incorrectly merging spatially close vehicles (e.g. when cast shadows are present), shadow removal has been investigated as a way to improve vehicle identification (Zhong & Junping, 2008).
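
For concreteness, the sketch below contrasts the two foreground segmentation strategies discussed above: a per-pixel Gaussian-mixture background model in the spirit of Stauffer and Grimson (1999), via OpenCV's MOG2 implementation, and simple adjacent-frame differencing. It is a minimal illustration under our own assumptions (the input file name and the threshold values are hypothetical), not a reproduction of any of the cited implementations.

```python
import cv2

# Gaussian-mixture background model (Stauffer & Grimson style), as
# implemented by OpenCV's MOG2 subtractor. Parameter values are illustrative.
mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                         detectShadows=True)

cap = cv2.VideoCapture("traffic.mp4")   # hypothetical input video
ok, prev = cap.read()
while ok:
    ok, frame = cap.read()
    if not ok:
        break

    # (1) Background subtraction: MOG2 labels shadow pixels with gray (127),
    # so a high threshold keeps only confident foreground pixels.
    fg_gmm = mog.apply(frame)
    _, fg_gmm = cv2.threshold(fg_gmm, 200, 255, cv2.THRESH_BINARY)

    # (2) Frame differencing: only objects moving between consecutive
    # frames respond, while slow illumination drift is largely ignored.
    diff = cv2.absdiff(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY))
    _, fg_diff = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    prev = frame
```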

There are several methods for detecting and tracking moving vehicles (Tian et al., 2011). The approaches used for detecting targets (i.e. vehicles) are often model-based methods that use prior knowledge to detect the desired targets (Lai et al., 2010; Shen, 2008), deformable templates that are matched against known vehicle models in the video frames (Takeuchi, Mita, & McAllester, 2010), or methods that rely on simpler features such as corners and edges (Tu, Xu, & Zhou, 2008). The identified targets (vehicles) are often tracked using approaches such as mean-shift (Bouttefroy, Bouzerdoum, Phung, & Beghdadi, 2008), Kalman filtering (Xie, Zhu, Wang, Xu, & Zhang, 2005), or particle filtering (Scharcanski, de Oliveira, Cavalcanti, & Yari, 2011). Different schemes have been proposed for vehicle counting, such as incrementing a vehicle counter whenever a new vehicle is detected in the video scene (Sánchez, Suarez, Conci, & de Oliveira Nunes, 2011), incrementing a vehicle counter only when the tracked vehicles are on pre-defined virtual loops (Tseng, Lin, & Smith, 2002), or counting new vehicles passing at user-defined virtual loops without previously tracking these vehicles (Purnama, Zaini, Putra, & Hariadi, 2009).
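
As a brief illustration of one of the tracking options listed above, the sketch below sets up a constant-velocity Kalman filter over 2-D vehicle centroids with OpenCV. This is a generic textbook formulation under our own assumptions (illustrative noise covariances, one detection per frame), not the specific filter of Xie et al. (2005).

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter: state [x, y, vx, vy], measurement [x, y].
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2      # illustrative
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1  # illustrative

def track_step(detection_xy):
    """Predict the next centroid position, then correct with a detection."""
    predicted = kf.predict()
    kf.correct(np.asarray(detection_xy, np.float32).reshape(2, 1))
    return predicted[:2].ravel()
```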

Despite the recent advances, there are still challenging issues in vehicle detection and tracking, such as: (a) detecting the foreground accurately, especially when there are rapid changes in background lighting or imaging artifacts; (b) identifying the vehicles to be tracked when there are multiple vehicles in the scene; and (c) tracking vehicles in occlusion situations, especially when a vehicle being tracked is partially (or completely) occluded by other vehicles or obstacles. In the present work, we try to address the first two challenges, and restrict ourselves to cases where the camera positioning minimizes vehicle occlusions.

The proposed method improves on the scheme presented by Bouvie, Scharcanski, Barcellos, and Escouto (2013) by providing a new segmentation of the moving vehicles against the background (road or street), which tends to be robust to artifacts in traffic videos, leading to fewer vehicle counting errors. The approach used to estimate the background in Bouvie et al. (2013) relies on a simple temporal median, which has limitations when the scene illumination changes abruptly. In the present work, we use a background model based on Mixtures of Gaussians, which is more robust to scene illumination changes, improving background and vehicle detection even in adverse conditions. Vehicle tracking is performed by a particle filtering method that is significantly more robust than the approach proposed in Bouvie et al. (2013), as the comparative experimental results indicate. Vehicle counting is performed by detecting the intersection of the tracked particle groups (i.e. moving vehicles) with a set of user-defined virtual loops. The experimental comparisons with methods representative of the state of the art (Bouvie et al., 2013; Kim, 2008; Sánchez et al., 2011; Yuan et al., 2013) suggest that the proposed approach can achieve more accurate results in terms of vehicle detection and counting, while better handling challenging vehicle tracking issues, such as tracking long vehicles, which other methods tend to divide into smaller moving objects, leading to inaccuracies in vehicle counting.
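
The virtual-loop counting step lends itself to a simple geometric formulation: a vehicle is counted when the segment joining its tracked centroid positions in two consecutive frames crosses a user-defined loop. The sketch below is a minimal version of this idea under our own assumptions (loops modeled as line segments, one centroid per tracked vehicle); it illustrates the principle rather than reproducing the authors' code.

```python
def _ccw(a, b, c):
    """True if the points a, b, c are in counter-clockwise order."""
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(p1, p2, q1, q2):
    """True if segment p1-p2 properly intersects segment q1-q2."""
    return (_ccw(p1, q1, q2) != _ccw(p2, q1, q2) and
            _ccw(p1, p2, q1) != _ccw(p1, p2, q2))

def count_loop_crossings(track, loop_a, loop_b):
    """Count consecutive-frame centroid moves that cross the virtual loop."""
    return sum(segments_intersect(track[i], track[i + 1], loop_a, loop_b)
               for i in range(len(track) - 1))

# Example: a centroid track crossing a horizontal loop placed at y = 100.
track = [(50, 80), (52, 95), (55, 110)]
print(count_loop_crossings(track, (0, 100), (200, 100)))  # -> 1
```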

This paper is organized as follows: Section 2 presents our proposed vehicle tracking method, Section 3 presents and discusses the obtained experimental results, and finally Section 4 concludes with our final remarks.

Section snippets

Our proposed vehicle detection and counting method

In order to reduce the number of pixels that must be processed, we sub-sample the video frames using particles (see Section 2.1). Particles belonging to the same vehicle are assumed to be: (a) spatially coherent, i.e. particles associated with the same vehicle must be spatially close to each other, and groups of particles must be distant from each other if associated with different vehicles; (b) temporally coherent, meaning that particles associated with a vehicle appearing in a given frame
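
A minimal sketch of this particle scheme, under our own assumptions (uniform sampling restricted to the foreground mask, followed by greedy single-link spatial grouping; the paper's actual clustering rules are detailed in Section 2), could look as follows.

```python
import numpy as np

def sample_particles(fg_mask, n_particles=500, seed=0):
    """Draw particles only at active foreground pixels of a binary mask."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(fg_mask)
    if len(xs) == 0:
        return np.empty((0, 2), int)
    idx = rng.choice(len(xs), size=min(n_particles, len(xs)), replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)

def group_particles(particles, radius=15.0):
    """Greedy single-link grouping: a particle closer than `radius` to any
    member of a group joins that group; a particle bridging two groups
    merges them (spatial coherence assumption (a) above)."""
    groups = []
    for p in particles:
        merged = None
        for g in groups:
            if any(np.hypot(p[0] - q[0], p[1] - q[1]) <= radius for q in g):
                if merged is None:
                    g.append(p)
                    merged = g
                else:
                    merged.extend(g)
                    g.clear()
        if merged is None:
            groups.append([p])
    return [np.array(g) for g in groups if g]
```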

Experimental results

Our goal is to count tracked vehicles on user-defined virtual loops. In order to validate the proposed method, we compared the obtained results in terms of vehicle counts with other approaches representative of the state of the art, such as the methods proposed by Kim (2008) and Bouvie et al. (2013), which also use particle filtering, the method proposed by Sánchez et al. (2011), which does not use particle filtering to detect moving vehicles, and the method proposed by Yuan et al. (2013) that relies on

Conclusion

This paper presents a new method for detecting and counting vehicles in urban traffic video sequences. The proposed method uses a particle filtering approach to measure the motion coherence and spatial adjacency of the sampling particles, and associates groups of sampling particles with moving vehicle locations in urban video sequences. Moving vehicles are detected when the groups of sampling particles have convex shapes, and the group members (i.e. moving particles) are persistent and show similar
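
One plausible way to implement the convexity test mentioned above, sketched here under our own assumptions (the paper's exact criterion is given in Section 2), is to rasterize a particle group and compare its blob area with the area of its convex hull, i.e. its solidity:

```python
import cv2
import numpy as np

def is_convex_group(particles, img_shape, min_solidity=0.85):
    """Rasterize a particle group and test whether its footprint is
    nearly convex: solidity = blob area / convex hull area."""
    mask = np.zeros(img_shape[:2], np.uint8)
    for x, y in particles:
        cv2.circle(mask, (int(x), int(y)), 4, 255, -1)  # dilate each particle
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return False
    blob = max(contours, key=cv2.contourArea)
    hull_area = cv2.contourArea(cv2.convexHull(blob))
    return hull_area > 0 and cv2.contourArea(blob) / hull_area >= min_solidity
```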

Acknowledgment

This work was supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brazil. The authors also thank the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brazil, and DIGICON for funding this project.

References (34)

  • D. Comaniciu et al. Real-time tracking of non-rigid objects using mean shift.

  • R. Cucchiara et al. (2000). Image analysis and rule-based reasoning for a traffic monitoring system. IEEE Transactions on Intelligent Transportation Systems.

  • J.W. Davis et al. The representation and recognition of human movement using temporal templates.

  • R.C. Gonzalez et al. (2002). Digital image processing.

  • P. KaewTraKulPong & R. Bowden (2002). An improved adaptive background mixture model for real-time tracking with shadow detection.

  • T. Kanungo et al. (2002). An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • Z. Kim (2008). Real time object tracking based on dynamic feature grouping with background subtraction.
