A real-time vehicle detection and a novel vehicle tracking systems for estimating and monitoring traffic flow on highways

https://doi.org/10.1016/j.aei.2021.101393

Abstract

Real-time highway traffic monitoring systems play a vital role in road traffic management and planning, and in preventing frequent traffic jams, traffic rule violations, and fatal road accidents. These systems rely entirely on online traffic flow information estimated from time-dependent vehicle trajectories, which are extracted from vehicle detection and tracking data obtained by processing road-side camera images. General-purpose object detectors such as Yolo, SSD, and EfficientNet have been used extensively for real-time object detection tasks, but Yolo is generally preferred because it provides high frame-per-second (FPS) performance and robust object localization. However, this algorithm’s average vehicle classification accuracy is below 57%, which is insufficient for traffic flow monitoring. This study proposes improving the vehicle classification accuracy of Yolo and developing a novel bounding box (Bbox)-based vehicle tracking algorithm. For this purpose, a new vehicle dataset was prepared by annotating 7216 images with 123831 object patterns collected from highway videos. Nine machine learning-based classifiers and a CNN-based classifier were selected and trained on this dataset. The classifier with the highest accuracy among the ten was then selected and combined with Yolo. In this way, the classification accuracy of the Yolo-based vehicle detector was increased from 57% to 95.45%. Vehicle detector 1 (Yolo) and vehicle detector 2 (Yolo + best classifier), together with Kalman filter-based tracking as vehicle tracker 1 and Bbox-based tracking as vehicle tracker 2, were applied to categorical/total vehicle counting tasks on 4 highway videos. The vehicle counting results show that the counting accuracy of the developed approach (vehicle detector 2 + vehicle tracker 2) improved by 13.25%, and that this method performed better than the other 3 vehicle counting systems implemented in this study.

Introduction

Classification of vehicles (such as cars, trucks, buses, motorbikes, or bicycles) on urban roads/highways, and estimation of statistical traffic flow information (for example, the flow frequency of vehicles and the number of vehicles of each type travelling in each direction), are important inputs for urban/highway traffic analysis and planning tools [1]. However, real-time highway traffic flow monitoring remains a challenging issue for urban areas in this modern age of growing technology and population. Poor road/highway traffic management results in frequent traffic jams, traffic rule violations, and fatal road accidents. Using traditional sensing techniques (RADAR, LIDAR, RFID, or LASER) to address this problem is time-consuming, expensive, and tedious [2]. In cases where such sensors are insufficient, human observers travel to the region and count the vehicles passing through. This is not a practical solution, however, and these methods cannot generate real-time traffic flow information; nor can they classify vehicles or provide information such as the number of vehicles by type and moving direction [3]. In contrast, recent artificial intelligence (computer vision) approaches, especially deep and machine learning-based image processing techniques, are now used in online video processing systems. These systems generally contain vehicle detection, tracking (associating best-matched vehicle peers in successive frames), time-dependent trajectory extraction, and traffic flow information estimation units. Traffic flow information includes speed, the categorical or total number of vehicles, the vehicles’ entry and exit points in a specified area, and the time elapsed between entry and exit. All this information is extracted from the vehicle trajectories, with vehicle detection and tracking methods at their core.
Thus, robust and high-performance vehicle detection and tracking algorithms are critical for these kinds of systems [4], [5].

Vehicle detection is a technique for recognizing the type of target objects and localizing them in a video frame. Object detection algorithms are usually divided into conventional machine learning and deep learning methods. Conventional detection techniques, such as “Background Subtraction (BS) + Support Vector Machines (SVM)”, “BS + K-Nearest Neighbor (KNN)”, or other classical detection algorithms based on Speeded Up Robust Features (SURF) or Scale Invariant Feature Transform (SIFT) visual features, rely on manually selected, hand-crafted feature vectors. All of these algorithms therefore require deep knowledge and expertise to select the most representative features of the target objects [6]. Manually determining the most effective and representative features that perfectly describe the contents of objects in an image frame is a daunting task. Furthermore, classical detection methods are very slow, which makes them insufficient for real-time traffic flow monitoring. Deep learning methods, on the other hand, do not require expertise or a deep understanding of the contents of objects in an image, because these approaches automatically extract deep and hidden features using deep neural networks (DNNs). DNNs use tens of hidden layers containing linear or non-linear activation functions, giving them the capability to extract feature vectors from raw images and learn to make accurate and optimal decisions [3].
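As a concrete illustration of the classical background-subtraction step mentioned above, the following is a minimal toy sketch (a per-pixel median background model with frame differencing, in NumPy); it illustrates the general technique only and is not the authors' implementation:

```python
import numpy as np

def background_subtract(frames, threshold=25):
    """Classical foreground detection: model the background as the
    per-pixel median over a window of frames, then flag pixels whose
    absolute deviation from that model exceeds a threshold."""
    stack = np.stack(frames).astype(np.int16)
    background = np.median(stack, axis=0)
    masks = [np.abs(f - background) > threshold for f in stack]
    return background, masks

# Toy example: a static 8x8 "road" with one bright moving "vehicle".
frames = [np.full((8, 8), 50, dtype=np.uint8) for _ in range(5)]
for i, f in enumerate(frames):
    f[3, i] = 200  # the vehicle advances one pixel per frame
background, masks = background_subtract(frames)
# Each mask isolates the single moving pixel in its frame.
```

In a real pipeline the resulting foreground masks would be cleaned with morphological operations and fed to a classifier such as SVM or KNN, which is exactly where the speed bottleneck discussed above arises.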

Vehicle tracking is a technique for re-identifying detected objects and associating them with their best-matched peers across consecutive frames. Pixel, shape, color, and bounding box (Bbox) information are widely used to trace detected objects and extract object trajectories. Although pixel-, shape-, and color-based tracking methods are considered robust for tracking objects through successive frames, they are too slow for real-time video analysis applications. Methods such as the Kalman or Particle filter tracking algorithms, which use bounding box information, are somewhat faster than pixel-, shape-, or color-based approaches because they process only the coordinate information of the detected objects. However, these methods also fall short when the number of objects in a frame increases [7]. For instance, the Kalman and Particle filter tracking algorithms struggle on highways where objects move very fast and more than 30 objects appear in a frame. Furthermore, vehicle detection and tracking have long been challenging tasks in classical computer vision and image processing research because of issues such as partial or full occlusion of objects, illumination changes, camera shaking, extremely high- or low-quality images, and adverse weather conditions including rain, snow, and wind, which complicate the detection, tracking, and data association processes and, in some cases, make such systems fail completely [8]. For these reasons, a robust, real-time tracking algorithm is paramount for effective and efficient video analysis tools.
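The general idea behind Bbox-based data association can be sketched as a greedy intersection-over-union (IoU) matcher between the boxes of two consecutive frames. This is a hedged illustration of the technique class, not the paper's actual tracking algorithm:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Greedy frame-to-frame association: link each previous box to the
    unmatched current box with the highest IoU above a threshold."""
    matches, used = {}, set()
    for i, pb in enumerate(prev_boxes):
        best_j, best = None, min_iou
        for j, cb in enumerate(curr_boxes):
            if j in used:
                continue
            score = iou(pb, cb)
            if score > best:
                best_j, best = j, score
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches
```

Because such a matcher touches only box coordinates, its cost grows with the square of the number of boxes per frame, which hints at why coordinate-based trackers slow down on crowded highway scenes.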

General-purpose object detection architectures such as Yolo, Single-Shot-Detector (SSD), and EfficientNet are widely used for online vehicle detection tasks; among them, Yolo has clear advantages with its high frame-per-second (FPS) rate and robust vehicle localization functionality. Nevertheless, this algorithm’s average vehicle classification accuracy is below 57%, which is not enough for traffic flow monitoring systems [9], [10], [11]. This study proposes improving the vehicle classification accuracy of the Yolo algorithm by combining it with a robust classification layer selected from the accuracy results of 10 classification algorithms. Additionally, a novel Bbox-based vehicle tracking algorithm was developed in this study. For these purposes, a new vehicle dataset was prepared by annotating 7216 images with 123831 object patterns obtained directly from the selected road/highway videos [12]. Nine object classification algorithms plus a CNN-based classification method were trained on the dataset. The classifier with the highest accuracy among the ten was selected and combined with Yolo. In this way, the classification accuracy of the Yolo-based vehicle detection algorithm was increased from 57% to 95.45%. The flow-chart of the entire process is illustrated in Fig. 1. In addition, we implemented Kalman filter-based vehicle tracking. Then, Yolo and “Yolo + the best classifier” as vehicle detectors, and the Bbox-based and Kalman filter-based trackers as vehicle tracking algorithms, were applied to categorical and total vehicle counting tasks on 4 highway videos, see Fig. 2. The vehicle counting results show that vehicle counter 2 (Yolo + the best classifier + the Bbox tracker) performed with, on average, 13.25% better accuracy than vehicle counter 1 (Yolo + the Bbox tracker), and this approach outperformed the other vehicle counting systems developed in this study. The contributions of the study are as follows:

  • (i)

    creating a new dataset by annotating 7216 images with 123831 object patterns collected directly from road/highway videos; implementing nine classifiers and developing our own CNN-based classifier, and training all of them on the new vehicle dataset; then determining the classifier with the highest accuracy among the ten, and developing a real-time, high-accuracy vehicle detection system by combining this best classifier with Yolo,

  • (ii)

    developing a novel bounding-box-based vehicle tracking algorithm and implementing the Kalman filter-based tracking algorithm, then applying both vehicle detectors and both trackers to the vehicle trajectory extraction task,

  • (iii)

    developing real-time traffic flow monitoring systems that can process up to 500 vehicles in a video frame simultaneously. The systems monitor traffic flow by estimating the categorical and total numbers of vehicles from the extracted vehicle trajectories. Four case-study highway videos were processed via the developed traffic flow monitoring systems, and the accuracy of the resulting vehicle counting systems was compared across the developed vehicle detectors and trackers.
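The two-stage detection idea in contribution (i), in which a general-purpose detector localizes objects and a stronger classifier re-labels each cropped patch, can be sketched as follows. The interfaces and names here are illustrative stand-ins, not the authors' actual code:

```python
import numpy as np

def detect_and_classify(frame, detector, classifier):
    """Two-stage detection: a generic detector proposes boxes and coarse
    labels; a second-stage classifier re-labels each cropped patch."""
    refined = []
    for (x1, y1, x2, y2), coarse_label, score in detector(frame):
        patch = frame[y1:y2, x1:x2]   # crop the detected region
        label = classifier(patch)     # second-stage class decision
        refined.append(((x1, y1, x2, y2), label, score))
    return refined

# Toy stand-ins for the detector and classifier.
def toy_detector(frame):
    return [((10, 10, 40, 30), "car", 0.91)]

def toy_classifier(patch):
    # A real second stage would run a trained CNN/ML model on the patch.
    return "truck" if patch.mean() > 100 else "car"

frame = np.full((100, 100), 150, dtype=np.uint8)
results = detect_and_classify(frame, toy_detector, toy_classifier)
```

The design rationale is that the detector's localization is kept (its strength) while its weak class decision is overridden by a classifier trained on a domain-specific vehicle dataset.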

Section snippets

Literature review

Extraction of valuable traffic flow information by analyzing video scenes is crucial for a vision-based highway/intersection monitoring and management system (a vision-based traffic flow monitoring and management system), since it enables dynamic intelligent transportation systems (ITS), an important component of smart (sustainable) cities. The system relies fully on vehicle recognition and the extraction of vehicle trajectories. Recognizing vehicles and extraction of their trajectory data

Methodology

The developed highway traffic monitoring system consists of four main modules: vehicle detection, tracking, trajectory extraction, and traffic flow information estimation. In this study, the categorical and total numbers of vehicles were estimated as traffic flow information for four highway videos. These videos were processed via two vehicle detection approaches and two vehicle tracking algorithms. Vehicle detector 1 was based on the general-purpose weight model of the Yolo object detection algorithm, and
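One common way to turn extracted trajectories into categorical/total counts is a virtual counting line. The sketch below illustrates that general technique only (it is not the authors' exact counting method): a trajectory is a list of (x, y) centroids over time, and a vehicle is counted when its trajectory crosses a horizontal line, split by travel direction:

```python
def count_line_crossings(trajectories, line_y=240):
    """Count trajectories crossing a horizontal counting line, split by
    travel direction; each trajectory is counted at most once."""
    counts = {"down": 0, "up": 0}
    for traj in trajectories:
        for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
            if y0 < line_y <= y1:      # moving toward larger y
                counts["down"] += 1
                break
            if y1 < line_y <= y0:      # moving toward smaller y
                counts["up"] += 1
                break
    return counts

# Two vehicles crossing the line in opposite directions, one that never does.
tracks = [[(5, 100), (5, 200), (5, 300)],
          [(8, 300), (8, 200), (8, 100)],
          [(3, 50), (3, 60)]]
totals = count_line_crossings(tracks, line_y=240)
```

Keeping the vehicle's class label alongside each trajectory turns the same loop into a categorical counter.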

Training and test results of the classification systems

The accuracy results of the ten classification algorithms are illustrated in Table 1. The weighted average accuracy was used in this study because the distribution of vehicle numbers by type is not balanced. The dataset includes 123831 images, each containing exactly one object label. The dataset was split into training and test parts at a 75/25 ratio, i.e., 92873/30958 object images, respectively. In the test set, the distribution of the numbers of vehicles by vehicle type is 23518
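The weighted average accuracy mentioned above weights each class's accuracy by its test-set support, so that large classes dominate the average exactly in proportion to their frequency. A minimal sketch follows; the per-class accuracies and the truck/bus counts are illustrative assumptions, not the paper's actual per-class results:

```python
def weighted_average_accuracy(per_class_accuracy, support):
    """Average the per-class accuracies, weighting each class by its
    number of test samples (support); appropriate when class sizes are
    imbalanced, as with vehicle types here."""
    total = sum(support.values())
    return sum(per_class_accuracy[c] * support[c] for c in support) / total

# Illustrative numbers only (30958 total test images as in the text).
acc = {"car": 0.97, "truck": 0.92, "bus": 0.90}
n = {"car": 23518, "truck": 5000, "bus": 2440}
overall = weighted_average_accuracy(acc, n)
```

With imbalanced supports like these, the unweighted mean of the three accuracies (about 0.93) would understate the performance experienced over the actual test distribution.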

Conclusion

A real-time traffic flow data extraction system was developed by processing ordinary camera images with vehicle detection and tracking algorithms. On four highway videos, the categorical and total numbers of vehicles were estimated with two vehicle counting systems. Vehicle counting systems 1 and 2 were built on vehicle detectors 1 and 2, respectively. Vehicle detector 1 was based on the general-purpose weight model of Yolo, and vehicle detector 2 was developed by combining a CNN-based

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant No. 119E077, titled “Development of a Customized Traffic Planning System for Sakarya City by Processing Multiple Camera Images with Convolutional Neural Networks (CNN) and Machine Learning Techniques”.

References (38)

  • P. Liu et al.

    Vehicle tracking based on shape information and inter-frame motion vector

    Comput. Electr. Eng.

    (2019)
  • D. Song et al.

    Multi-vehicle tracking with microscopic traffic flow model-based particle filtering

    Automatica

    (2019)
  • X. Xiao et al.

    A Kalman filter algorithm for identifying track irregularities of railway bridges using vehicle dynamic responses

    Mech. Syst. Signal Process.

    (2020)
  • T. Yang et al.

    Online multi-object tracking combining optical flow and compressive tracking in Markov decision process

    J. Vis. Commun. Image Represent.

    (2019)
  • J. Yang et al.

    Tracking multiple workers on construction sites using video cameras

    Adv. Eng. Inform.

    (2010)
  • S. Khan et al.

    An intelligent monitoring system of vehicles on highway traffic

  • Z. Zhao et al.

    Object detection with deep learning: A review

    IEEE Trans. Neural Netw. Learn. Syst.

    (2019)
  • N.K. Chauhan et al.

    A review on conventional machine learning vs deep learning

  • S.R.E. Datondji et al.

    A survey of vision-based traffic monitoring of road intersections

    IEEE Trans. Intell. Transp. Syst.

    (2016)

    Jahongir Azimjonov is a Ph.D. researcher studying computer vision/image processing and machine and deep learning methods for intelligent transportation systems.

    Ahmet Özmen is a full professor in the Department of Software Engineering. His research interests are computer vision and system monitoring.
