Multi-label learning with multi-label smoothing regularization for vehicle re-identification

doi:10.1016/j.neucom.2018.11.088

Neurocomputing

Volume 345, 14 June 2019, Pages 15-22

https://doi.org/10.1016/j.neucom.2018.11.088

Abstract

Vehicle re-identification (re-ID) is a vital technique for urban intelligent video surveillance systems and smart cities. Given a query vehicle image, vehicle re-ID aims to search for and retrieve images of the same vehicle captured by different surveillance cameras from various viewing angles. Based on the observation that essential vehicle attributes, such as a vehicle’s color and type (e.g., sedan, bus, truck, and so on), can serve as important traits for recognizing a vehicle, this paper proposes an effective multi-label learning (MLL) method that simultaneously learns three labels: the vehicle’s ID, type, and color. With these three labels, a multi-label smoothing regularization (MLSR) is further proposed, which allocates a uniform label distribution to the multi-labeled training images to regularize the MLL model and improve vehicle re-ID performance. Extensive experiments on the VeRi and VehicleID datasets demonstrate that the proposed MLL with MLSR approach effectively improves on the baseline and outperforms multiple state-of-the-art vehicle re-ID methods.

Introduction

Vehicle recognition is an important research topic in urban intelligent video surveillance systems and smart cities. This computer vision task arises in many applications, such as vehicle classification [1], [2], vehicle tracking [3], [4], and vehicle detection [5], [6], to name a few. For these applications, vehicle re-identification (re-ID) is an essential functionality that aims to search for and retrieve all images of the query vehicle captured by different cameras from various viewing angles; see Fig. 1 for an illustration. Developing an effective vehicle re-ID method is therefore a challenging task and is the focus of this paper.
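The retrieval step described above can be sketched as ranking gallery images by feature similarity to the query. This is only an illustration under stated assumptions: the feature vectors, the image names, the use of cosine similarity, and the helper names `cosine_similarity` and `rank_gallery` are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query_feature, gallery):
    """Return gallery image names sorted from most to least similar.

    `gallery` maps an image name to its feature vector (assumed layout).
    """
    scored = [
        (cosine_similarity(query_feature, feature), name)
        for name, feature in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Toy, made-up features: two views of vehicle A and one view of vehicle B.
gallery = {
    "cam2_vehicleA": [0.9, 0.1, 0.0],
    "cam3_vehicleB": [0.0, 1.0, 0.2],
    "cam5_vehicleA": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # query image of vehicle A from another camera

ranking = rank_gallery(query, gallery)
print(ranking)  # both vehicle A views rank ahead of vehicle B
```

In a real system the feature vectors would come from a trained network rather than being written by hand, and the distance measure is a design choice.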

Compared with person re-ID [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], vehicle re-ID is a more recently emerged computer vision research topic. Appearance-based person re-ID has been extensively studied. Existing methods use low-level features, such as color histograms [7] and the scale-invariant feature transform (SIFT) [18]; high-level features, such as deep learning features, which achieve more promising performance [11], [12], [17]; and spatial-temporal features [8], [10]. Recently, appearance-based vehicle re-ID has drawn increasing attention. Many well-designed low-level features, such as LOMO [9], BOW-CN [13], and BOW-SIFT [19], have been applied to the vehicle re-ID problem. Some well-known deep convolutional neural networks, such as VGGNet [20] and GoogLeNet [21], have demonstrated performance superior to that of the low-level feature approaches (e.g., [22], [23], [24]).

In this paper, an effective deep neural network based multi-label learning (MLL) method is proposed for tackling the vehicle re-ID problem. To this end, three essential pieces of information about each vehicle (treated as labels) are learned simultaneously: the vehicle’s ID, category (or model), and color. Nine categories are considered, such as sedan, bus, lorry, cargo container, and so on. Another novelty of this work is a multi-label smoothing regularization (MLSR), proposed for the MLL process on multi-labeled vehicle images. More specifically, multi-labeled vehicle images are first fed into the ResNet-50 MLL model [25]. The MLSR is then exploited to regularize the learning process by integrating the multi-labeled vehicle images. To compare the vehicle re-ID performance of the proposed method with that of state-of-the-art methods, experiments have been conducted on two large vehicle datasets (i.e., VeRi [22], [23] and VehicleID [24]). Note that the images in these datasets were acquired in real-world urban surveillance environments. The results demonstrate the superiority of the proposed method.
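A minimal sketch of how a uniform label distribution can regularize the three labels, assuming standard label smoothing applied independently to each label. The exact MLSR formulation is not given in this excerpt, so the smoothing weight `epsilon`, the function names, and the toy class counts for ID and color are assumptions; only the nine vehicle categories come from the text.

```python
def smoothed_target(true_index, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over all classes.

    Each class receives epsilon / num_classes, and the true class
    additionally receives 1 - epsilon, so the target still sums to 1.
    """
    uniform_share = epsilon / num_classes
    target = [uniform_share] * num_classes
    target[true_index] += 1.0 - epsilon
    return target


def multi_label_targets(labels, class_counts, epsilon=0.1):
    """Build one smoothed target distribution per label."""
    return {
        name: smoothed_target(labels[name], class_counts[name], epsilon)
        for name in labels
    }


# Toy setup: 9 vehicle types as in the text; 5 IDs and 4 colors are made up.
class_counts = {"id": 5, "type": 9, "color": 4}
labels = {"id": 2, "type": 0, "color": 3}  # hypothetical annotation of one image

targets = multi_label_targets(labels, class_counts)
print(targets["color"])  # true color gets most of the mass, the rest is uniform
```

Setting `epsilon=0` recovers ordinary one-hot targets, which makes it easy to compare training with and without the regularizer.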

The rest of this paper is organized as follows. Section 2 introduces the related work. Section 3 succinctly describes the proposed MLL method with MLSR for conducting vehicle re-ID. Section 4 presents experimental results. Section 5 concludes the paper.

Section snippets

Related work

In this section, existing vehicle re-ID works are reviewed. These works can be roughly divided into two categories: (1) sensor-/clue-based methods, and (2) appearance-based methods.

The proposed approach

As shown in Fig. 2, the training vehicle images are first labeled with three essential pieces of information about each vehicle (its ID, model, and color) for performing multi-label learning. The proposed MLSR is then employed in the MLL process. Both are described in detail in the following sub-sections.
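To make the idea of learning three labels at once concrete, the sketch below sums a softmax cross-entropy term over three output heads (ID, model, color). It is an illustration only: the equal head weights, the function names, and the toy logits are assumptions, and the paper's actual loss may be defined differently.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_sum for z in logits]


def cross_entropy(logits, target):
    """Cross-entropy between a soft (or one-hot) target and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def multi_label_loss(head_logits, head_targets, weights=None):
    """Weighted sum of per-head cross-entropy losses.

    `weights` defaults to 1.0 for every head (an assumption of this sketch).
    """
    if weights is None:
        weights = {name: 1.0 for name in head_logits}
    return sum(
        weights[name] * cross_entropy(head_logits[name], head_targets[name])
        for name in head_logits
    )


# Toy example: an untrained model that outputs identical logits for every class.
head_logits = {"id": [0.0, 0.0, 0.0], "model": [0.0, 0.0], "color": [0.0, 0.0]}
head_targets = {"id": [0.0, 1.0, 0.0], "model": [1.0, 0.0], "color": [0.0, 1.0]}

loss = multi_label_loss(head_logits, head_targets)
print(loss)  # equals log(3) + log(2) + log(2) for uniform predictions
```

Because `cross_entropy` accepts any target distribution that sums to one, smoothed targets can be passed in place of the one-hot targets without changing the code.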

Vehicle dataset and evaluation protocol

We mainly evaluate the proposed method on the VehicleID [24] and VeRi [22], [23] datasets, two recently released vehicle datasets proposed by Peking University and Beijing University of Posts & Telecommunications, respectively. The two datasets are described in detail below.

VehicleID contains 221,763 vehicle images of 26,267 vehicles collected by multiple non-overlapping surveillance cameras and each vehicle is only captured with front or

Conclusion

In this paper, we propose an effective multi-label learning (MLL) method for vehicle re-ID. Moreover, a multi-label smoothing regularization (MLSR) method is proposed for multi-labeled vehicle images in the proposed MLL. The proposed MLL method simultaneously learns three labels (i.e., vehicle ID, vehicle model, vehicle color), which helps to achieve better performance than the baseline model. Through the proposed MLSR method, we regularize the MLL model with a specially-designed multi-label

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under the grants 61871434, 61602191, and 61802136, in part by the Natural Science Foundation of Fujian Province under the grants 2019J06017, 2016J01308 and 2017J05103, in part by the Fujian-100 Talented People Program, in part by High-level Talent Innovation Program of Quanzhou City under the grant 2017G027, in part by the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of

Jinhui Hou received the B.E. degree in communication engineering from Huaqiao University, Xiamen, China. He is currently pursuing the M.S. degree in the School of Information Science and Engineering, Huaqiao University, China. His research interests include deep learning and object recognition.

References (40)

L. Cai et al.
HOG-assisted deep feature learning for pedestrian gender recognition
J. Frankl. Inst.
(2018)

M.A. Lalimi et al.
A vehicle license plate detection method using region and edge based methods
Comput. Electr. Eng.
(2013)

K. Kwong et al.
Arterial travel time estimation based on vehicle re-identification using wireless magnetic sensors
Transp. Res. Part C: Emerg. Technol.
(2009)

L. Yang et al.
A large-scale car dataset for fine-grained categorization and verification
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2015)

J. Sochor et al.
3D boxes as CNN input for improved fine-grained vehicle recognition
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2016)

B.C. Matei et al.
Vehicle tracking across nonoverlapping cameras using joint kinematic and appearance features
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2011)

J.-M. Guo et al.
Nighttime vehicle detection and tracking with adaptive mask training
IEEE Trans. Veh. Technol.
(2016)

X. Chen et al.
Vehicle detection in satellite images by hybrid deep convolutional neural networks
IEEE Geosci. Remote Sens. Lett.
(2017)

C.-H. Hsia et al.
Nighttime vehicle detection using hybrid feature method
J. Chung Cheng Inst. Technol.
(2017)

M. Farenzena et al.
Person re-identification by symmetry-driven accumulation of local features
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2010)

W. Zhang, Y. Li, W. Lu, X. Xu, Z. Liu, X. Ji, Learning intra-video difference for person re-identification, in:...

S. Liao et al.
Person re-identification by local maximal occurrence representation and metric learning
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2015)

W. Zhang, S. Hu, K. Liu, Z. Zha, Compact appearance learning for video-based person re-identification, in: Proceedings...

E. Ahmed et al.
An improved deep learning architecture for person re-identification
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2015)

W. Li et al.
DeepReID: deep filter pairing neural network for person re-identification
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2014)

L. Zheng et al.
Scalable person re-identification: a benchmark
Proceedings of the IEEE International Conference on Computer Vision
(2015)

W. Zhang et al.
Video-based pedestrian re-identification by adaptive spatio-temporal appearance model
IEEE Trans. Image Process.
(2017)

W. Zhang et al.
Learning bidirectional temporal cues for video-based person re-identification
IEEE Trans. Circuits Syst. Video Technol.
(2017)

J. Zhu et al.
Deep hybrid similarity learning for person re-identification
IEEE Trans. Circuits Syst. Video Technol.
(2018)

R. Zhao et al.
Unsupervised salience learning for person re-identification
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(2013)


Huanqiang Zeng received the B.S. and M.S. degrees from Huaqiao University, Xiamen, China, and the Ph.D. degree from Nanyang Technological University, Singapore, all in electrical engineering. He is now a Professor at the School of Information Science and Engineering, Huaqiao University, Xiamen, China. He was a Postdoctoral Fellow at the Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, from 2012 to 2013, and a Research Associate at the Temasek Laboratories, Nanyang Technological University, Singapore, in 2008. His research interests are in the areas of image processing and video coding, machine learning and pattern recognition, and computer vision. He has published more than 80 papers in well-known international journals and conferences. He has been actively serving as an Associate Editor for IEEE Access, IET Electronics Letters, and International Journal of Image and Graphics; a Guest Editor for multiple international journals, including Journal of Visual Communication and Image Representation, Multimedia Tools and Applications, and Journal of Ambient Intelligence and Humanized Computing; the General Co-Chair for IEEE International Symposium on Intelligent Signal Processing and Communication Systems 2017 (ISPACS2017); the Technical Program Co-Chair for Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2017 (APSIPA ASC2017); the Area Chair for IEEE International Conference on Visual Communications and Image Processing (VCIP2015); and a Technical Program Committee Member for multiple flagship international conferences. He received the Best Paper Award from the Chinese Conference on Signal Processing 2017 (CCSP2017). He is an IEEE Senior Member and a Member of the International Steering Committee of the International Symposium on Intelligent Signal Processing and Communication Systems.

Lei Cai received the B.E. degree in Detection Guidance and Control Techniques from Changchun University of Science and Technology, Changchun, China, and the M.S. degree in Information Science and Engineering from Huaqiao University, Xiamen, China. He is currently pursuing the Ph.D. degree in the School of Electronics and Information, South China Institute of Technology, Guangzhou, China. His research interests include deep learning and object recognition.

Jianqing Zhu received the B.S. degree in communication engineering and the M.S. degree in communication and information system from the School of Information Science and Engineering, Huaqiao University, Xiamen, China, in 2009 and 2012, respectively. He received the Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, Beijing, China, in 2015. He is currently an Associate Professor at the College of Engineering, Huaqiao University, Quanzhou, China. His current research interests include computer vision and pattern recognition, with a focus on image and video analysis, particularly person re-identification, object detection, and video surveillance. He was awarded the Best Biometrics Student Paper award at the International Conference on Biometrics in 2015.

Jing Chen received the B.S. and M.S. degrees from Huaqiao University, Xiamen, China, and the Ph.D. degree from Xiamen University, Xiamen, China, all in computer science. She is now an Associate Professor at the School of Information Science and Engineering, Huaqiao University, Xiamen, China. Her current research interests include image processing and video coding.

Kai-Kuang Ma received the B.E. degree (electronic engineering) from Chung Yuan Christian University, Chung Li, Taiwan, Republic of China, the M.S. degree (electrical engineering) from Duke University, Durham, NC, U.S.A., and the Ph.D. degree (electrical engineering) from North Carolina State University, Raleigh, NC, U.S.A. He is now a full Professor at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. From 1992 to 1995, he was a Member of Technical Staff at the Institute of Microelectronics (IME), Singapore, working on digital video coding and the MPEG standards. From 1984 to 1992, he was with IBM Corporation at Kingston, NY, and Research Triangle Park, NC, U.S.A., engaged in various DSP and VLSI advanced product development. His research interests are in the areas of digital image/video processing and computer vision, including digital image/video coding and standards, image/video segmentation, denoising and enhancement, interpolation and super-resolution. His research interests in computer vision include image matching and registration, scene analysis and recognition, and human–computer interaction. He has published extensively and holds one U.S. patent on a fast motion estimation algorithm. He served as Singapore MPEG Chairman and Head of Delegation (1997–2001). Among his MPEG contributions, two fast motion estimation algorithms (Diamond Search and MVFAST) produced by his research group have been adopted by the MPEG-4 standard as the reference core technology for fast motion estimation. He was the General Chair for organizing a series of international standard meetings (MPEG and JPEG) and JPEG2000 and MPEG-7 workshops held in Singapore (March 2001). He is an IEEE Fellow. He was elected a Distinguished Lecturer of the IEEE Circuits and Systems Society for 2008–2009. He has been a General Co-Chair of ISPACS2017, APSIPA2017, ACCV2016 Workshop, and VCIP-2013; Technical Program Co-Chair of ICIP-2004, ISPACS-2007, IIH-MSP-2009, and PSIVT-2010; and Area Chair of ACCV-2009 and ACCV-2010. He has served as an Editorial Board Member for several leading international journals in his research area, including as Senior Area Editor for the IEEE Transactions on Image Processing (2016–2019) and as Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology (2015-now), the IEEE Signal Processing Letters (2014–2016), the IEEE Transactions on Image Processing (2007–2010), the IEEE Transactions on Communications (1997–2012 as Editor), the IEEE Transactions on Multimedia (2002–2009), the International Journal of Image and Graphics (2003–2015), and the Journal of Visual Communication and Image Representation (2005–2015). He is an elected member of three IEEE Technical Committees: Image and Multidimensional Signal Processing (IMDSP) Committee, Multimedia Communications Committee, and Digital Signal Processing. He has served as a Technical Program Committee member, reviewer, and Session Chair of multiple IEEE international conferences. He was Chairman of the IEEE Signal Processing Singapore Chapter (2000–2002). He is a member of Sigma Xi and Eta Kappa Nu.
