Complex background classification network: A deep learning method for urban images classification☆
Introduction
Urbanization is an important indicator of a country's economic, social, cultural, and technological development [1]. As the populations of major cities grow, various “urban diseases” are seriously affecting people's quality of life, including deteriorating living conditions, traffic congestion, and declining public security [2]. China has the world's largest surveillance system, ‘Skynet’, which consists of surveillance cameras arranged throughout city streets and lanes and is used by the police to fight crime and maintain social stability and security [3]. Like the ‘Skynet’ system, urban management systems that obtain urban images from surveillance cameras are running efficiently. Citizens can also use mobile applications, the WeChat client, computers, and other terminals to conveniently upload urban images and fill in the relevant information. Managers of the information system then forward messages about urban issues to the relevant departments.
However, urban issues are increasing so rapidly that the manual processing capacity of managers cannot keep up with current demand [4]. Quick classification tools for urban issues are therefore urgently needed. In this paper, we propose a deep learning method to classify urban images with complex backgrounds and multiple objects.
Some traditional classification methods [5] use Gabor wavelets, Gaussian Markov Random Fields (GMRF), or the Scale-Invariant Feature Transform (SIFT) to extract textural features, remote sensing image features, and digital image features from images, and then apply support vector machines (SVM), Lagrangian support vector machines (LSVM), and other traditional machine learning techniques for image classification [6]. Because these methods depend on hand-crafted features, the quality of the features extracted from the input images directly influences the final classification results and can make the classification unstable. To overcome this limitation, convolutional neural network (CNN) based methods have been developed, which learn high-dimensional features from images for classification. The design of CNNs is inspired by the visual neural mechanism. LeCun et al. [13] applied the backpropagation (BP) algorithm to train such networks, which greatly promoted the development of convolutional neural networks. This basic design performs well on MNIST, CIFAR, and other image classification data sets, and especially on the ImageNet Large Scale Visual Recognition Challenge [7]. In recent years, with advances in computer hardware and intensive research on convolutional neural networks, deeper and more complete network models have broken records on ImageNet, such as AlexNet [8], very deep convolutional networks (VGGNet) [9], Inception [10], and the residual network (ResNet) [11].
Urban images are usually collected by surveillance cameras or captured manually with mobile phones. As a result, the collected images have complex backgrounds, low resolution, uneven brightness, and multiple objects [12]. A new recognition method is therefore needed for classifying urban images with complex backgrounds. Our proposed method identifies the objects that carry the key information and vital features in urban images, which can significantly improve urban image classification performance.
To overcome the challenges mentioned above, we propose a complex background classification network (CBCNet) to classify urban images. The network draws on the idea that the classification of an image depends on the key objects it contains [14]. An overview of CBCNet is shown in Fig. 1. The proposed network consists of two sub-networks: a detection sub-network and an evaluation sub-network. The detection sub-network takes urban images as input and recognizes multiple target objects in them. The images with the detected target objects are then fed into the evaluation sub-network, where a backpropagation (BP) neural network is used to adjust the coordinate parameters of the key target [13]. Thus, the proposed network can classify urban images according to the key information and vital features in the images.
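The two-stage flow described above (detect several objects, then decide which one is the key object and classify the image by it) can be illustrated with a minimal Python sketch. This is not the CBCNet implementation: the `Detection` record, the function names, and in particular the selection rule (detector confidence multiplied by box area) are assumptions made purely for illustration, standing in for the learned evaluation sub-network.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; this box layout is an assumption
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    label: str    # category predicted by the detection stage
    score: float  # detector confidence in [0, 1]
    box: Box


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes count as zero."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def select_key_object(detections: List[Detection]) -> Detection:
    """Hypothetical stand-in for the evaluation stage.

    Ranks detections by confidence times box area. The paper instead uses a
    BP neural network here; this heuristic is only a placeholder.
    """
    if not detections:
        raise ValueError("no objects detected")
    return max(detections, key=lambda d: d.score * box_area(d.box))


def classify_image(detections: List[Detection]) -> str:
    """Assign the image the category of its key object."""
    return select_key_object(detections).label


# Made-up detections for one image containing three objects
detections = [
    Detection("Exposed garbage", 0.62, (10, 20, 60, 70)),
    Detection("Illegal parking of the motor vehicle", 0.91, (100, 50, 400, 300)),
    Detection("Graffiti and Posters", 0.55, (420, 10, 470, 40)),
]
print(classify_image(detections))  # Illegal parking of the motor vehicle
```

The point of the sketch is the separation of concerns: the detection stage may return several candidate objects per image, and a second stage decides which of them determines the image-level category.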
Furthermore, we build a standard urban image dataset (UID) containing 4996 target objects in 4119 pictures across 8 categories: Graffiti and Posters, Gutter cover plate damage, Manhole cover damage, Traffic guardrail damage, Illegal parking of the bike-sharing, Illegal parking of the non-motor vehicle, Illegal parking of the motor vehicle, and Exposed garbage. We conducted extensive experiments on the UID and the PASCAL VOC 2007 [18] dataset. The experimental results verify the effectiveness of the proposed method.
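As a quick check on the dataset figures, the short Python sketch below lists the eight UID categories and computes the average number of annotated objects per picture from the counts reported above. The integer label encoding is an assumption for illustration; the source does not state how categories are encoded.

```python
# The eight UID categories, named as in the paper
UID_CATEGORIES = [
    "Graffiti and Posters",
    "Gutter cover plate damage",
    "Manhole cover damage",
    "Traffic guardrail damage",
    "Illegal parking of the bike-sharing",
    "Illegal parking of the non-motor vehicle",
    "Illegal parking of the motor vehicle",
    "Exposed garbage",
]

NUM_IMAGES = 4119   # pictures in UID, as reported
NUM_OBJECTS = 4996  # annotated target objects in UID, as reported

# Integer ids are an assumption; the source does not specify an encoding
label_to_id = {name: idx for idx, name in enumerate(UID_CATEGORIES)}

objects_per_image = NUM_OBJECTS / NUM_IMAGES
extra_objects = NUM_OBJECTS - NUM_IMAGES

print(len(UID_CATEGORIES), round(objects_per_image, 2), extra_objects)
# 8 1.21 877
```

Since there are more annotated objects than pictures, at least some pictures must contain more than one target object, which is consistent with the multi-object setting the method targets.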
The main contribution of our work is twofold: (i) We propose a novel Complex Background Classification Network (CBCNet) for urban image classification, in which a detection sub-network is developed to recognize key information in urban images and an evaluation sub-network is developed to optimize object parameters and improve the robustness of the whole network. (ii) We build a standard urban image dataset (UID) containing 8 categories and 4996 target objects in 4119 pictures.
The remainder of this paper is organized as follows. We introduce related work in Section 2. In Section 3, the CBCNet model is introduced in detail. The experimental results and analysis are presented in Section 4, followed by the conclusion in Section 5.
Urban computing
Urban computing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by diverse sources in urban spaces, such as sensors, devices, vehicles, buildings, and humans, to tackle the major issues that cities face (e.g., air pollution, increased energy consumption, and traffic congestion) [19].
Urban computing connects unobtrusive and ubiquitous sensing technologies, advanced data management and analytic models, and novel visualization methods to create
Proposed method
In this section, we explain the proposed CBCNet in detail. Although state-of-the-art methods have shown good performance in dealing with image classification and object detection tasks, there are still some shortcomings in processing urban images. This is due to the fact that there are complex backgrounds and multiple target objects in urban images that can cause uncertainty in the classification and detection process. In other words, given an urban image, the existing methods can classify it
Experiments and results
In this section, we first introduce our established urban image dataset in detail. Then we conduct experiments against several state-of-the-art methods to verify the performance of CBCNet.
Conclusion
We have presented a novel CBCNet for classifying urban images with complex backgrounds. CBCNet uses a multi-layer perceptron instead of a traditional convolutional layer for urban image feature extraction. This replacement can further improve the nonlinear expressive power of the proposed network and reduce the number of parameters. CBCNet is a deeper convolutional neural network that extracts useful features from complex background images. It classifies urban issues by detecting
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors would like to sincerely thank the reviewers for their comments and suggestions that significantly improved the quality of this paper. Zhenbing Liu and Zeya Li contributed equally to this work. This study is supported by the National Natural Science Foundation of China (Grant No. 61562013, 61866009, 61906050), the Study Abroad Program for Graduate Student of Guilin University of Electronic Technology (GDYX2018006), the Natural Science Foundation of Guangxi Province (CN) (
References (25)
- et al. Smart city with Chinese characteristics against the background of big data: idea, action and risk. J Clean Prod (2018)
- et al. Integrated chaotic systems for image encryption. Signal Process (2018)
- et al. An integrated scattering feature with application to medical image retrieval. Comput Electr Eng (2018)
- et al. Guest editorial: Urban computing. IEEE Trans Big Data (2017)
- et al. Constructing a data‐driven Society: China's social credit system as a state surveillance infrastructure. Policy Internet (2018)
- et al. A city in common: a framework to orchestrate large-scale citizen engagement around urban issues
- et al. Prior knowledge-based probabilistic collaborative representation for visual recognition. IEEE T Cybernetics (2020)
- et al. Methodologies for assessing costs of rail transit systems based on small sample data. Int J Rail Transp (2015)
- Imagenet large scale visual recognition challenge. Int J Comput Vis (2015)
- et al. ImageNet classification with deep convolutional neural networks. Communications of the ACM (2017)
- Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
- Going deeper with convolutions
ZHENBING LIU received the Ph.D. degree in computer science from Huazhong University of Science and Technology, China, in 2010. He was a Visiting Research Fellow with Pennsylvania University, in 2015. He is currently a Professor with the Guilin University of Electronic Technology, China. He has published more than 30 papers. His research interests include machine learning and medical image processing.
ZEYA LI received the B.S. degree from Henan University of Science and Technology, China, in 2017. He was a Visiting Student with Massey University, in 2019. He is currently a Master's candidate in the School of Computer and Information Security, Guilin University of Electronic Technology, China. His research interests include image processing, object detection, and human action recognition.
LINGQIAO LI is currently pursuing the Ph.D. degree with the School of Automation, Beijing University of Posts and Telecommunications, China. He is also an Assistant Researcher with the Guilin University of Electronic Technology, China. His research interests include pattern recognition and large spectral data analysis.
HUIHUA YANG received the Ph.D. degree from East China University of Science and Technology, China, in 2005. He was a Postdoctoral Research Fellow of Tsinghua University, from 2005 to 2007. He is currently a Professor with Beijing University of Posts and Telecommunications, China. He has published more than 40 papers. His research interests include machine learning, spectrum analysis, and optimization.
☆ This paper is for CAEE special section SI-air2. Reviews processed and recommended for publication to the Editor-in-Chief by Area Editor Dr. Huimin Lu.