Vehicle color recognition based on smooth modulation neural network with multi-scale feature fusion

Hu, Mingdi; Bai, Long; Fan, Jiulun; Zhao, Sirui; Chen, Enhong

doi:10.1007/s11704-022-1389-x

Research Article
Published: 22 October 2022

Volume 17, article number 173321, (2023)

Frontiers of Computer Science

Mingdi Hu¹, Long Bai¹, Jiulun Fan¹, Sirui Zhao² & Enhong Chen²

Abstract

Vehicle color recognition (VCR) plays a vital role in intelligent traffic management and criminal investigation assistance. However, existing vehicle color datasets cover only 13 classes, which cannot meet current practical demand. Moreover, although considerable effort has been devoted to VCR, existing methods suffer from class imbalance in the datasets. To address these challenges, we propose a novel VCR method based on a Smooth Modulation Neural Network with Multi-Scale Feature Fusion (SMNN-MSFF). Specifically, to establish a benchmark for model training and evaluation, we first present a new VCR dataset with 24 vehicle color classes, Vehicle Color-24, consisting of 10091 vehicle images from a 100-hour urban road surveillance video. Then, to tackle the long-tail distribution and improve recognition performance, we propose the SMNN-MSFF model with multi-scale feature fusion and smooth modulation. The former extracts feature information from local to global scales, and the latter can increase the loss on images of tail-class instances when training under class imbalance. Finally, comprehensive experimental evaluation on Vehicle Color-24 and three previous representative datasets demonstrates that the proposed SMNN-MSFF outperforms state-of-the-art VCR methods. Extensive ablation studies also demonstrate that each module of our method is effective; in particular, smooth modulation efficiently helps feature learning for the minority or tail classes. Vehicle Color-24 and the code of SMNN-MSFF are publicly available and can be obtained by contacting the author.
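
The abstract does not give the smooth modulation formula, so the following is only a rough sketch of the general idea of raising the loss on tail-class samples. It swaps in a different, well-known technique, class-balanced weighting by the effective number of samples (Cui et al., 2019), and should not be read as the SMNN-MSFF loss. The function names, the value of beta, the class counts, and the predicted probabilities are all hypothetical.

```python
import math


def class_weights(counts, beta=0.999):
    """Per-class weights from the effective number of samples.

    NOT the paper's smooth modulation; a stand-in reweighting scheme.
    Weights are normalised so that they average to 1 across classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


# Hypothetical image counts for a head, a middle, and a tail color class.
counts = [5000, 500, 20]
weights = class_weights(counts)

# Two hypothetical predictions that are equally confident (0.5) in the true class.
head_loss = weighted_cross_entropy([0.5, 0.3, 0.2], 0, weights)
tail_loss = weighted_cross_entropy([0.2, 0.3, 0.5], 2, weights)

print("weights:", [round(w, 4) for w in weights])
print("head loss:", head_loss, "tail loss:", tail_loss)
```

With equal confidence in the true class, the tail-class sample contributes a larger loss than the head-class sample, which is the qualitative effect the abstract attributes to smooth modulation.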

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 62071378), the Shaanxi Province International Science and Technology Cooperation Program (2022KW-04), and the Xi’an Science and Technology Plan Project (21XJZZ0072).

Author information

Authors and Affiliations

School of Communications and Information Engineering & School of Artificial Intelligence, Xi’an University of Posts & Telecommunications, Xi’an, 710121, China
Mingdi Hu, Long Bai & Jiulun Fan
School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026, China
Sirui Zhao & Enhong Chen

Corresponding author

Correspondence to Mingdi Hu.

Additional information

Mingdi Hu holds a doctorate and is an associate professor. She obtained her Doctor of Science degree from the School of Mathematics and Statistics at Shaanxi Normal University, China. Her research interests include image recognition, target retrieval and classification, data enhancement, machine learning, artificial intelligence, and fuzzy information processing.

Long Bai is a master's student at Xi’an University of Posts and Telecommunications, China. He received his BS degree in communication engineering from Xi’an University of Science and Technology, China. His research interests include machine learning, deep neural networks, image target recognition, and artificial intelligence.

Jiulun Fan holds a doctorate and is a professor. He graduated from Xidian University, China, majoring in signal and information processing, and obtained a doctoral degree in engineering. His research interests include pattern recognition and image processing, fuzzy information processing theory and application, and image security technology.

Sirui Zhao is a doctoral student at the University of Science and Technology of China, China. His research interests include human-computer interaction, affective computing, computer vision, and knowledge representation. He has published several papers in refereed conferences and journals, such as ACM MM 2021 and Neural Networks.

Enhong Chen holds a doctorate and is a professor at the University of Science and Technology of China, China. He is a CCF Fellow and an IEEE Senior Member. His research interests include data mining and machine learning, especially social network analysis and recommender systems. He has published more than 200 papers in refereed conferences and journals, such as TKDE, KDD, ICDM, and NIPS.

Electronic supplementary material