Abstract
Fine-grained vehicle recognition is an important task in intelligent transportation. However, because the differences between vehicle models are subtle while the appearance of a single model varies greatly across poses, recognition performance remains limited and requires further research. In this paper, we propose a dual-rank attention module (DRAM) that reduces the visual interference caused by varying vehicle poses. Combined with the attention mechanism, our method not only mines the discriminative features of each vehicle model but also enriches the feature representation, improving recognition accuracy. In the initial stage of the network, a spatial transformer network (STN) normalizes the vehicle pose while a global attention weights the whole image to mine subtle discriminative features. In the higher layers of the network, after fusing multi-level feature maps top-down, a fused channel attention module enhances the feature response of the current category. The method achieves 94.1% and 96.9% top-1 accuracy on the Stanford Cars and CompCars datasets, more than 2.1% higher than the VGG19 baseline while adding only a few parameters. Experimental results show that the DRAM effectively improves the accuracy of fine-grained vehicle recognition.
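The abstract does not specify the internals of the fused channel attention module, but its description (reweighting channel responses after feature fusion) resembles squeeze-and-excitation-style channel attention. A minimal sketch of that general idea in plain Python, with hypothetical weight matrices `w1` and `w2` standing in for the module's learned fully connected layers, might look like:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention over a feature map (illustrative sketch).

    feat: list of C channels, each an H x W grid (list of lists of floats).
    w1:   C x C weights of the first (squeeze) fully connected layer.
    w2:   C x C weights of the second (excite) fully connected layer.
    Returns the channel-reweighted feature map.
    """
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Excite: two fully connected layers, ReLU then sigmoid.
    h = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    s = [sigmoid(sum(w * x for w, x in zip(row, h))) for row in w2]
    # Reweight: scale each channel by its attention score.
    return [[[v * s[c] for v in row] for row in ch] for c, ch in enumerate(feat)]
```

This is only an assumption about the module's general family, not the paper's actual architecture; the DRAM additionally combines this channel reweighting with STN-based pose normalization and global spatial attention.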
This work is partly supported by grants from the National Natural Science Foundation of China (No. 61906061 and No. 62076086) and the Open Foundation of the Anhui Province Key Laboratory of Intelligent Building & Building Energy Saving (No. IBES2021KF05).
Copyright information
Ā© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Cai, W., et al. (2022). Dual-Rank Attention Module for Fine-Grained Vehicle Model Recognition. In: Yu, S., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol. 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_2
DOI: https://doi.org/10.1007/978-3-031-18907-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18906-7
Online ISBN: 978-3-031-18907-4
eBook Packages: Computer Science (R0)