SimSANet: a simple sequential attention-aided deep neural network for vehicle make and model recognition

Gayen, Soumyajit; Maity, Sourajit; Singh, Pawan Kumar; Sarkar, Ram

doi:10.1007/s00521-024-10480-z

Original Article
Published: 18 November 2024

Neural Computing and Applications, Volume 37, pages 319–339 (2025)

Abstract

Vehicle make and model recognition (VMMR) is a pivotal task in building automatic vehicle recognition systems, and in recent decades it has attracted significant attention from the computer vision and artificial intelligence communities. Previous research has largely sought to improve recognition by applying different types of attention mechanisms, which have proven effective at weighting features and uncovering distinctive local and global details. However, a significant drawback of this approach is the added model complexity, which imposes a costly and often unnecessary computational burden. To this end, we introduce a deep neural network model, called simple sequential attention network (SimSANet), which incorporates a sequential multi-kernel approach to achieve a trade-off between complexity and performance. This method reduces the computational load and runs faster while efficiently capturing essential information from feature maps at different scales, from local to global. To demonstrate the effectiveness of the proposed method, we conduct experiments on several publicly available VMMR datasets: Stanford Cars, Comprehensive Cars (CompCars), Comprehensive Cars Surveillance Nature (CompCarsSV), and the Vehicle Images dataset. With 94.47% accuracy on Stanford Cars, 98.34% on CompCars, 99.20% on CompCarsSV, and 97.20% on the Vehicle Images dataset, the proposed model outperforms state-of-the-art models on the VMMR task. The implementation details and source code are available at: https://github.com/JUVCSE/SIMSANET.


Data availability

All datasets used in this work are publicly available.

Code availability

The implementation details and source code for this work are available on GitHub: https://github.com/JUVCSE/SIMSANET.


Acknowledgements

We are grateful to the CMATER Research Laboratory of the Department of Computer Science and Engineering, Jadavpur University, India, for providing essential infrastructure support.

Funding

Funding information is not applicable.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal, 700032, India
Soumyajit Gayen, Sourajit Maity & Ram Sarkar
Department of Information Technology, Jadavpur University, Second Campus, Plot No. 8, Salt Lake Bypass, LB Block, Sector III, Salt Lake City, Kolkata, West Bengal, 700106, India
Pawan Kumar Singh

Corresponding author

Correspondence to Sourajit Maity.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval and consent to participate

This study does not involve any data that require ethical approval or informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gayen, S., Maity, S., Singh, P.K. et al. SimSANet: a simple sequential attention-aided deep neural network for vehicle make and model recognition. Neural Comput & Applic 37, 319–339 (2025). https://doi.org/10.1007/s00521-024-10480-z

Received: 15 June 2024
Accepted: 19 September 2024
Published: 18 November 2024
Issue Date: January 2025
DOI: https://doi.org/10.1007/s00521-024-10480-z
