SimSANet: a simple sequential attention-aided deep neural network for vehicle make and model recognition

  • Original Article
Neural Computing and Applications

Abstract

Vehicle make and model recognition (VMMR) is a pivotal task in developing automatic vehicle recognition systems. In recent decades, the field has attracted significant attention from the computer vision and artificial intelligence communities. Previous research has focused heavily on improving recognition through different types of attention mechanisms, which have proven effective at weighting features and uncovering distinctive local and global details. However, attention mechanisms also increase model complexity, which imposes a costly and often unnecessary computational burden. To this end, we introduce a deep neural network model, called the simple sequential attention network (SimSANet), which incorporates a sequential multi-kernel approach to achieve a trade-off between complexity and performance. This method reduces the computational load and runs faster while efficiently capturing essential information from feature maps at different scales, from local to global. To demonstrate the effectiveness of the proposed method, we conduct experiments on several publicly accessible VMMR datasets, including Stanford Cars, Comprehensive Cars (CompCars), Comprehensive Cars Surveillance Nature (CompCarsSV), and the Vehicle Images dataset. The proposed approach outperforms state-of-the-art models on the vehicle make and model recognition task, achieving 94.47% accuracy on Stanford Cars, 98.34% on CompCars, 99.20% on CompCarsSV, and 97.20% on the Vehicle Images dataset. The implementation details and source code are available at: https://github.com/JUVCSE/SIMSANET.
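
As a rough illustration of what a sequential multi-kernel attention step might look like, the following sketch passes a single-channel feature map through gates computed at growing window sizes, so that later stages see a wider context than earlier ones. It is not the authors' architecture: the box filter (standing in for a learned convolution), the sigmoid gate, the kernel sizes 3, 5, and 7, and all function names are assumptions made for this example. The actual implementation is in the repository linked above.

```python
import math


def box_filter(fmap, k):
    """Mean over a k x k window with zero padding.

    Assumption: this fixed filter stands in for a learned k x k convolution.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += fmap[y][x]
            out[i][j] = total / (k * k)
    return out


def sigmoid(v):
    """Logistic function, used here as a hypothetical attention gate."""
    return 1.0 / (1.0 + math.exp(-v))


def sequential_multi_kernel_attention(fmap, kernel_sizes=(3, 5, 7)):
    """Reweight a 2D feature map stage by stage.

    Each stage computes a context map at one kernel size, turns it into
    a gate in (0, 1), and multiplies the current features by that gate.
    Small kernels capture local context; larger ones capture wider context.
    """
    out = [row[:] for row in fmap]  # copy so the input is left untouched
    for k in kernel_sizes:
        context = box_filter(out, k)
        out = [
            [out[i][j] * sigmoid(context[i][j]) for j in range(len(out[0]))]
            for i in range(len(out))
        ]
    return out
```

Because the stages run one after another on a single feature map, the cost grows with the number of kernel sizes rather than with the number of parallel attention branches, which is one plausible reading of the complexity trade-off described in the abstract.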

Data availability

All datasets used in this work are publicly available.

Code availability

The implementation details and source code for this work are available on GitHub: https://github.com/JUVCSE/SIMSANET.

Acknowledgements

We are grateful to the CMATER Research Laboratory of the Department of Computer Science and Engineering, Jadavpur University, India, for providing essential infrastructure support.

Funding

Funding information is not applicable.

Author information

Corresponding author

Correspondence to Sourajit Maity.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval and consent to participate

This study does not involve any data that require ethical approval or informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gayen, S., Maity, S., Singh, P.K. et al. SimSANet: a simple sequential attention-aided deep neural network for vehicle make and model recognition. Neural Comput & Applic 37, 319–339 (2025). https://doi.org/10.1007/s00521-024-10480-z

  • DOI: https://doi.org/10.1007/s00521-024-10480-z
