Fine-grained traffic video vehicle recognition based orientation estimation and temporal information

Multimedia Tools and Applications

Abstract

In this paper, we propose a method for fine-grained vehicle recognition in traffic surveillance video. Compared with general approaches to single-image fine-grained recognition, this method focuses on combining information from multiple frames and on handling viewpoint changes across videos. First, we detect vehicle instances and their local frames in the input traffic video by vehicle tracking. For each vehicle instance, pose estimation is used to extract the 3D orientation in the corresponding frame. We encode the 3D orientation as an extra supervisory cue and merge it with the CNN feature to represent the vehicle's appearance and how it changes as the vehicle moves. In addition, we propose using a recurrent neural network (RNN) to select informative content across the traffic video and to fuse the CNN features of each vehicle's frames into a comprehensive feature that carries both spatial and temporal information for fine-grained recognition. We evaluate performance on our own CarVideo dataset, which was collected by surveillance cameras, and on the open dataset BoxCar116k. The experiments show that our method outperforms state-of-the-art methods for fine-grained recognition in traffic video applications.
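The fusion step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the 3D orientation is encoded or which recurrent cell is used, so the sin/cos encoding, the plain tanh recurrent cell, the random untrained weights, and every name here (`encode_orientation`, `SimpleRNN`, `fuse_track`) are assumptions made for illustration. The per-frame CNN features are replaced by random vectors.

```python
import math
import random


def encode_orientation(yaw, pitch, roll):
    """Encode three angles (radians) as sin/cos pairs.

    Assumed encoding: sin/cos makes an angle and the same angle
    plus 2*pi map to (almost exactly) the same vector.
    """
    encoded = []
    for angle in (yaw, pitch, roll):
        encoded.extend([math.sin(angle), math.cos(angle)])
    return encoded


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SimpleRNN:
    """Plain tanh recurrent cell, standing in for the paper's RNN."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(hidden_size, input_size, rng)
        self.w_rec = make_matrix(hidden_size, hidden_size, rng)
        self.hidden_size = hidden_size

    def fuse(self, frame_inputs):
        """Run over the frames in order and return the last hidden state."""
        hidden = [0.0] * self.hidden_size
        for x in frame_inputs:
            from_input = matvec(self.w_in, x)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b)
                      for a, b in zip(from_input, from_state)]
        return hidden


def fuse_track(cnn_features, orientations, rnn):
    """Append the orientation code to each frame feature, then fuse."""
    inputs = [feat + encode_orientation(*ori)
              for feat, ori in zip(cnn_features, orientations)]
    return rnn.fuse(inputs)


# One tracked vehicle: 5 frames, 8-dim stand-in features, slowly turning.
rng = random.Random(1)
track_features = [[rng.random() for _ in range(8)] for _ in range(5)]
track_orientations = [(0.1 * t, 0.0, 0.0) for t in range(5)]

rnn = SimpleRNN(input_size=8 + 6, hidden_size=4)
video_feature = fuse_track(track_features, track_orientations, rnn)
print(len(video_feature))
```

In a real system the fused vector would feed a classifier over vehicle models, and the recurrent weights would be trained jointly with it; a gated cell such as an LSTM or GRU could replace the plain tanh cell without changing the overall structure.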

Acknowledgements

Supported by: the National Natural Science Foundation of China, Nos. 42075139, 42077232, 61272219; the National High Technology Research and Development Program of China, No. 2007AA01Z334; the Science and Technology Program of Jiangsu Province, Nos. BE2020082, BE2010072, BE2011058, BY2012190; the Program for New Century Excellent Talents in University of China, No. NCET-04-04605; the China Postdoctoral Science Foundation, No. 2017M621700; and the Innovation Fund of State Key Laboratory for Novel Software Technology, No. ZZKT2021A17.

Author information

Corresponding authors

Correspondence to Zhengxing Sun or Qian Li.

Ethics declarations

Conflicts of interests/Competing interests

There are no conflicts of interests/competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, A., Sun, Z., Li, Q. et al. Fine-grained traffic video vehicle recognition based orientation estimation and temporal information. Multimed Tools Appl 82, 13745–13763 (2023). https://doi.org/10.1007/s11042-022-13811-1

  • DOI: https://doi.org/10.1007/s11042-022-13811-1
