Abstract
Vehicle re-identification has become an important research topic because of its real-world applications, such as intelligent security systems and intelligent traffic management. Current vehicle re-identification algorithms mainly run on image-based datasets, while video-based datasets remain rare in the community. We therefore collect a dataset named Veri-Video-763, which contains 763 vehicle IDs and 5828 tracks. In addition, we propose a channel decomposition saliency region network comprising three modules to improve video-based vehicle re-identification. The channel decomposition saliency region extraction (CDSRE) module generates saliency masks by channel decomposition to detect multiple salient local regions. The global-local stacking module encodes the convolutional features of the salient regions together with the global pooling feature into re-identification feature vectors. The distributed symmetric sampling (DSS) module introduces a novel video clip sampling algorithm to improve the unity and difference of the video clips. Extensive experiments demonstrate the effectiveness of the proposed methods, which can therefore serve as a strong baseline. The dataset and code are available at https://github.com/wyf27/Veri-Video-763.
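As a rough illustration of the global-local stacking idea described in the abstract, the sketch below concatenates a globally pooled feature with mask-weighted pooled features of several regions. It is not the authors' implementation: the function name `global_local_feature`, the use of average pooling, and the assumption that the saliency masks are already available as arrays are all choices made for this example only.

```python
import numpy as np


def global_local_feature(feature_map, masks, eps=1e-6):
    """Stack a global feature with per-region local features.

    feature_map: array of shape (C, H, W), e.g. a CNN feature map (assumed).
    masks:       array of shape (K, H, W) with values in [0, 1], one mask
                 per salient region (assumed to be given).
    Returns a 1-D vector of length (K + 1) * C.
    """
    # Global branch: plain average pooling over all spatial positions.
    global_feat = feature_map.mean(axis=(1, 2))              # (C,)

    # Local branch: mask-weighted average pooling for each region.
    weighted = np.einsum("chw,khw->kc", feature_map, masks)  # (K, C)
    area = masks.sum(axis=(1, 2))[:, None]                   # (K, 1)
    local_feats = weighted / (area + eps)                    # (K, C)

    # Stack the global feature first, then the K local features.
    return np.concatenate([global_feat, local_feats.reshape(-1)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.random((4, 8, 8))                 # toy feature map, C=4
    region_masks = (rng.random((2, 8, 8)) > 0.5).astype(float)  # K=2
    print(global_local_feature(fmap, region_masks).shape)
```

Placing the global feature first keeps a fixed layout in the output vector, so individual region features can be sliced out later if they need to be compared separately.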
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 61871106.
Cite this article
Wang, Y., Gong, B., Wei, Y. et al. Video-based vehicle re-identification via channel decomposition saliency region network. Appl Intell 52, 12609–12629 (2022). https://doi.org/10.1007/s10489-021-03096-6
DOI: https://doi.org/10.1007/s10489-021-03096-6