Abstract
Over the past decade, person re-identification (P-Reid) has become an increasingly active area of computer vision research. The technology is applied in fields such as pedestrian tracking, security, and video surveillance. Person re-identification currently performs well under supervised training with labeled data, but accuracy frequently suffers when models learn from unlabeled samples without supervision. Improving models trained on unlabeled samples is therefore a challenging task. To address this problem, we propose a progressive spatial–temporal transfer model (PSTT) that consists of three stages: incremental tuning, spatial–temporal fusion, and target domain learning. In the first stage, a triplet loss function is used to obtain a high-performance multi-scale network that can produce an initial clustering of samples. In the second stage, to mine spatial–temporal and visual semantic information, we introduce a fusion model that combines the visual information extracted by the trained network from the labeled and unlabeled datasets with the corresponding spatial–temporal information. In the final stage, with the assistance of the fusion model, we employ a strategy that extends learning from labeled to unlabeled samples. During training, the fusion model is used to select labeled and unlabeled samples, and a multiple meta loss function is used for transfer learning. During testing, the fusion model is employed to improve the accuracy of the network. We evaluate our method on five standard P-Reid benchmarks: Market1501, DukeMTMC-ReID, CUHK03, MSMT17, and Occluded-DukeMTMC. Extensive experiments show that the proposed PSTT achieves state-of-the-art performance and outperforms previous methods. The source code is available at https://github.com/LiZX12/PSTT.
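The triplet loss used in the first stage can be illustrated with a minimal sketch. The abstract does not specify the exact formulation, so the code below assumes the standard margin-based hinge form with Euclidean distance; the function name `triplet_loss`, the margin value of 0.3, and the toy embeddings are illustrative assumptions and are not taken from the PSTT source code.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Mean margin-based triplet loss over a batch of embedding rows.

    Each row i forms one triplet: anchor[i] and positive[i] share an
    identity, negative[i] belongs to a different identity.
    The margin value is an assumed default, not the paper's setting.
    """
    d_ap = np.linalg.norm(anchor - positive, axis=1)  # anchor-positive distance
    d_an = np.linalg.norm(anchor - negative, axis=1)  # anchor-negative distance
    # Penalize triplets where the negative is not at least `margin` farther
    # from the anchor than the positive is.
    return float(np.mean(np.maximum(d_ap - d_an + margin, 0.0)))


# Toy 2-D embeddings (hypothetical values, for illustration only).
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.0, 1.0], [1.0, 1.0]])
negative = np.array([[3.0, 4.0], [1.0, 1.2]])

loss = triplet_loss(anchor, positive, negative)
print(loss)
```

In this toy batch the first triplet already satisfies the margin and contributes nothing, while the second has a negative that lies too close to the anchor and therefore incurs a penalty. Minimizing this quantity pulls same-identity embeddings together and pushes different identities apart, which is what makes an initial clustering of samples feasible.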
Availability of data and materials
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported in part by the Humanities and Social Sciences Planning Fund Projects of the Ministry of Education of China under Grant 23YJAZH226 and "Research on the Development Path of Artificial Intelligence Based on ChatGPT-like Generated Content", 2023-09 to 2026-08.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Zhou, S., Li, Z., Liu, J. et al. Progressive spatial–temporal transfer model for unsupervised person re-identification. Int J Multimed Info Retr 13, 17 (2024). https://doi.org/10.1007/s13735-024-00324-w
DOI: https://doi.org/10.1007/s13735-024-00324-w