DOI: 10.1145/3581783.3613757
research-article

Graph based Spatial-temporal Fusion for Multi-modal Person Re-identification

Published: 27 October 2023

Abstract

As a challenging task, unsupervised person re-identification (Re-ID) aims to optimize a pedestrian matching model using unlabeled image frames from surveillance videos. Recently, fusing the spatio-temporal clues of pedestrians has proven effective in improving classification performance. However, most of these methods adopt hard combination approaches that multiply the visual scores by the spatio-temporal scores. Such approaches are sensitive to the noise caused by imprecise estimation of spatio-temporal patterns in unlabeled datasets, which limits the advantage of the fusion model. In this paper, we propose a Graph based Spatio-Temporal Fusion model for high-performance multi-modal person Re-ID, namely G-Fusion, to mitigate the impact of this noise. In particular, we construct a graph of pedestrian images by selecting neighboring nodes based on visual information and the transition time between cameras. We then use a randomly initialized two-layer GraphSAGE model to obtain the multi-modal affinity matrix between images, and apply distillation learning to optimize the visual model by learning the affinity between nodes. Finally, a graph-based multi-modal re-ranking method is deployed to make the decision in the testing phase for precise person Re-ID. Comprehensive experiments are conducted on two large-scale Re-ID datasets, and the results show that our method significantly improves performance when combined with SOTA unsupervised person Re-ID methods. Specifically, the mAP scores reach 92.2% and 80.4% on the Market-1501 and MSMT17 datasets, respectively.
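
The contrast drawn in the abstract, between hard multiplication of scores and graph-based neighbor selection, can be illustrated with a minimal sketch. This is not the G-Fusion implementation: the function names, the toy features and timestamps, and the `max_gap` plausibility rule are all assumptions made for illustration, and the aggregation step is an untrained mean over neighbors in the spirit of GraphSAGE rather than the two-layer model the paper describes.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hard_fusion(visual_score, st_score):
    """Hard combination: multiply visual and spatio-temporal scores."""
    return visual_score * st_score


def build_graph(features, timestamps, k=1, max_gap=100.0):
    """For each image, keep the k most visually similar images whose
    time gap is plausible (hypothetical rule: gap <= max_gap seconds)."""
    neighbors = []
    for i, feat in enumerate(features):
        candidates = [
            (cosine(feat, features[j]), j)
            for j in range(len(features))
            if j != i and abs(timestamps[i] - timestamps[j]) <= max_gap
        ]
        # Most similar candidates first.
        candidates.sort(key=lambda pair: -pair[0])
        neighbors.append([j for _, j in candidates[:k]])
    return neighbors


def mean_aggregate(features, neighbors):
    """One untrained mean-aggregation step: average each node's feature
    with the features of its selected neighbors."""
    out = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


# Toy data (assumed): four images with 2-D features and capture times.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
timestamps = [0.0, 30.0, 5.0, 2000.0]

neighbors = build_graph(features, timestamps)
smoothed = mean_aggregate(features, neighbors)

# A strong visual match is erased when the spatio-temporal score
# is wrongly estimated as zero.
collapsed = hard_fusion(0.9, 0.0)
```

In this toy run, the fourth image has no temporally plausible neighbor and keeps its own feature unchanged, while `collapsed` is zero despite a visual score of 0.9. That second outcome is the kind of noise sensitivity the abstract attributes to hard combination.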

Supplemental Material

MP4 File
Presentation video of the paper "Graph based Spatial-temporal Fusion for Multi-modal Person Re-identification"

Cited By

  • HGOE: Hybrid External and Internal Graph Outlier Exposure for Graph Out-of-Distribution Detection. 2024. Proceedings of the 32nd ACM International Conference on Multimedia, 1544--1553. DOI: 10.1145/3664647.3681118. Online publication date: 28-Oct-2024

    Information & Contributors

    Information

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph
    2. re-ranking
    3. spatio-temporal
    4. unsupervised person re-id

    Qualifiers

    • Research-article

    Funding Sources

    • The Key-Area Research and Development Program of Guangzhou City
    • The Science and Technology Program of Guangzhou, China

    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
