
Graph based Spatial-temporal Fusion for Multi-modal Person Re-identification

Authors:

Yaobin Zhang,

Jianming Lv,

Chen Liu,

Hongmin Cai

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 3736 - 3744

https://doi.org/10.1145/3581783.3613757

Published: 27 October 2023


Abstract

As a challenging task, unsupervised person re-identification (Re-ID) aims to optimize a pedestrian matching model using unlabeled image frames from surveillance videos. Recently, fusion with the spatio-temporal clues of pedestrians has proven effective in improving classification performance. However, most of these methods adopt hard combination approaches that multiply the visual scores by the spatio-temporal scores. Such approaches are sensitive to the noise caused by imprecise estimation of spatio-temporal patterns in unlabeled datasets, which limits the advantage of the fusion model. In this paper, we propose a Graph based Spatio-Temporal Fusion model for high-performance multi-modal person Re-ID, namely G-Fusion, to mitigate the impact of this noise. In particular, we construct a graph of pedestrian images by selecting neighboring nodes based on visual information and the transition time between cameras. We then use a randomly initialized two-layer GraphSAGE model to obtain the multi-modal affinity matrix between images, and apply distillation learning to optimize the visual model by learning the affinity between the nodes. Finally, a graph-based multi-modal re-ranking method is deployed to make the decision in the testing phase for precise person Re-ID. Comprehensive experiments on two large-scale Re-ID datasets show that our method achieves a significant performance improvement when combined with SOTA unsupervised person Re-ID methods. Specifically, the mAP scores reach 92.2% and 80.4% on the Market-1501 and MSMT17 datasets, respectively.
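To make the contrast in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It shows a hard combination that multiplies visual scores by spatio-temporal scores, and one possible way to select graph neighbors from visual similarity and time gaps. The function names, the top-k rule, the `max_gap` threshold, and the toy data are all assumptions made for illustration; the abstract does not specify the exact neighbor-selection rule, and the single timestamp gap used here is a simplification of the transition time between cameras.

```python
import numpy as np

def cosine_similarity(features):
    """Pairwise cosine similarity between the rows of `features`."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed.T

def hard_fusion(visual_scores, st_scores):
    """Hard combination: element-wise product of the two score arrays."""
    return visual_scores * st_scores

def build_neighbor_graph(features, timestamps, k=2, max_gap=60.0):
    """Hypothetical rule: keep the k most visually similar images whose
    time gap is at most `max_gap` seconds (an assumed threshold)."""
    sim = cosine_similarity(features)
    gap = np.abs(timestamps[:, None] - timestamps[None, :])
    # Exclude self-loops and pairs with an implausible time gap.
    candidate = (gap <= max_gap) & ~np.eye(len(features), dtype=bool)
    masked = np.where(candidate, sim, -np.inf)
    neighbors = []
    for row in masked:
        order = np.argsort(-row)[:k]
        # Drop entries that were masked out (no valid candidate left).
        neighbors.append([int(j) for j in order if np.isfinite(row[j])])
    return neighbors

# Toy data (made up): four images with 2-D visual features and timestamps.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.05]])
timestamps = np.array([0.0, 10.0, 20.0, 500.0])

graph = build_neighbor_graph(features, timestamps, k=2, max_gap=60.0)
print(graph)

# A strong visual match is erased when the spatio-temporal score is
# wrongly estimated as zero, which is the noise sensitivity of hard fusion.
fused = hard_fusion(np.array([0.9]), np.array([0.0]))
print(fused)
```

In this sketch the neighbor lists would then feed a graph model such as the two-layer GraphSAGE mentioned in the abstract; that step, the distillation, and the re-ranking are left out here because the abstract gives no further detail on them.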

Supplemental Material

MP4 File (3002.00 MB)

Presentation video of the paper "Graph based Spatial-temporal Fusion for Multi-modal Person Re-identification"

