Abstract
Video-based person re-identification is a challenging task due to illumination variations, occlusions, viewpoint changes, and pedestrian misalignment. Most previous works focus on temporal correlation features, which leads to a loss of detailed frame-level information. In this paper, we emphasize the importance of preserving both the correlation and the diversity of multi-frame features simultaneously. To this end, we propose a Temporal Correlation-Diversity Representation (TCDR) network that enhances both the representation of frame-level pedestrian features and the ability to aggregate temporal features. Specifically, to capture correlated but diverse temporal features, we propose a Temporal-Guided Frame Feature Enhancement (TGFE) module, which explores temporal correlation from a global perspective and enhances frame-level features to achieve temporal diversity. Furthermore, we propose a Temporal Feature Integration (TFI) module to aggregate multi-frame features. Finally, we propose a novel progressive smooth loss to alleviate the influence of noisy frames. Extensive experiments show that our method achieves state-of-the-art performance on the MARS, DukeMTMC-VideoReID, and LS-VID datasets.
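The abstract does not specify how TGFE and TFI are implemented, so the following is only a minimal PyTorch sketch of the general idea it describes: a global temporal context gating frame-level features (so frames stay correlated while a residual path preserves frame-specific detail), followed by attention-weighted aggregation of the frames. Every design choice here (the squeeze-and-excitation-style gate, the linear attention scorer, 2048-D features) is an illustrative assumption, not the paper's actual module design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TGFESketch(nn.Module):
    """Hypothetical temporal-guided frame feature enhancement:
    a clip-level (global) descriptor produces a channel gate that is
    applied to every frame; the residual connection keeps per-frame
    detail, aiming at correlated-but-diverse frame features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                     # x: (B, T, C) frame features
        g = x.mean(dim=1)                     # global temporal context (B, C)
        gate = self.fc(g).unsqueeze(1)        # correlation gate (B, 1, C)
        return x + x * gate                   # residual preserves diversity

class TFISketch(nn.Module):
    """Hypothetical temporal feature integration: attention-weighted
    pooling of frame features into one clip-level representation."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Linear(channels, 1)

    def forward(self, x):                     # x: (B, T, C)
        w = F.softmax(self.score(x), dim=1)   # per-frame weights (B, T, 1)
        return (w * x).sum(dim=1)             # aggregated clip feature (B, C)

if __name__ == "__main__":
    feats = torch.randn(4, 8, 2048)           # 4 clips, 8 frames, 2048-D
    clip = TFISketch(2048)(TGFESketch(2048)(feats))
    print(clip.shape)                          # torch.Size([4, 2048])
```

In this sketch, the gate injects clip-level (correlated) context while the residual path keeps frame-specific (diverse) information, and the learned frame weights give a simple stand-in for temporal aggregation; the paper's actual TGFE and TFI modules, and its progressive smooth loss, may differ substantially.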
Acknowledgment
This research was supported by the National Key Research and Development Program of China (2021AAA0140203), the Zhejiang Provincial Key Research and Development Program of China (No. 2021C01164), and the Project of the Chinese Academy of Sciences (E141020). Juan Cao thanks the Nanjing Government Affairs and Public Opinion Research Institute for its support of "CaoJuan Studio", and thanks Chi Peng, Jingjing Jiang, Qiang Liu, and Yu Dai for their help.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gong, L., Zhang, R., Tang, S., Cao, J. (2022). Temporal Correlation-Diversity Representations for Video-Based Person Re-Identification. In: Yu, S., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_8
DOI: https://doi.org/10.1007/978-3-031-18907-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18906-7
Online ISBN: 978-3-031-18907-4
eBook Packages: Computer Science, Computer Science (R0)