
MFEN: Multi-scale Feature Expansion Network for Visible-Infrared Person Re-Identification

Published: 01 June 2024

Abstract

One of the main challenges of visible-infrared person re-identification (Re-ID) is the large discrepancy between heterogeneous images. To alleviate this problem, many existing studies adopt a dual-stream learning framework to learn discriminative features, but their performance is limited by insufficient training data. To address this issue, a feature-level data augmentation network, named MFEN, is proposed in this paper. Through the designed Global-local Feature Augmentation (GFA) module, the network generates additional feature embeddings and combines global and local features, which benefits the learning of modality-invariant features. To further enhance robustness, a Multi-stage Feature Integration (MFI) block is designed to mitigate information loss and aggregate multi-scale features. In addition, a multilayer perceptron (MLP) layer is appended to the end of the network to further improve feature representations. Extensive experiments on the SYSU-MM01 and RegDB datasets validate the superior performance of the proposed method over existing methods.
Keywords: person re-identification; feature augmentation; visible-infrared person re-identification; multi-scale feature aggregation
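The abstract only sketches the architecture at a high level; the exact designs of GFA, MFI, and the MLP head are defined in the paper itself. As an illustration of the general ideas (global plus horizontal-part pooling of a feature map, projection and summation of multi-stage features, and an MLP head), here is a minimal NumPy sketch; all shapes, part counts, and random projections are assumptions for demonstration, not the authors' implementation:

```python
import numpy as np

def global_local_features(fmap, num_parts=3):
    """Combine a global descriptor with horizontal-part descriptors.

    fmap: (C, H, W) feature map from one backbone stage.
    Returns a (C * (1 + num_parts),) embedding.
    """
    global_feat = fmap.mean(axis=(1, 2))                 # (C,)
    # Split the height axis into horizontal stripes for local features.
    parts = np.array_split(fmap, num_parts, axis=1)
    local_feats = [p.mean(axis=(1, 2)) for p in parts]   # num_parts x (C,)
    return np.concatenate([global_feat] + local_feats)

def aggregate_stages(stage_feats, dim=128, seed=0):
    """Project each stage's vector to a common dimension and sum,
    mimicking multi-scale aggregation across backbone stages."""
    rng = np.random.default_rng(seed)
    agg = np.zeros(dim)
    for f in stage_feats:
        W = rng.standard_normal((dim, f.shape[0])) / np.sqrt(f.shape[0])
        agg += W @ f
    return agg

def mlp_head(x, hidden=256, out=128, seed=1):
    """Two-layer MLP with ReLU, refining the aggregated feature."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((hidden, x.shape[0])) / np.sqrt(x.shape[0])
    W2 = rng.standard_normal((out, hidden)) / np.sqrt(hidden)
    return W2 @ np.maximum(W1 @ x, 0.0)

# Toy forward pass with an assumed 64-channel, 24x8 feature map.
fmap = np.random.default_rng(42).standard_normal((64, 24, 8))
emb = global_local_features(fmap, num_parts=3)        # (256,)
agg = aggregate_stages([emb, emb[:128]], dim=128)     # (128,)
out = mlp_head(agg)                                   # (128,)
```

In a trained network the random projections above would of course be learned weights; the sketch only shows how global/local pooling and multi-stage summation compose into a single embedding.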



Published In

CVDL '24: Proceedings of the International Conference on Computer Vision and Deep Learning
January 2024
506 pages
ISBN:9798400718199
DOI:10.1145/3653804

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

  • Beijing Natural Science Foundation
  • National Science Foundation of China

Conference

CVDL 2024
