Learn Robust Pedestrian Representation Within Minimal Modality Discrepancy for Visible-Infrared Person Re-Identification

Liu, Yu-Jie; Shao, Wen-Bin; Sun, Xiao-Rui

doi:10.1007/s11390-022-2146-1

Regular Paper
Published: 31 May 2022

Journal of Computer Science and Technology
Volume 37, pages 641–651 (2022)

Yu-Jie Liu¹,
Wen-Bin Shao¹ &
Xiao-Rui Sun¹

Abstract

Visible-infrared person re-identification has attracted extensive attention because of its promising applications in video surveillance. Visible and infrared images differ substantially because they are produced by different imaging mechanisms. Existing studies alleviate this modality discrepancy by aligning the modality distributions or by extracting modality-shared features from the original images. However, they overlook a simple solution: converting visible images directly to gray images, which reduces the modality discrepancy efficiently and effectively. In this paper, we recast the cross-modality person re-identification task from visible-infrared matching to gray-infrared matching, a setting we call the minimal modality discrepancy. In addition, we propose a pyramid feature integration network (PFINet), which mines discriminative, refined features from pedestrian images and fuses high-level, semantically strong features to build a robust pedestrian representation. Specifically, PFINet first extracts features from concrete to abstract and then performs top-down semantic transfer to obtain multi-scale feature maps. Second, the multi-scale feature maps are fed into the discriminative-region response module, which uses a spatial attention mechanism to emphasize identity-discriminative regions. Finally, the pedestrian representation is obtained by feature integration. Extensive experiments demonstrate the effectiveness of PFINet, which achieves a rank-1 accuracy of 81.95% and an mAP of 74.49% in the multi-all evaluation mode of the SYSU-MM01 dataset.
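The visible-to-gray conversion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which conversion formula the authors use, so the ITU-R BT.601 luma weights below are an assumption, as are the function name `visible_to_gray` and the option to replicate the gray channel three times (a common convenience when a backbone expects three-channel input, not something stated in the source).

```python
import numpy as np

# Assumed luma weights (ITU-R BT.601). The abstract does not specify
# the conversion, so this is one plausible choice, not the authors' method.
BT601_WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def visible_to_gray(rgb, keep_three_channels=True):
    """Convert an H x W x 3 uint8 RGB image to grayscale.

    If keep_three_channels is True, the gray channel is replicated so the
    result is still H x W x 3 (hypothetical convenience for 3-channel models).
    """
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError("expected an H x W x 3 RGB array")
    # Weighted sum over the channel axis gives an H x W luminance map.
    gray = rgb.astype(np.float32) @ BT601_WEIGHTS
    gray = np.clip(np.rint(gray), 0, 255).astype(np.uint8)
    if keep_three_channels:
        return np.repeat(gray[..., None], 3, axis=-1)
    return gray


# Usage: a tiny 1 x 2 image with one pure-red and one white pixel.
image = np.array([[[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)
gray_image = visible_to_gray(image)
```

Under these assumptions, the gray image would replace the visible image at both training and test time, so that matching is carried out between gray and infrared inputs rather than between color and infrared inputs.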

Author information

Authors and Affiliations

College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
Yu-Jie Liu, Wen-Bin Shao & Xiao-Rui Sun

Corresponding author

Correspondence to Wen-Bin Shao.

Supplementary Information

ESM 1

(PDF 261 kb)

About this article

Cite this article

Liu, YJ., Shao, WB. & Sun, XR. Learn Robust Pedestrian Representation Within Minimal Modality Discrepancy for Visible-Infrared Person Re-Identification. J. Comput. Sci. Technol. 37, 641–651 (2022). https://doi.org/10.1007/s11390-022-2146-1

Received: 06 January 2022
Accepted: 14 April 2022
Published: 31 May 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s11390-022-2146-1
