DOI: 10.1145/3394171.3413821
research-article

Class-Aware Modality Mix and Center-Guided Metric Learning for Visible-Thermal Person Re-Identification

Published: 12 October 2020

Abstract

Visible-thermal person re-identification (VT-REID) is an important and challenging task because 1) weak lighting environments are inevitably encountered in real-world settings and 2) the inter-modality discrepancy is severe. Most existing methods either aim to reduce the cross-modality gap at the pixel and feature levels or optimize the cross-modality network with metric learning techniques. However, few works have jointly considered these two aspects and studied their mutual benefits. In this paper, we design a novel framework that jointly bridges the modality gap at the pixel and feature levels without additional parameters and reduces inter- and intra-modality variations through a center-guided metric learning constraint. Specifically, we introduce the Class-aware Modality Mix (CMM) to generate internal information of the two modalities, reducing the modality gap at the pixel level. In addition, we exploit the KL-divergence to further align the modality distributions at the feature level. Furthermore, we propose an efficient Center-guided Metric Learning (CML) method that decreases the inter- and intra-modality discrepancy by enforcing constraints on class centers and instances. Extensive experiments on two datasets show the mutual advantage of the proposed components and demonstrate the superiority of our method over the state of the art.
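
The abstract names three ingredients but gives no formulas, so the following is only a rough sketch of the general ideas and not the authors' implementation. It assumes a mixup-style convex combination of a visible and a thermal image of the same identity (for CMM), a plain KL divergence between softmax outputs (for feature-level alignment), and a squared distance between per-identity class centers of the two modalities (for the center-guided constraint). All function names, the mixing weight lam, and the array shapes are hypothetical.

```python
import numpy as np


def class_aware_mix(visible, thermal, labels_v, labels_t, lam, rng):
    """Blend each visible image with a random thermal image of the same identity.

    Assumption: every identity in labels_v also occurs in labels_t.
    """
    mixed = np.empty_like(visible, dtype=np.float64)
    for i, label in enumerate(labels_v):
        candidates = np.flatnonzero(labels_t == label)
        j = rng.choice(candidates)
        # Convex combination in pixel space (mixup-style, assumed form).
        mixed[i] = lam * visible[i] + (1.0 - lam) * thermal[j]
    return mixed


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the rows of two probability matrices."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))


def class_centers(features, labels):
    """Return the sorted identity labels and the mean feature of each identity."""
    ids = np.unique(labels)
    centers = np.stack([features[labels == k].mean(axis=0) for k in ids])
    return ids, centers


def center_gap(feat_v, labels_v, feat_t, labels_t):
    """Mean squared distance between same-identity centers of the two modalities."""
    ids_v, centers_v = class_centers(feat_v, labels_v)
    ids_t, centers_t = class_centers(feat_t, labels_t)
    if not np.array_equal(ids_v, ids_t):
        raise ValueError("both modalities must contain the same identities")
    return float(np.mean(np.sum((centers_v - centers_t) ** 2, axis=1)))
```

In a training loop these terms would typically be computed per mini-batch, for example `class_aware_mix(vis_batch, thr_batch, ids, ids, lam=0.5, rng=np.random.default_rng(0))`, and added to an identity-classification loss. How the paper weights and combines them is not stated in the abstract.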

Supplementary Material

MP4 File (3394171.3413821.mp4)
Presentation video for the paper "Class-Aware Modality Mix and Center-Guided Metric Learning for Visible-Thermal Person Re-Identification".

    Published In

    MM '20: Proceedings of the 28th ACM International Conference on Multimedia
    October 2020
    4889 pages
    ISBN: 9781450379885
    DOI: 10.1145/3394171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross-modality
    2. metric learning
    3. person re-identification

    Qualifiers

    • Research-article

    Funding Sources

    • The Central Universities Xiamen University
    • China Postdoctoral Science Foundation Grant
    • National Natural Science Foundation of China

    Conference

    MM '20
    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 39
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 28 Feb 2025

    Citations

    Cited By

    • (2025) Disentangling Modality and Posture Factors: Memory-Attention and Orthogonal Decomposition for Visible-Infrared Person Re-Identification. IEEE Transactions on Neural Networks and Learning Systems 36:3 (5494-5508). DOI: 10.1109/TNNLS.2024.3384023. Online publication date: Mar-2025
    • (2025) CycleTrans: Learning Neutral Yet Discriminative Features via Cycle Construction for Visible-Infrared Person Re-Identification. IEEE Transactions on Neural Networks and Learning Systems 36:3 (5469-5479). DOI: 10.1109/TNNLS.2024.3382937. Online publication date: Mar-2025
    • (2025) Auxiliary Representation Guided Network for Visible-Infrared Person Re-Identification. IEEE Transactions on Multimedia 27 (340-355). DOI: 10.1109/TMM.2024.3521773. Online publication date: 2025
    • (2025) Advances in vehicle re-identification techniques: A survey. Neurocomputing 614 (128745). DOI: 10.1016/j.neucom.2024.128745. Online publication date: Jan-2025
    • (2024) Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval. IEEE Transactions on Multimedia 26 (10678-10691). DOI: 10.1109/TMM.2024.3410129. Online publication date: 1-Jan-2024
    • (2024) Cooperative Separation of Modality Shared-Specific Features for Visible-Infrared Person Re-Identification. IEEE Transactions on Multimedia 26 (8172-8183). DOI: 10.1109/TMM.2024.3377139. Online publication date: 13-Mar-2024
    • (2024) Tri-Level Modality-Information Disentanglement for Visible-Infrared Person Re-Identification. IEEE Transactions on Multimedia 26 (2700-2714). DOI: 10.1109/TMM.2023.3302132. Online publication date: 2024
    • (2024) Semi-Supervised Learning With Heterogeneous Distribution Consistency for Visible Infrared Person Re-Identification. IEEE Transactions on Image Processing 33 (3880-3892). DOI: 10.1109/TIP.2024.3414938. Online publication date: 2024
    • (2024) DMA: Dual Modality-Aware Alignment for Visible-Infrared Person Re-Identification. IEEE Transactions on Information Forensics and Security 19 (2696-2708). DOI: 10.1109/TIFS.2024.3352408. Online publication date: 2024
    • (2024) Dual-Adversarial Representation Disentanglement for Visible Infrared Person Re-Identification. IEEE Transactions on Information Forensics and Security 19 (2186-2200). DOI: 10.1109/TIFS.2023.3344289. Online publication date: 2024
