DOI: 10.1145/3503161.3547819
research-article

More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification

Published: 10 October 2022

Abstract

Occluded person re-identification (re-ID) is a long-standing challenge in surveillance systems. Most existing methods tackle it by aligning spatial features of human parts according to external semantic cues inferred from off-the-shelf semantic models (e.g., human parsing and pose estimation). However, there is a significant domain gap between the images in re-ID datasets and the images used to train the semantic models, which inevitably makes those semantic cues unreliable and degrades re-ID performance. Multi-source knowledge ensembles have proven effective for domain adaptation. Inspired by this, we propose a multi-source dynamic parsing attention (MSDPA) mechanism that leverages knowledge learned from different source datasets to generate reliable semantic cues, and dynamically integrates and adapts them in a self-supervised manner through attention. Specifically, we first design a parsing embedding module (PEM) that integrates the multi-source semantic cues and embeds them into the patch tokens through a voting procedure. To further exploit correlations among body parts with similar semantics, we design a dynamic parsing attention block (DPAB) that guides the aggregation of patch sequences with prior attention dynamically generated from human parsing results. Extensive experiments on occluded, partial, and holistic re-ID datasets show that MSDPA consistently achieves superior re-ID performance and outperforms state-of-the-art methods by large margins on occluded datasets.
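
The abstract does not specify how the voting procedure or the prior attention is computed, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes each source parser assigns one integer part label to every patch, fuses them by a per-patch majority vote, and builds a prior attention matrix in which each patch attends uniformly to patches sharing its voted label. The function names `vote_patch_labels` and `prior_attention` and the tie-breaking rule are hypothetical.

```python
from collections import Counter
from typing import List


def vote_patch_labels(source_labels: List[List[int]]) -> List[int]:
    """Fuse per-patch part labels from several parsers by majority vote.

    source_labels[s][p] is the label that source parser s gives to patch p.
    Ties go to the label seen first in source order (an arbitrary assumption).
    """
    num_patches = len(source_labels[0])
    voted = []
    for p in range(num_patches):
        counts = Counter(labels[p] for labels in source_labels)
        voted.append(counts.most_common(1)[0][0])
    return voted


def prior_attention(voted: List[int]) -> List[List[float]]:
    """Build a row-normalized prior: patch i attends only to same-label patches."""
    n = len(voted)
    prior = []
    for i in range(n):
        same = [1.0 if voted[j] == voted[i] else 0.0 for j in range(n)]
        total = sum(same)  # at least 1.0, because patch i matches itself
        prior.append([v / total for v in same])
    return prior


# Three hypothetical parsers labelling four patches.
sources = [
    [0, 1, 1, 2],
    [0, 1, 2, 2],
    [1, 1, 2, 2],
]
voted = vote_patch_labels(sources)
prior = prior_attention(voted)
```

In this toy setting the prior could be added to, or multiplied with, learned attention scores so that patches with similar semantics are aggregated together; which combination the paper uses is not stated in the abstract.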

    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. attention mechanism
    2. occlusion scene
    3. person re-ID

    Qualifiers

    • Research-article

    Funding Sources

    • Shenzhen General Research Project

    Conference

    MM '22

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (Last 12 months): 64
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 05 Mar 2025

    Cited By

    • (2025) Former: Fusion of Camera-specific Class Activation Map matters for occluded person re-identification. Information Fusion. DOI: 10.1016/j.inffus.2025.103011 (103011). Online publication date: Feb-2025
    • (2025) Label-guided diversified learning model for occluded person re-identification. Expert Systems with Applications 272 (126745). DOI: 10.1016/j.eswa.2025.126745. Online publication date: May-2025
    • (2024) Person in Uniforms Re-Identification. ACM Transactions on Multimedia Computing, Communications, and Applications 21:2 (1-23). DOI: 10.1145/3703839. Online publication date: 11-Nov-2024
    • (2024) ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification. Proceedings of the 32nd ACM International Conference on Multimedia (1583-1592). DOI: 10.1145/3664647.3680958. Online publication date: 28-Oct-2024
    • (2024) Attention-Based Deep Spiking Neural Networks for Temporal Credit Assignment Problems. IEEE Transactions on Neural Networks and Learning Systems 35:8 (10301-10311). DOI: 10.1109/TNNLS.2023.3240176. Online publication date: Aug-2024
    • (2024) Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification. IEEE Transactions on Image Processing 33 (3200-3211). DOI: 10.1109/TIP.2024.3393360. Online publication date: 2024
    • (2024) Region Generation and Assessment Network for Occluded Person Re-Identification. IEEE Transactions on Information Forensics and Security 19 (120-132). DOI: 10.1109/TIFS.2023.3318956. Online publication date: 1-Jan-2024
    • (2024) Pedestrian 3D Shape Understanding for Person Re-Identification via Multi-View Learning. IEEE Transactions on Circuits and Systems for Video Technology 34:7 (5589-5602). DOI: 10.1109/TCSVT.2024.3358850. Online publication date: Jul-2024
    • (2024) A Multi-Level Relation-Aware Transformer model for occluded person re-identification. Neural Networks 177 (106382). DOI: 10.1016/j.neunet.2024.106382. Online publication date: Sep-2024
    • (2024) Self-selective receptive field network for person re-identification. Complex & Intelligent Systems 10:6 (7777-7797). DOI: 10.1007/s40747-024-01565-2. Online publication date: 5-Aug-2024