ABSTRACT
Learning identity-aware, domain-invariant representations is crucial for domain-generalizable person re-identification (DG-ReID). Existing methods commonly rely on augmentation, either in feature space by mixing instance and batch normalization layers, or in pixel space by adversarially generating pseudo-domains. However, neither technique guarantees identity preservation. Moreover, beyond increasing the diversity of the training data, augmented positive pairs encode rich semantic relations that have not been fully explored. To address these issues, we propose a novel framework for Generalizable Person Re-identification using Domain Invariant Contrastive Techniques (G-PReDICT). Specifically, we use simple yet effective perturbation strategies to hallucinate positive samples across domains, realistically modelling domain variations while preserving the target identities. We harness the rich sample-to-sample relations among the hallucinated positive and negative pairs to learn domain-invariant representations with supervised contrastive learning. We also use a domain-independent auxiliary task, attribute prediction, to learn robust representations, and we introduce attribute annotations for two large-scale public benchmarks, CUHK-03 and MSMT17. Extensive experiments on standard benchmarks demonstrate the effectiveness of the proposed method.
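The abstract names supervised contrastive learning over original and hallucinated samples but does not state the loss. As a rough illustration only, the sketch below implements the commonly used supervised contrastive (SupCon) objective in plain Python: every other sample in the batch that shares the anchor's identity is treated as a positive, and all remaining samples act as negatives. The function names, the temperature of 0.1, and the toy two-dimensional features are assumptions made for this example, not details taken from the paper.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def _l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def supcon_loss(embeddings, identities, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of feature vectors; original and perturbed views
                of each image can simply be mixed into the same batch.
    identities: list of person IDs; samples sharing an ID are positives.
    temperature: softmax temperature (0.1 is an assumed value).
    """
    z = [_l2_normalize(e) for e in embeddings]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and identities[j] == identities[i]]
        if not positives:
            continue  # this anchor has no positive in the batch
        # Scaled cosine similarity between the anchor and every other sample.
        sims = {j: _dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability over the anchor's positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy batch: two tight clusters in feature space.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matched = supcon_loss(features, [0, 0, 1, 1])     # IDs follow the clusters
loss_mismatched = supcon_loss(features, [0, 1, 0, 1])  # IDs cut across clusters
print(loss_matched < loss_mismatched)
```

In this toy batch the loss is lower when identity labels agree with the feature clusters, which is the behaviour the objective rewards: features of the same identity are pulled together regardless of which view or domain they come from.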
Index Terms
- G-PReDICT: Generalizable Person Re-ID using Domain Invariant Contrastive Techniques