
Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification

Published in: International Journal of Computer Vision

Abstract

Unsupervised domain adaptation is a popular approach to cross-domain person re-identification (re-ID), and two families of solutions follow from it. The first builds a model that transforms data across the two domains, so that source-domain data can be transferred into the target domain and a re-ID model can be trained on the rich transferred data. The second trains a re-ID model on target-domain data paired with corresponding virtual labels. The constraints of both solutions are clear. The first relies heavily on the quality of the data-transformation model, and the resulting re-ID model is trained on source-domain data and therefore lacks knowledge of the target domain. The second effectively mixes target-domain data carrying virtual labels with source-domain data carrying true annotations, but such a simple mixture does not account for the raw information gap between the two domains, a gap largely attributable to their background differences. In this paper, a Suppression of Background Shift Generative Adversarial Network (SBSGAN) is proposed to mitigate this data gap between the two domains. To tackle the constraints of the first solution, the paper further proposes a Densely Associated 2-Stream (DA-2S) network with an update strategy, which learns discriminative ID features from the generated data by considering both human-body information and useful ID-related cues in the environment. The resulting re-ID model is then updated using target-domain data with corresponding virtual labels. Extensive evaluations on three large benchmark datasets demonstrate the effectiveness of the proposed method.
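As a rough illustration of the two-stream idea described above, the sketch below shows a minimal PyTorch model. It is not the authors' DA-2S implementation: the backbone, the dense association between streams, the pairing of each crop with an SBSGAN-style background-suppressed version, and the pseudo-label update loop are all simplified or assumed here. One stream processes the original image (retaining environmental ID cues), the other processes the background-suppressed image (body-focused cues), and the fused features feed an ID classifier.

```python
# Illustrative sketch only, not the paper's code: a minimal two-stream re-ID
# model where each image is assumed to come paired with a background-suppressed
# version (e.g. produced by an SBSGAN-like generator).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Tiny convolutional block standing in for a full backbone (e.g. DenseNet).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class TwoStreamReID(nn.Module):
    """One stream sees the original image (environmental ID cues kept),
    the other sees the background-suppressed image (body-focused cues).
    The two feature vectors are fused before the ID classifier."""
    def __init__(self, num_ids):
        super().__init__()
        self.stream_full = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.stream_fg = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(64 * 2, num_ids)

    def forward(self, img_full, img_fg):
        f_full = self.pool(self.stream_full(img_full)).flatten(1)
        f_fg = self.pool(self.stream_fg(img_fg)).flatten(1)
        feat = torch.cat([f_full, f_fg], dim=1)  # fused ID embedding
        return feat, self.classifier(feat)

# Toy usage: one supervised training step on source-domain data.
model = TwoStreamReID(num_ids=751)      # e.g. Market-1501 has 751 training IDs
opt = torch.optim.SGD(model.parameters(), lr=0.01)
img_full = torch.randn(4, 3, 256, 128)  # original person crops
img_fg = torch.randn(4, 3, 256, 128)    # background-suppressed crops
labels = torch.randint(0, 751, (4,))
feat, logits = model(img_full, img_fg)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()
opt.step()
```

In the full method as the abstract describes it, a source-trained model of this kind would subsequently be updated on target-domain data whose virtual labels are assigned without ground-truth annotations; the details of that update strategy are given in the paper itself.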

Acknowledgements

This research was supported in part by the Australian Government Research Training Program Scholarship and in part by the Beijing Institute of Technology Research Fund Program for Young Scholars.

Author information

Corresponding author

Correspondence to Qiang Wu.

Additional information

Communicated by Subhasis Chaudhuri.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, Y., Wu, Q., Xu, J. et al. Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification. Int J Comput Vis 129, 2244–2263 (2021). https://doi.org/10.1007/s11263-021-01474-8
