Abstract
Unsupervised domain adaptation is a popular approach to cross-domain person re-identification (re-ID). Two solutions build on this approach. The first builds a model that transforms data across two domains, so that source-domain data can be transferred to the target domain and a re-ID model can be trained on the rich source-domain data. The second trains a re-ID model on target-domain data with corresponding virtual labels. Both solutions have clear constraints. The first relies heavily on the quality of the data transformation model; moreover, the final re-ID model is trained on source-domain data and lacks knowledge of the target domain. The second in effect mixes target-domain data carrying virtual labels with source-domain data carrying true annotations, but such a simple mixture does not adequately account for the raw information gap between the data of the two domains. Background differences between domains can contribute substantially to this gap. In this paper, a Suppression of Background Shift Generative Adversarial Network (SBSGAN) is proposed to mitigate the data gap between two domains. To tackle the constraints of the first solution, this paper proposes a Densely Associated 2-Stream (DA-2S) network with an update strategy that learns discriminative ID features from the generated data, taking into account both human body information and useful ID-related cues in the environment. The resulting re-ID model is further updated using target-domain data with corresponding virtual labels. Extensive evaluations on three large benchmark datasets show the effectiveness of the proposed method.
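The abstract does not say how virtual labels are obtained for target-domain data. As a rough illustration only, the sketch below assumes they come from clustering target-domain feature vectors with plain k-means, so that each cluster index stands in for an identity label. The function names, the farthest-first initialization, and the toy features are all hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_virtual_labels(features, num_clusters, num_iters=20, seed=0):
    """Cluster unlabeled feature vectors with plain k-means.

    Returns one cluster index per feature vector. These indices play the
    role of "virtual labels" in this sketch; the paper's actual labeling
    procedure may differ.
    """
    rng = random.Random(seed)
    # Farthest-first initialization: start from a random sample, then keep
    # adding the sample farthest from the centroids chosen so far.
    centroids = [list(features[rng.randrange(len(features))])]
    while len(centroids) < num_clusters:
        farthest = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(features)
    for _ in range(num_iters):
        # Assignment step: each sample takes the index of its nearest centroid.
        labels = [
            min(range(num_clusters), key=lambda k: sq_dist(f, centroids[k]))
            for f in features
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(num_clusters):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy features (assumed): two tight groups standing in for two identities.
toy_features = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
                [5.0, 5.1], [5.1, 5.0], [4.9, 5.0]]
virtual_labels = assign_virtual_labels(toy_features, num_clusters=2)
```

In this toy case the first three samples share one virtual label and the last three share the other. Which group receives index 0 depends on the random seed, so only the grouping is meaningful, not the index values.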
Acknowledgements
This research was supported in part by the Australian Government Research Training Program Scholarship and in part by the Beijing Institute of Technology Research Fund Program for Young Scholars.
Additional information
Communicated by Subhasis Chaudhuri.
About this article
Cite this article
Huang, Y., Wu, Q., Xu, J. et al. Unsupervised Domain Adaptation with Background Shift Mitigating for Person Re-Identification. Int J Comput Vis 129, 2244–2263 (2021). https://doi.org/10.1007/s11263-021-01474-8
DOI: https://doi.org/10.1007/s11263-021-01474-8