Hierarchical contrastive adaptation for cross-domain object detection

Deng, Ziwei; Kong, Quan; Akira, Naoto; Yoshinaga, Tomoaki

doi:10.1007/s00138-022-01317-7

Original Paper
Published: 09 July 2022

Volume 33, article number 62, (2022)

Machine Vision and Applications

Ziwei Deng¹ (ORCID: orcid.org/0000-0002-3989-1424), Quan Kong¹, Naoto Akira¹ & Tomoaki Yoshinaga¹

Abstract

Object detection based on deep learning has advanced rapidly in recent years. However, applying a detector trained on a label-rich domain to an unseen domain causes a performance drop because of domain shift. To address this problem, we propose a novel unsupervised domain adaptation method that adapts from a labeled source domain to an unlabeled target domain. Recent approaches based on adversarial learning are somewhat effective at aligning the feature distributions of different domains, but for the complex detection task the decision boundary remains strongly source-biased when the model is trained only with source labels and alignment is applied to the entire feature distribution. In this paper, we use image translation to generate translated images of the source and target domains, which fills in the large domain gap and facilitates paired adaptation. We propose a hierarchical contrastive adaptation method between the original and translated domains to encourage the detector to learn domain-invariant yet discriminative features. To emphasize foreground instances and handle the noise in translated images, we further propose foreground attention reweighting for instance-aware adaptation. Experiments on three cross-domain detection scenarios show that our method achieves state-of-the-art results compared with other approaches, demonstrating its effectiveness.
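The abstract does not give the loss formulation, so the following is only a minimal sketch of the general idea of contrasting paired original and translated features with per-pair foreground weights. It assumes a standard InfoNCE-style objective, which may differ from the authors' actual loss. The function name `weighted_info_nce`, the `fg_weight` argument, and the temperature value are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_info_nce(orig, trans, fg_weight, temperature=0.1):
    """InfoNCE-style loss over paired features (illustrative assumption).

    orig, trans: (N, D) arrays; row i of `trans` is taken to be the
        translated-domain counterpart of row i of `orig`.
    fg_weight: (N,) non-negative weights, a hypothetical stand-in for
        foreground attention scores.
    """
    # L2-normalise so the dot product is a cosine similarity.
    orig = orig / np.linalg.norm(orig, axis=1, keepdims=True)
    trans = trans / np.linalg.norm(trans, axis=1, keepdims=True)

    # Similarity of every original feature to every translated feature.
    logits = orig @ trans.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i is column i (its own translated counterpart).
    per_pair = -np.diag(log_prob)

    # Reweight pairs so that high-weight (foreground) pairs dominate.
    weights = fg_weight / fg_weight.sum()
    return float((weights * per_pair).sum())
```

In a hierarchical setting, a function of this kind could be evaluated separately on features taken from different levels of the detector (for example, image-level and instance-level features) and the results summed; which levels are used and how they are weighted is not stated in the abstract.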

Author information

Authors and Affiliations

Lumada Data Science Lab, Hitachi, Ltd., 1-280 Higashi-koigakubo, Kokubunji-shi, Tokyo, 185-8601, Japan
Ziwei Deng, Quan Kong, Naoto Akira & Tomoaki Yoshinaga

Corresponding author

Correspondence to Ziwei Deng.

About this article

Cite this article

Deng, Z., Kong, Q., Akira, N. et al. Hierarchical contrastive adaptation for cross-domain object detection. Machine Vision and Applications 33, 62 (2022). https://doi.org/10.1007/s00138-022-01317-7

Received: 12 March 2021
Revised: 18 April 2022
Accepted: 15 June 2022
Published: 09 July 2022
DOI: https://doi.org/10.1007/s00138-022-01317-7
