Object detection based on semi-supervised domain adaptation for imbalanced domain resources

Original Paper · Machine Vision and Applications

Abstract

In certain scenarios, models trained on a specific dataset (the source domain) can generalize to novel scenes (the target domain) via knowledge transfer. However, such source-trained detectors may align poorly with a low-resource target domain because of the imbalanced and inconsistent domain shift involved. In this paper, we propose a semi-supervised detector that adapts to the domain shift at both the appearance and semantic levels. To this end, two components are introduced: appearance adaptation networks built on instance and batch normalization, and semantic adaptation networks in which an adversarial transfer procedure re-weights the discriminator loss to improve feature alignment between two domains of imbalanced scale. Furthermore, a self-paced training procedure re-trains the detector by alternately generating pseudo-labels in the target domain from easy to hard. In our experiments, we empirically evaluate the proposed framework on several datasets, including Cityscapes and VOC0712, and the results confirm the higher accuracy and effectiveness of the proposed detector in comparison with state-of-the-art detectors.
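The abstract outlines three mechanisms. As a rough illustration only (this is not the authors' released code), the minimal PyTorch-style sketch below shows what each could look like: a hypothetical IBNBlock that splits channels between instance and batch normalization for appearance-level adaptation, a discriminator loss re-weighted by the relative sizes of the two domains for semantic-level adaptation under imbalanced scales, and a confidence-thresholded, easy-to-hard pseudo-label filter for self-paced re-training. All class names, weighting schemes, and threshold values here are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class IBNBlock(nn.Module):
    """Appearance-level adaptation (sketch): apply instance normalization to half
    of the channels (style/appearance statistics) and batch normalization to the
    other half (content statistics), in the spirit of IBN-style layers."""

    def __init__(self, channels):
        super().__init__()
        self.half = channels // 2
        self.instance_norm = nn.InstanceNorm2d(self.half, affine=True)
        self.batch_norm = nn.BatchNorm2d(channels - self.half)

    def forward(self, x):
        x_in, x_bn = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.instance_norm(x_in), self.batch_norm(x_bn)], dim=1)


def reweighted_adversarial_loss(disc_src, disc_tgt, n_src, n_tgt):
    """Semantic-level adaptation (sketch): re-weight the domain-discriminator loss
    so the smaller (target) domain contributes as strongly as the larger source
    domain, compensating for imbalanced domain scales."""
    w_src = n_tgt / (n_src + n_tgt)
    w_tgt = n_src / (n_src + n_tgt)
    loss_src = F.binary_cross_entropy_with_logits(
        disc_src, torch.zeros_like(disc_src))  # source samples labelled 0
    loss_tgt = F.binary_cross_entropy_with_logits(
        disc_tgt, torch.ones_like(disc_tgt))   # target samples labelled 1
    return w_src * loss_src + w_tgt * loss_tgt


def select_pseudo_labels(detections, round_idx, start_thresh=0.9, step=0.05):
    """Self-paced pseudo-labelling (sketch): keep only high-confidence target
    detections, relaxing the threshold each retraining round (easy to hard)."""
    thresh = max(start_thresh - round_idx * step, 0.5)
    return [d for d in detections if d["score"] >= thresh]
```

In such a scheme, the selected pseudo-labels would be fed back into detector training each round, with the threshold schedule controlling how quickly harder target samples are admitted.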



Acknowledgements

This research was funded by the National Natural Science Foundation of China (61563025, 61562053, 61762056), the Science and Technology Project of the Yunnan Science and Technology Department (2016FB109, 2017FB094), and the Scientific Research Foundation of the Yunnan Education Department (2017ZZX149).

Author information


Corresponding author

Correspondence to Meng Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Li, W., Wang, M., Wang, H. et al. Object detection based on semi-supervised domain adaptation for imbalanced domain resources. Machine Vision and Applications 31, 18 (2020). https://doi.org/10.1007/s00138-020-01068-3

