BSAM: Bidirectional Scene-Aware Mixup for Unsupervised Domain Adaptation in Semantic Segmentation

  • Conference paper
Artificial Intelligence (CICAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13604)

Abstract

Unsupervised domain adaptation for semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Existing mixup methods usually paste parts of source-domain images onto target-domain images. However, they often neglect the scene consistency of the generated images, which results in wrong semantic relationships. To address this issue, we propose a Bidirectional Scene-Aware Mixup (BSAM) method. BSAM adopts a bidirectional pasting strategy to ensure scene awareness between the two domains and takes the correctness of semantic relationships into account. Specifically, BSAM selects contextually related classes from each domain, pastes them into the other domain, and generates bidirectional fused images for training. BSAM ensures the correct scene layout, which helps the model adapt to the different scenario characteristics. Extensive experiments on two benchmarks (GTA5 to Cityscapes and SYNTHIA to Cityscapes) demonstrate that BSAM achieves state-of-the-art performance.
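
The bidirectional pasting idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it only shows generic class-mask pasting applied in both directions. The function names, the use of pseudo-labels for the unlabeled target image, and the class sets passed in are all assumptions made for illustration; how BSAM actually chooses contextually related classes is not specified in the abstract.

```python
import numpy as np


def class_mask(label, class_ids):
    """Boolean H x W mask of pixels whose label is in class_ids."""
    return np.isin(label, list(class_ids))


def paste(src_img, src_lbl, dst_img, dst_lbl, class_ids):
    """Copy the pixels of the chosen classes from src onto dst.

    Images are H x W x C arrays, labels are H x W integer arrays.
    Returns the mixed image and the matching mixed label map.
    """
    mask = class_mask(src_lbl, class_ids)
    mixed_img = np.where(mask[..., None], src_img, dst_img)
    mixed_lbl = np.where(mask, src_lbl, dst_lbl)
    return mixed_img, mixed_lbl


def bidirectional_mix(img_s, lbl_s, img_t, pseudo_t, s2t_classes, t2s_classes):
    """Build one fused sample per direction.

    ASSUMPTION: pseudo_t stands in for target labels (e.g. teacher
    predictions), since the target domain is unlabeled. The class sets
    are hypothetical inputs; the paper's selection rule is not modeled.
    """
    source_to_target = paste(img_s, lbl_s, img_t, pseudo_t, s2t_classes)
    target_to_source = paste(img_t, pseudo_t, img_s, lbl_s, t2s_classes)
    return source_to_target, target_to_source
```

Calling `bidirectional_mix` with a source image, its labels, a target image, and target pseudo-labels returns two (image, label) pairs, one per pasting direction, which could then both be fed to a segmentation loss.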

Author information

Corresponding author

Correspondence to Lefei Zhang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Xing, C., Li, G., Zhang, L. (2022). BSAM: Bidirectional Scene-Aware Mixup for Unsupervised Domain Adaptation in Semantic Segmentation. In: Fang, L., Povey, D., Zhai, G., Mei, T., Wang, R. (eds) Artificial Intelligence. CICAI 2022. Lecture Notes in Computer Science, vol 13604. Springer, Cham. https://doi.org/10.1007/978-3-031-20497-5_5

  • DOI: https://doi.org/10.1007/978-3-031-20497-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20496-8

  • Online ISBN: 978-3-031-20497-5

  • eBook Packages: Computer Science (R0)
