Adversarial learning based intermediate feature refinement for semantic segmentation

Wang, Dongli; Yuan, Zhitian; Ouyang, Wanli; Li, Baopu; Zhou, Yan

doi:10.1007/s10489-022-04107-w

Published: 03 November 2022

Applied Intelligence, Volume 53, pages 14775–14791 (2023)

Dongli Wang¹, Zhitian Yuan¹, Wanli Ouyang², Baopu Li³ & Yan Zhou¹ (ORCID: orcid.org/0000-0002-2372-4947)

Abstract

Image semantic segmentation is a computer vision task that demands both accuracy and efficiency. Most current deep-learning-based semantic segmentation methods need extensive computational resources, and knowledge distillation can reduce this computational burden through model compression. In this paper, unlike previous knowledge distillation methods that directly transfer the knowledge of the teacher network to the student network, we propose a novel intermediate feature refinement method for semantic segmentation based on adversarial learning. The method reduces the erroneous and redundant information contained in the teacher network during knowledge distillation, enhances the correct information contained in the teacher network, and transfers it to the student network. We then improve the conventional discriminator in adversarial learning to help the student network align with more of the correct intermediate features in the teacher network. Our method brings the feature distribution of the student network closer to that of the teacher network and thereby improves the segmentation performance of the student network. Finally, we conduct experiments on three popular benchmarks, Pascal VOC, Cityscapes and CamVid, to verify the effectiveness of the proposed method. Compared with a competitive baseline, our method improves the performance of the student network by up to 1.43 percentage points (mIoU increases from 67.14% to 68.57% on the Cityscapes val set).
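The reported gain is stated in mean intersection over union (mIoU). As a reading aid, the following is a minimal sketch of how mIoU can be computed for flattened label sequences. It is not the authors' evaluation code: the function name `mean_iou`, the ignore label 255, and the toy labels are assumptions made for illustration, and benchmark evaluations typically accumulate intersections and unions over the whole validation set rather than over a single image.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flattened per-pixel class labels.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    inter: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            inter[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class and,
            # if the prediction is a valid class, of the predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)

# Gain quoted in the abstract, in percentage points.
gain = round(68.57 - 67.14, 2)
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1, so `score` is their average, 13/18. The difference between the two mIoU values quoted in the abstract is an absolute difference in percentage points, not a relative improvement.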

Acknowledgment

This work is supported by the National Key R&D Program of China (2020YFA0713503), the Project of Department of Science and Technology of Hunan Province (2020GK2036), the Project of Shanghai Municipal Science and Technology Commission (19511120900), the National Natural Science Foundation of China (61773330) and the Aeronautical Science Foundation of China (20200020114004).

Author information

Authors and Affiliations

Xiangtan University, Xiangtan, People’s Republic of China
Dongli Wang, Zhitian Yuan & Yan Zhou
The University of Sydney, Sydney, Australia
Wanli Ouyang
Baidu USA, Sunnyvale, CA, USA
Baopu Li

Corresponding author

Correspondence to Yan Zhou.

Ethics declarations

Conflict of Interests

The authors declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, D., Yuan, Z., Ouyang, W. et al. Adversarial learning based intermediate feature refinement for semantic segmentation. Appl Intell 53, 14775–14791 (2023). https://doi.org/10.1007/s10489-022-04107-w

Accepted: 23 August 2022
Published: 03 November 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s10489-022-04107-w
