Adversarial learning based intermediate feature refinement for semantic segmentation

Applied Intelligence

Abstract

Semantic image segmentation is a computer vision task that demands both accuracy and efficiency. Most current deep learning based semantic segmentation methods need extensive computational resources, and knowledge distillation can reduce this burden by compressing the model. Unlike previous knowledge distillation methods, which transfer the teacher network's knowledge directly to the student network, we propose a novel intermediate feature refinement method for semantic segmentation based on adversarial learning. During distillation, the method reduces the erroneous and redundant information contained in the teacher network, enhances the correct information, and transfers it to the student network. We also improve the conventional discriminator used in adversarial learning so that the student network aligns with more of the correct intermediate features of the teacher network. Our method brings the feature distribution of the student network closer to that of the teacher network and thereby improves the student's segmentation performance. We conducted experiments on three popular benchmarks, Pascal VOC, Cityscapes and CamVid, to verify the effectiveness of the proposed method. Compared with a competitive baseline, it improves the performance of the student network by up to 1.43 percentage points (mIoU increases from 67.14% to 68.57% on the Cityscapes val set).
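
The reported gain is measured in mean intersection over union (mIoU). As a minimal sketch of how this metric is commonly computed, assuming the standard per-class definition IoU = TP / (TP + FP + FN) averaged over the classes that occur, the following pure-Python example may help. The function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the prediction."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU, skipping classes with no pixels at all."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]                                        # true positives
        fn = sum(matrix[c]) - tp                                 # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp  # wrongly labelled as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical labels for six pixels and three classes).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(pred, target, num_classes=3))
print(f"mIoU = {score:.4f}")

# The abstract's gain is an absolute difference between two mIoU values.
gain = round(68.57 - 67.14, 2)
print(f"gain = {gain} percentage points")
```

In this sketch, classes that appear in neither the prediction nor the ground truth are left out of the average. Benchmark toolkits may treat ignored labels and absent classes differently, so the numbers in the abstract should be read against each dataset's official evaluation protocol.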

Acknowledgment

This work is supported by the National Key R&D Program of China (2020YFA0713503), the Project of Department of Science and Technology of Hunan Province (2020GK2036), the Project of Shanghai Municipal Science and Technology Commission (19511120900), the National Natural Science Foundation of China (61773330) and the Aeronautical Science Foundation of China (20200020114004).

Author information

Corresponding author

Correspondence to Yan Zhou.

Ethics declarations

Conflict of Interests

The authors declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, D., Yuan, Z., Ouyang, W. et al. Adversarial learning based intermediate feature refinement for semantic segmentation. Appl Intell 53, 14775–14791 (2023). https://doi.org/10.1007/s10489-022-04107-w

  • DOI: https://doi.org/10.1007/s10489-022-04107-w
