SFSM: sensitive feature selection module for image semantic segmentation

Multimedia Tools and Applications

Abstract

One of the major challenges in image semantic segmentation is the loss of object detail caused by the extensive use of convolution and pooling operations, which blurs edges and lines and causes small objects to be ignored. To address these problems, we propose the sensitive feature selection module (SFSM), which uses the feature maps from prior convolution layers to learn, for each spatial location, the distribution characteristics of the pixel across the different channels. The learned weights are then used to reweight each pixel on the different channels, so that the network focuses better on object boundaries and small objects during feature extraction. Finally, the information obtained by SFSM is combined with the original features to further improve the feature representation and produce more accurate segmentation results. Experimental results show that SFSM can improve the performance of semantic segmentation networks. Integrating SFSM into FCN and DeepLabv3 yields accuracy improvements of 0.21% and 0.6%, respectively, on the PASCAL VOC 2012 dataset. On the Cityscapes dataset, where the segmentation task is relatively complicated, our improved networks still achieve excellent performance. To further verify that the module is not restricted to specific networks or datasets, we embed it into DoubleU-Net for a medical image segmentation task on the ISIC-2018 dataset and obtain a 0.16% accuracy improvement.
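The reweighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the weights are learned, so the sketch assumes a plain softmax over the channel values at each pixel as a stand-in for the learned weights, and assumes that "combined with the original features" means an element-wise residual sum. The function names `channel_softmax` and `sfsm_like_reweight` are hypothetical.

```python
import math


def channel_softmax(values):
    """Numerically stable softmax over one pixel's channel vector."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sfsm_like_reweight(features):
    """Reweight a feature map given as nested lists of shape [C][H][W].

    Assumption: the per-pixel channel weights come from a softmax over the
    raw channel values; the real module learns them from earlier layers.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            # Collect the values of this spatial location across all channels.
            pixel = [features[c][i][j] for c in range(channels)]
            weights = channel_softmax(pixel)
            for c in range(channels):
                # Reweighted feature added back to the original feature
                # (assumed residual combination).
                out[c][i][j] = pixel[c] + weights[c] * pixel[c]
    return out


# Toy feature map with 2 channels and 2x2 spatial size.
features = [
    [[1.0, 0.0], [2.0, 0.0]],
    [[1.0, 0.0], [0.0, 3.0]],
]
result = sfsm_like_reweight(features)
```

In this toy setting, a channel that responds strongly at a given pixel receives a larger weight there, so its response is amplified relative to the other channels, while the residual sum keeps the original features in the output.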

Acknowledgements

This work was financially supported by the National Natural Science Foundation of China under Grant 62172184, the Science and Technology Development Plan of Jilin Province of China under Grants 20200401077GX and 20200201292JC, the Social Science Research of the Education Department of Jilin Province (JJKH20210901SK), the Jilin Educational Scientific Research Leading Group (ZD21003), and the Humanities and Social Science Foundation of Changchun Normal University (2020[011]).

Author information

Corresponding author

Correspondence to Xiangjiu Che.

Ethics declarations

Conflict of interest

The authors declare that no conflicts of interest or competing interests exist.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gao, Y., Che, X., Liu, Q. et al. SFSM: sensitive feature selection module for image semantic segmentation. Multimed Tools Appl 82, 13905–13927 (2023). https://doi.org/10.1007/s11042-022-13901-0

  • DOI: https://doi.org/10.1007/s11042-022-13901-0
