Impact of data smoothing on semantic segmentation

  • S.I. : WorldCIST’20
Neural Computing and Applications

Abstract

Semantic segmentation is the task of classifying each pixel of an image. Current state-of-the-art semantic segmentation techniques use end-to-end trainable deep models. Generally, the training of these models is controlled by external hyper-parameters rather than by the variation in the data. In this paper, we investigate the impact of data smoothing on the training and generalization of deep semantic segmentation models. We propose a mechanism for selecting the level of smoothing that gives the best generalization. Furthermore, a smoothing layer is included in the deep semantic segmentation models to adjust the level of smoothing automatically. Extensive experiments are performed to validate the effectiveness of the proposed smoothing strategies.
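The abstract does not say which smoothing filter or selection rule is used, so the following is only a minimal sketch of the general idea under stated assumptions: smoothing is taken to be a Gaussian blur, and the "best level" is taken to be the candidate sigma with the highest validation score. The names `smooth_image`, `select_smoothing_level`, and `validation_score` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=3.0):
    """Return a normalized 1-D Gaussian kernel for the given sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def smooth_image(image, sigma):
    """Blur a 2-D grayscale image with a separable Gaussian filter.

    Assumption: Gaussian blur stands in for the paper's smoothing.
    sigma <= 0 means "no smoothing" and returns a float copy of the input.
    """
    image = np.asarray(image, dtype=np.float64)
    if sigma <= 0:
        return image.copy()
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Reflect-pad so the output keeps the input shape.
    padded = np.pad(image, radius, mode="reflect")
    # Filter along each row, then along each column.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def select_smoothing_level(candidate_sigmas, validation_score):
    """Pick the sigma whose validation score is highest.

    validation_score is a hypothetical callable: given a sigma, it would
    train (or fine-tune) a segmentation model on data smoothed at that
    level and return a validation metric where higher is better.
    """
    scores = {sigma: float(validation_score(sigma)) for sigma in candidate_sigmas}
    best_sigma = max(scores, key=scores.get)
    return best_sigma, scores
```

In practice `validation_score` would wrap a full training and evaluation run, for example returning the mean intersection-over-union on a held-out set. A learnable smoothing layer inside the network, as the abstract mentions, would replace this outer search with a parameter updated during training; that variant is not sketched here.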

Author information

Corresponding author

Correspondence to Ahmad Khan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ul Haq, N., ur Rehman, Z., Khan, A. et al. Impact of data smoothing on semantic segmentation. Neural Comput & Applic 34, 8345–8354 (2022). https://doi.org/10.1007/s00521-020-05341-4

  • DOI: https://doi.org/10.1007/s00521-020-05341-4
