
Semantic Segmentation of High-Resolution Aerial Imagery with W-Net Models

  • Conference paper

Progress in Artificial Intelligence (EPIA 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11805)

Abstract

The semantic segmentation of high-resolution aerial images concerns the task of determining, for each pixel, the most likely class label from a finite set of possible labels (e.g., discriminating pixels referring to roads, buildings, or vegetation in high-resolution images depicting urban areas). Following recent work on fully-convolutional neural networks for semantic segmentation, we evaluated the performance of an adapted version of the W-Net architecture, which has achieved very good results on other types of image segmentation tasks. Through experiments with two distinct datasets frequently used in previous studies in the area, we show that the proposed W-Net architecture is quite effective in this task, outperforming a baseline corresponding to the U-Net model, as well as several other recently proposed approaches.
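The abstract frames semantic segmentation as choosing, for each pixel, the most likely label from a finite set, and reports results in terms of per-pixel agreement with ground truth. The following minimal sketch illustrates that framing only; the label set, score values, and helper names are invented for illustration and are not taken from the paper's W-Net models:

```python
# Illustrative sketch of the per-pixel classification task: given a map of
# class scores for each pixel, pick the highest-scoring label (argmax), then
# measure pixel accuracy against a ground-truth label map.

LABELS = ["road", "building", "vegetation"]  # hypothetical label set

def segment(score_map):
    """Per-pixel argmax over an H x W x len(LABELS) nested list of scores."""
    return [[max(range(len(LABELS)), key=lambda c: pixel[c]) for pixel in row]
            for row in score_map]

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = sum(len(row) for row in truth)
    correct = sum(p == t
                  for pred_row, truth_row in zip(pred, truth)
                  for p, t in zip(pred_row, truth_row))
    return correct / total

# Toy 2x2 "image": each pixel holds one score per class.
scores = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]],
]
pred = segment(scores)             # -> [[0, 1], [2, 0]]
truth = [[0, 1], [2, 1]]
acc = pixel_accuracy(pred, truth)  # 3 of 4 pixels correct -> 0.75
```

In the actual models, the per-pixel scores would come from the final softmax layer of the fully-convolutional network; the argmax-then-score pattern above is the common evaluation procedure regardless of which architecture produces the scores.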


Notes

  1. http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-vaihingen.html

  2. http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-potsdam.html

  3. http://project.inria.fr/aerialimagelabeling

  4. http://keras.io

  5. http://github.com/martamaria96/deep-wnet

  6. http://www2.isprs.org/commissions/comm2/wg4/potsdam-2d-semantic-labeling.html

  7. http://www2.isprs.org/commissions/comm2/wg4/vaihingen-2d-semantic-labeling-contest.html


Acknowledgements

This research was supported by Fundação para a Ciência e a Tecnologia (FCT), through the project grants with references PTDC/EEI-SCR/1743/2014 (Saturn), PTDC/CTA-OHR/29360/2017 (RiverCure), and PTDC/CCI-CIF/32607/2017 (MIMU), as well as through the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2019). We also gratefully acknowledge the support of NVIDIA Corporation, with the donation of two Titan Xp GPUs used in the experiments reported in this paper.

Author information

Correspondence to Maria Dias.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Dias, M., Monteiro, J., Estima, J., Silva, J., Martins, B. (2019). Semantic Segmentation of High-Resolution Aerial Imagery with W-Net Models. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science(), vol 11805. Springer, Cham. https://doi.org/10.1007/978-3-030-30244-3_40

  • DOI: https://doi.org/10.1007/978-3-030-30244-3_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30243-6

  • Online ISBN: 978-3-030-30244-3
