Abstract
The semantic segmentation of high-resolution aerial images concerns the task of determining, for each pixel, the most likely class label from a finite set of possible labels (e.g., discriminating pixels referring to roads, buildings, or vegetation, in high-resolution images depicting urban areas). Following recent work in the area related to the use of fully-convolutional neural networks for semantic segmentation, we evaluated the performance of an adapted version of the W-Net architecture, which has achieved very good results on other types of image segmentation tasks. Through experiments with two distinct datasets frequently used in previous studies in the area, we show that the proposed W-Net architecture is quite effective in this task, outperforming a baseline corresponding to the U-Net model, and also some of the other recently proposed approaches.
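The per-pixel labelling task that the abstract describes can be made concrete with a minimal sketch (plain NumPy; the class set, array shapes, and helper names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def predict_labels(scores):
    """scores: (H, W, C) array of per-class scores for each pixel.
    Returns an (H, W) map assigning each pixel its most likely class label."""
    return np.argmax(scores, axis=-1)

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference label."""
    return float(np.mean(pred == truth))

# Toy example: a 4x4 image with 3 classes (e.g., road, building, vegetation),
# with the per-pixel scores standing in for the output of a segmentation network.
rng = np.random.default_rng(0)
scores = rng.random((4, 4, 3))
labels = predict_labels(scores)
```

In a real pipeline the `scores` tensor would be produced by a fully-convolutional model such as U-Net or W-Net, and evaluation would typically also report per-class metrics rather than overall pixel accuracy alone.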
References
Xia, X., Kulis, B.: W-Net: a deep model for fully unsupervised image segmentation. arXiv preprint arXiv:1711.08506 (2017)
Chen, W., et al.: W-Net: Bridged U-Net for 2D medical image segmentation. arXiv preprint arXiv:1807.04459 (2018)
Audebert, N., Le Saux, B., Lefèvre, S.: Beyond RGB: very high resolution urban remote sensing with multimodal deep networks. ISPRS J. Photogramm. Remote Sens. 140, 20–32 (2018)
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Mou, L., Zhu, X.X.: RiFCN: recurrent network in fully convolutional network for semantic segmentation of high resolution remote sensing images. arXiv preprint arXiv:1805.02091 (2018)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5188–5196 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations (2015)
Chen, G., Zhang, X., Wang, Q., Dai, F., Gong, Y., Zhu, K.: Symmetrical dense-shortcut deep fully convolutional networks for semantic segmentation of very-high-resolution remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11(5), 1633–1644 (2018)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Liu, Y., Fan, B., Wang, L., Bai, J., Xiang, S., Pan, C.: Semantic labeling in very high resolution images via a self-cascaded convolutional neural network. ISPRS J. Photogramm. Remote Sens. 145, 78–95 (2018)
Mnih, V.: Machine learning for aerial image labeling. Ph.D. thesis, University of Toronto (2013)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Jain, A.K.: Fundamentals of Digital Image Processing. Prentice Hall, Upper Saddle River (1989)
Xu, B., Huang, R., Li, M.: Revise saturated activation functions. arXiv preprint arXiv:1602.05980 (2016)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the International Conference on Machine Learning, pp. 448–456 (2015)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
Sudre, C.H., Li, W., Vercauteren, T., Ourselin, S., Jorge Cardoso, M.: Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In: Cardoso, M.J., et al. (eds.) DLMIA/ML-CDS -2017. LNCS, vol. 10553, pp. 240–248. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67558-9_28
He, T., Zhang, Z., Zhang, H., Zhang, Z., Xie, J., Li, M.: Bag of tricks for image classification with convolutional neural networks. arXiv preprint arXiv:1812.01187 (2018)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the International Conference on Learning Representations (2014)
Smith, L.N.: Cyclical learning rates for training neural networks. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, pp. 464–472 (2017)
Forbes, T., He, Y., Mudur, S., Poullis, C.: Aggregated residual convolutional neural network for multi-label pixel wise classification of geospatial features. In: Online Abstracts of the ISPRS Benchmark on Urban Object Classification and 3D Building Reconstruction (2018)
Lin, G., Shen, C., van den Hengel, A., Reid, I.: Efficient piecewise training of deep structured models for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3194–3203 (2016)
Nogueira, K., Mura, M.D., Chanussot, J., Schwartz, W.R., Santos, J.A.: Dynamic multi-scale segmentation of remote sensing images based on convolutional networks. arXiv preprint arXiv:1804.04020 (2018)
Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., Bengio, Y.: The one hundred layers tiramisu: fully convolutional densenets for semantic segmentation. In: Proceedings of the Workshops at the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1175–1183 (2017)
Li, X., Chen, H., Qi, X., Dou, Q., Fu, C.-W., Heng, P.-A.: H-DenseUNet: Hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Trans. Med. Imaging 37(12), 2663–2674 (2018)
Newell, A., Yang, K., Deng, J.: Stacked Hourglass networks for human pose estimation. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 483–499. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_29
Sun, T., Chen, Z., Yang, W., Wang, Y.: Stacked U-Nets with multi-output for road extraction. In: Proceedings of the Workshops at the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 187–192 (2018)
Khalel, A., El-Saban, M.: Automatic pixelwise object labeling for aerial imagery using stacked U-Nets. arXiv preprint arXiv:1803.04953 (2018)
Zhang, Z., Liu, Q., Wang, Y.: Road extraction by deep residual U-Net. IEEE Geosci. Remote Sens. Lett. 15(5), 749–753 (2018)
Zhang, J., Jin, Y., Xu, J., Xu, X., Zhang, Y.: MDU-Net: multi-scale densely connected U-Net for biomedical image segmentation. arXiv preprint arXiv:1812.00352 (2018)
Tang, Z., Peng, X., Geng, S., Zhu, Y., Metaxas, D.: CU-Net: coupled U-Nets. In: Proceedings of the British Machine Vision Conference, pp. 305–316 (2018)
Chen, Y., et al.: Drop an octave: reducing spatial redundancy in convolutional neural networks with octave convolution. arXiv preprint arXiv:1904.05049 (2019)
Bello, I., Zoph, B., Vaswani, A., Shlens, J., Le, Q.V.: Attention augmented convolutional networks. arXiv preprint arXiv:1904.09925 (2019)
Monteiro, J., Martins, B., Pires, J.M.: A hybrid approach for the spatial disaggregation of socio-economic indicators. Int. J. Data Sci. Anal. 5(2–3), 189–211 (2018)
Acknowledgements
This research was supported by Fundação para a Ciência e Tecnologia (FCT), through the project grants with references PTDC/EEI-SCR/1743/2014 (Saturn), PTDC/CTA-OHR/29360/2017 (RiverCure), and PTDC/CCI-CIF/32607/2017 (MIMU), as well as through the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2019). We also gratefully acknowledge the support of NVIDIA Corporation, with the donation of two Titan Xp GPUs used in the experiments reported in this paper.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Dias, M., Monteiro, J., Estima, J., Silva, J., Martins, B. (2019). Semantic Segmentation of High-Resolution Aerial Imagery with W-Net Models. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science(), vol 11805. Springer, Cham. https://doi.org/10.1007/978-3-030-30244-3_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30243-6
Online ISBN: 978-3-030-30244-3