HybridNet: Integrating Multiple Approaches for Aerial Semantic Segmentation

Chouhan, Avinash; Sur, Arijit; Chutia, Dibyajyoti; Aggarwal, Shiv Prasad

doi:10.1007/s42979-023-02434-4

Original Research
Published: 27 December 2023

Volume 5, article number 133, (2024)

SN Computer Science

Avinash Chouhan¹,² (ORCID: orcid.org/0000-0002-8375-5374),
Arijit Sur²,
Dibyajyoti Chutia¹ &
Shiv Prasad Aggarwal¹

Abstract

In recent times, semantic segmentation of very-high-resolution (VHR) aerial images has become an emerging research topic due to its widespread applications in disaster management, environmental monitoring, natural resource mapping, etc. Semantic segmentation can be modeled as an image-to-image mapping problem in which pixel-level classification is required. Pixel-level classification is challenging for high-resolution aerial images because tiny objects occur at low frequency, and dense semantic labeling requires more detailed information about such objects. In general, encoder–decoder architectures for semantic segmentation suffer from information loss caused by the downsampling and upsampling process. To handle this, we extend a high-resolution network with dense connections to preserve the original resolution and improve parameter sharing. We also incorporate a lightweight self-attention module for positional attention, which results in better segmentation maps. Additionally, we use a deep voting module based on the generalized Hough transform to extract pixel dependencies. Experimental results show that the proposed model achieves the best mean intersection over union (mIoU) and overall accuracy in local and benchmark evaluations on the Vaihingen and Potsdam datasets.
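
The two metrics named in the abstract, mean intersection over union and overall accuracy, are commonly derived from a per-class confusion matrix. The following is a minimal sketch of that standard computation, not the authors' evaluation code: the function names, the toy label maps, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, and the benchmark protocol used in the paper may differ.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels for every (true class, predicted class) pair."""
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    # Encode each pair as a single index, then histogram the indices.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def overall_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither map are skipped (an assumption here).
    """
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    return (tp[valid] / union[valid]).mean()

# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
truth = [[0, 0, 1], [1, 2, 2]]
pred = [[0, 1, 1], [1, 2, 0]]
cm = confusion_matrix(truth, pred, num_classes=3)
print("overall accuracy:", overall_accuracy(cm))
print("mean IoU:", mean_iou(cm))
```

In this toy case four of six pixels are correct, and the per-class IoU values are 1/3, 2/3 and 1/2. Accumulating one confusion matrix over all test tiles before computing the metrics avoids averaging per-image scores, which can weight small tiles disproportionately.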

Availability of Data and Materials

All datasets used in this work are publicly available benchmark datasets. The implementation code will be shared via the GitHub link.

Notes

https://github.com/chouhan-avinash/HybridNet.

Acknowledgements

We would like to thank the International Society for Photogrammetry and Remote Sensing (ISPRS) for sharing 2D semantic segmentation benchmark datasets.

Funding

No funding was obtained for this study.

Author information

Authors and Affiliations

North Eastern Space Applications Centre, Umiam, Meghalaya, 793103, India
Avinash Chouhan, Dibyajyoti Chutia & Shiv Prasad Aggarwal
Department of Computer Science and Engineering, Indian Institute of Technology, Guwahati, Assam, 781039, India
Avinash Chouhan & Arijit Sur

Contributions

AC and AS have conceptualized the proposed scheme. The implementation and results are produced by AC. The validation of results is performed by AS and DJC. The manuscript has been prepared by AC and AS. The proofreading and finalization are completed by AS, DJC and SPA.

Corresponding author

Correspondence to Avinash Chouhan.

Ethics declarations

Conflict of interest

The authors do not have related financial or non-financial interests that need to be disclosed.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chouhan, A., Sur, A., Chutia, D. et al. HybridNet: Integrating Multiple Approaches for Aerial Semantic Segmentation. SN COMPUT. SCI. 5, 133 (2024). https://doi.org/10.1007/s42979-023-02434-4

Received: 10 October 2022
Accepted: 17 October 2023
Published: 27 December 2023
DOI: https://doi.org/10.1007/s42979-023-02434-4
