Using contour loss constraining residual attention U-net on optical remote sensing interpretation

Yang, Peiqi; Wang, Mingjun; Yuan, Hao; He, Ci; Cong, Li

doi:10.1007/s00371-022-02590-3

Original article
Published: 19 July 2022

Volume 39, pages 4279–4291, (2023)

The Visual Computer

Peiqi Yang¹,
Mingjun Wang¹,
Hao Yuan² (ORCID: orcid.org/0000-0002-9061-9805),
Ci He³,⁴ &
…
Li Cong⁵

Abstract

Applying deep learning to remote sensing interpretation can substantially reduce human and material costs. Semantic segmentation, the main method for this task, automatically outlines objects and has recently achieved great success on remote sensing images. In practical remote sensing interpretation, however, contour accuracy largely determines how a result is evaluated. Although current loss functions reflect segmentation performance, they cannot guide the model toward more precise contours. This paper proposes an explicitly defined contour loss (CL) for remote sensing interpretation, with Residual Attention U-Net (RA U-Net) as the main framework. RA U-Net uses a residual attention module as the skip connection layer, which improves the judgment of U-Net. In CL, image processing methods extract the contours of the foreground, and element-wise sum and element-wise subtraction operations transfer the contour information into a matrix of the same size as the label image. These matrices then serve as the weights for cross-entropy (CE). By assigning different weights to elements in different regions, the loss guides the model toward a balance between accurate segmentation and precise contours. Experiments on open datasets show good performance. The proposed model was also trained on the Construction Disturbance Dataset, which was collected from Jiangxi Province, China, and labeled manually. Results improved markedly on the Construction Disturbance Dataset, and the IoU on the two datasets increased by \(1\%\) to \(2\%\) when CL was used as the loss function. The proposed method is also compared with other state-of-the-art methods, and the results show its effectiveness.
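The abstract describes the contour loss only in words, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes binary segmentation, uses a 4-neighbour erosion as a stand-in for the unspecified "image processing methods", and takes the contour as the label mask minus its erosion (an element-wise subtraction). The weight matrix is then a base weight plus a bonus on contour pixels (an element-wise sum). The function names and the values `base=1.0` and `contour_bonus=2.0` are hypothetical choices made for illustration.

```python
import math


def erode(mask):
    """4-neighbour binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            inside = all(
                0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 1
                for ny, nx in neighbours
            )
            out[y][x] = 1 if inside else 0
    return out


def contour_of(mask):
    """Inner boundary of the foreground: mask minus its erosion, element by element."""
    eroded = erode(mask)
    return [[m - e for m, e in zip(mrow, erow)] for mrow, erow in zip(mask, eroded)]


def weight_map(mask, base=1.0, contour_bonus=2.0):
    """Weight matrix with the label's shape: base everywhere, plus a bonus on the contour.

    The two constants are assumed values, not taken from the paper.
    """
    contour = contour_of(mask)
    return [[base + contour_bonus * c for c in row] for row in contour]


def weighted_bce(prob, mask, weights, eps=1e-7):
    """Binary cross-entropy where each pixel's term is scaled by its weight."""
    total, weight_sum = 0.0, 0.0
    for prow, mrow, wrow in zip(prob, mask, weights):
        for p, t, w in zip(prow, mrow, wrow):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -w * (t * math.log(p) + (1 - t) * math.log(1.0 - p))
            weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    # 5x5 label image with a 3x3 foreground square in the middle.
    mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
    weights = weight_map(mask)
    # A confident, correct prediction everywhere.
    prob = [[0.99 if t == 1 else 0.01 for t in row] for row in mask]
    print("loss for a near-perfect prediction:", weighted_bce(prob, mask, weights))
```

With weights like these, the same mistake costs more on a contour pixel than in the interior of an object, which is the balance between segmentation accuracy and contour precision that the abstract refers to.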

Data availability

The Mnih Massachusetts Building Dataset used and analyzed during the current study is available from the corresponding author on reasonable request. The Construction Disturbance Dataset is not publicly available until permission is granted by the provider.

Code availability

The relevant code is available.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2019YFE0196600), the National Natural Science Foundation of China (62072360, 61902292, 62001357, 62072359, 62172438), the Key Research and Development Plan of Shaanxi Province (2019ZDLGY13-07, 2019ZDLGY13-04, 2020JQ-844), the Natural Science Foundation of Guangdong Province of China (2022A1515010988), the Xi’an Science and Technology Plan (20RGZN0005), and the Xi’an Key Laboratory of Mobile Edge Computing and Security (201805052-ZD3CG36).

Author information

Authors and Affiliations

The School of Automation and Information Engineering, Xi’an University of Technology, Xi’an, 710048, China
Peiqi Yang & Mingjun Wang
The School of Electronic Engineering, Xidian University, Xi’an, 710071, China
Hao Yuan
Science and Technology on Communication Networks Laboratory, Shijiazhuang, 050000, China
Ci He
The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang, 050000, China
Ci He
State Grid JiLin Province Electric Power Company Limited Information Communication Company, Changchun, 130000, China
Li Cong

Contributions

PY performed the data analyses and wrote the manuscript. MW helped perform the analysis with constructive discussions. HY contributed to the conception of the study. CH contributed to analysis and manuscript preparation. LC contributed to analysis and manuscript review.

Corresponding author

Correspondence to Hao Yuan.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, P., Wang, M., Yuan, H. et al. Using contour loss constraining residual attention U-net on optical remote sensing interpretation. Vis Comput 39, 4279–4291 (2023). https://doi.org/10.1007/s00371-022-02590-3

Accepted: 06 June 2022
Published: 19 July 2022
Issue Date: September 2023
DOI: https://doi.org/10.1007/s00371-022-02590-3
