Abstract
Deep neural network models are commonly used in computer vision problems such as image segmentation. Convolutional neural networks have long been the state-of-the-art methods in image processing, but newer architectures, such as Transformer-based approaches, have started to outperform them in many applications. However, these techniques are still rarely used in urban analyses, which are mostly performed manually. This paper presents a framework for semantic segmentation of residential buildings as a tool for automatic monitoring of urban phenomena. The method could improve urban decision-making by automating city analysis, which has the potential to be faster, and even more accurate, than analysis performed by human researchers. The study compares new deep network architectures with state-of-the-art solutions. The analysed problem is the segmentation of urban functional zones for urban sprawl evaluation through the construction of targeted land cover maps. The proposed method monitors the expansion of the city, which, if uncontrolled, can cause adverse effects. The method was tested on photos of three residential districts. The first district was manually segmented into functional zones and used for model training and evaluation. The other two districts were segmented automatically through model inference to test the robustness of the methodology. The test resulted in 98.2% accuracy.
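As an illustration of how an accuracy figure of this kind can be obtained, the following is a minimal sketch that compares a predicted segmentation mask with a manually labelled one. The abstract does not state which accuracy measure was used, so pixel-wise accuracy is an assumption here. The function names, the class ids (0 for background, 1 for residential buildings), and the toy masks are hypothetical and are not taken from the paper.

```python
from typing import Sequence

# A mask is a 2-D grid of integer class ids (hypothetical encoding:
# 0 = background, 1 = residential building).
Mask = Sequence[Sequence[int]]


def _check_same_shape(a: Mask, b: Mask) -> None:
    """Raise ValueError if the two masks do not have identical shapes."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same shape")


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted class equals the labelled class."""
    _check_same_shape(pred, truth)
    total = sum(len(row) for row in truth)
    if total == 0:
        raise ValueError("masks are empty")
    correct = sum(
        1
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
        if p == t
    )
    return correct / total


def class_share(mask: Mask, class_id: int) -> float:
    """Fraction of the mask covered by one class, e.g. residential buildings."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    covered = sum(1 for row in mask for value in row if value == class_id)
    return covered / total


# Toy 2x3 example: one pixel out of six is misclassified.
truth = [[0, 0, 1],
         [0, 1, 1]]
pred = [[0, 1, 1],
        [0, 1, 1]]

print(f"pixel accuracy: {pixel_accuracy(pred, truth):.3f}")
print(f"residential share (predicted): {class_share(pred, 1):.3f}")
```

Comparing the residential share of masks produced for the same district at different dates would be one simple way to quantify expansion, although the paper may use a different measure.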
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Łysak, A., Luckner, M. (2024). Deep Learning Residential Building Segmentation for Evaluation of Suburban Areas Development. In: Franco, L., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2024. ICCS 2024. Lecture Notes in Computer Science, vol 14838. Springer, Cham. https://doi.org/10.1007/978-3-031-63783-4_9
DOI: https://doi.org/10.1007/978-3-031-63783-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63785-8
Online ISBN: 978-3-031-63783-4
eBook Packages: Computer Science, Computer Science (R0)