Abstract
Maintaining a public register by extracting urban objects from photos taken in the city is one of the most important tasks for municipal services. It matters for protecting and shaping the cultural landscape, protecting monuments, and recording the development of the urban fabric. The current state of the art shows that deep learning (DL) models can extract urban objects with performance equal to or better than that of non-DL models, and can process video and photos automatically. This paper compares three main DL models for façade instance detection and façade segmentation: Mask R-CNN, YOLACT, and Mask Scoring R-CNN. The training and validation datasets used for transfer learning were built from spherical photos taken in an artificially generated virtual city. The test dataset, in contrast, consisted of spherical façade photos taken in a real city. The comparative analysis of the DL models was performed using parametric and nonparametric statistical tests for pairwise and multiple comparisons.
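The abstract does not name the specific tests used. As a minimal sketch of what a nonparametric multiple comparison of three models can look like, the block below assumes a Friedman test over per-image scores followed by a Holm correction of pairwise p-values. The function names, the choice of Friedman and Holm, and the toy numbers are illustrative assumptions, not the authors' procedure or results.

```python
import math


def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_three_models(scores):
    """Friedman test for exactly three models (hypothetical helper).

    scores: one row per test image, each row holding one metric value per model.
    Returns (statistic, p_value). No tie correction is applied, and the p-value
    uses the chi-square approximation with 2 degrees of freedom, whose survival
    function is exp(-x / 2).
    """
    k = 3
    n = len(scores)
    rank_sums = [0.0] * k
    for row in scores:
        if len(row) != k:
            raise ValueError("each row must hold one score per model")
        for model, rank in enumerate(average_ranks(list(row))):
            rank_sums[model] += rank
    statistic = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return statistic, math.exp(-statistic / 2.0)


def holm_adjust(p_values):
    """Holm step-down adjustment of p-values from pairwise comparisons."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for position, idx in enumerate(order):
        # Multiply the i-th smallest p-value by (m - i) and enforce monotonicity.
        running_max = max(running_max, (m - position) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


if __name__ == "__main__":
    # Toy placeholder scores (NOT results from the paper): rows are images,
    # columns are three hypothetical models.
    toy_scores = [(0.9, 0.5, 0.1), (0.8, 0.6, 0.2), (0.7, 0.4, 0.3), (0.9, 0.6, 0.5)]
    print(friedman_three_models(toy_scores))
    # Toy pairwise p-values, e.g. from three Wilcoxon signed-rank tests.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

In practice, library routines such as those in SciPy or scikit-posthocs would usually replace these hand-written helpers; the sketch only makes the omnibus-then-pairwise structure explicit.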
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kutrzyński, M., Żak, B., Telec, Z., Trawiński, B. (2021). Deep Learning Models for Architectural Façade Detection in Spherical Images. In: Nguyen, N.T., Iliadis, L., Maglogiannis, I., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2021. Lecture Notes in Computer Science, vol 12876. Springer, Cham. https://doi.org/10.1007/978-3-030-88081-1_40
DOI: https://doi.org/10.1007/978-3-030-88081-1_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88080-4
Online ISBN: 978-3-030-88081-1
eBook Packages: Computer Science, Computer Science (R0)