
An annotated image database of building facades categorized into land uses for object detection using deep learning

Case study for the city of Vila Velha-ES, Brazil

  • Short Paper
  • Published in: Machine Vision and Applications

Abstract

This article presents a machine learning approach to automatic land use categorization based on a convolutional neural network architecture. It is intended to support the detection and classification of building facades in order to associate each building with its respective land use. Replacing the time-consuming manual acquisition of images in the field, and the subsequent interpretation of the data, with computer-aided techniques facilitates the creation of useful maps for urban planning. A specific future objective of this study is to monitor the commercial evolution of the city of Vila Velha, Brazil. The initial step is object detection based on a deep network architecture called Faster R-CNN. The model is trained on a collection of street-level photographs of buildings of the desired land uses, drawn from a database of annotated images of building facades. Images are extracted from Google Street View scenes. Furthermore, in order to save manual annotation time, a semi-supervised dual pipeline method is proposed that uses a predictor model pre-trained on the Places365 database to learn from unannotated images. Several backbones were connected to the Faster R-CNN architecture for comparison. The experimental results with the VGG backbone show an improvement over published works, with an average accuracy of 86.49%.
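The semi-supervised idea sketched in the abstract can be illustrated with a small, framework-free example. The core mechanism is confidence-thresholded pseudo-labeling: a pre-trained predictor scores each unannotated image, and only high-confidence predictions are promoted to training labels. The function names, data layout, and threshold value below are illustrative assumptions, not the authors' implementation; in the paper the predictor would be a CNN pre-trained on Places365.

```python
# Minimal sketch of confidence-thresholded pseudo-labeling, the core idea
# behind a semi-supervised labeling pipeline. `pretrained_predict` is a
# stand-in for a real pre-trained scene classifier (e.g. one trained on
# Places365); all names and the threshold are illustrative assumptions.

CONFIDENCE_THRESHOLD = 0.9  # accept a pseudo-label only above this score


def pretrained_predict(image):
    """Stand-in for a pretrained classifier: returns (label, confidence)."""
    # A real pipeline would run a CNN forward pass here; this stub just
    # echoes scores stored in the toy records below.
    return image["pred_label"], image["score"]


def pseudo_label(labeled, unlabeled, predict=pretrained_predict):
    """Extend the labeled set with high-confidence predictions."""
    extended = list(labeled)
    still_unlabeled = []
    for image in unlabeled:
        label, conf = predict(image)
        if conf >= CONFIDENCE_THRESHOLD:
            extended.append({**image, "label": label})  # trust the prediction
        else:
            still_unlabeled.append(image)  # keep for manual annotation
    return extended, still_unlabeled


if __name__ == "__main__":
    labeled = [{"id": 0, "label": "residential"}]
    unlabeled = [
        {"id": 1, "pred_label": "commercial", "score": 0.95},
        {"id": 2, "pred_label": "residential", "score": 0.60},
    ]
    extended, remaining = pseudo_label(labeled, unlabeled)
    print(len(extended), len(remaining))  # prints: 2 1
```

In practice the extended set would then be used to retrain (or fine-tune) the Faster R-CNN detector, and the threshold trades annotation savings against label noise: a higher threshold admits fewer but cleaner pseudo-labels.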




Notes

  1. A sheet with class mappings will be available when the dataset is published at github.com/fredericodb/FasterR-CNN-Facades.

  2. The code is available at http://github.com/fredericodb/FasterR-CNN-Facades.


Author information

Corresponding author

Correspondence to Frederico Damasceno Bortoloti.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bortoloti, F.D., Tavares, J., Rauber, T.W. et al. An annotated image database of building facades categorized into land uses for object detection using deep learning. Machine Vision and Applications 33, 80 (2022). https://doi.org/10.1007/s00138-022-01335-5

