Facade Layout Completion with Long Short-Term Memory Networks

Hensel, Simon; Goebbels, Steffen; Kada, Martin

doi:10.1007/978-3-031-25477-2_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1691)

Included in the following conference series:

International Joint Conference on Computer Vision, Imaging and Computer Graphics

Abstract

In a workflow for creating 3D city models, building facades can be reconstructed from oblique aerial images for which the extrinsic and intrinsic parameters are known. If the wall planes have already been determined, e.g., from airborne laser scanning point clouds, facade textures can be computed by applying a perspective transform. Given these textures, doors and windows can be detected and then added to the 3D model. In this study, the “Scaled YOLOv4” neural network is applied to detect facade objects. However, due to occlusions and artifacts from perspective correction, generally not all windows and doors are detected. The pattern of facade objects therefore has to be continued automatically into occluded or distorted areas. To this end, we propose a new approach based on recurrent neural networks. In addition to applying the Multi-Dimensional Long Short-Term Memory network and the Quasi-Recurrent Neural Network, we also use a novel architecture, the Rotated Multi-Dimensional Long Short-Term Memory network. This architecture combines four two-dimensional Multi-Dimensional Long Short-Term Memory networks on rotated images. Independently of the 3D city model workflow, the three networks were additionally tested on the Graz50 dataset, on which the Rotated Multi-Dimensional Long Short-Term Memory network delivered better results than the other two networks. The facade texture regions in which windows and doors are added to the set of initially detected facade objects are likely to be occluded or distorted. Before 3D models are equipped with these textures, inpainting should be applied to these regions, which thus serve as automatically obtained inpainting masks.
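The two structural ideas in the abstract, processing rotated copies of a facade layout and deriving inpainting masks from newly added objects, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `rotated_ensemble`, `inpainting_mask`, and `toy_scan` are hypothetical, `toy_scan` is a placeholder for a trained directional network, and averaging the four outputs is an assumption, since the abstract does not state how the four networks are combined.

```python
import numpy as np


def rotated_ensemble(layout, scan_fn):
    """Run scan_fn on all four 90-degree rotations and average the results.

    scan_fn is a hypothetical stand-in for one trained directional network;
    it must return an array with the same shape as its input.
    """
    outputs = []
    for k in range(4):
        rotated = np.rot90(layout, k)          # rotate input by k * 90 degrees
        scanned = scan_fn(rotated)             # directional pass on the rotated view
        outputs.append(np.rot90(scanned, -k))  # rotate the result back
    return np.mean(outputs, axis=0)            # assumed combination rule: mean


def inpainting_mask(detected, completed):
    """Mark cells that are objects after completion but were not detected."""
    detected = np.asarray(detected, dtype=bool)
    completed = np.asarray(completed, dtype=bool)
    return completed & ~detected


def toy_scan(grid):
    # Placeholder for a learned model: running maximum along each row,
    # so information flows in one direction only (left to right).
    return np.maximum.accumulate(grid.astype(float), axis=1)


# Toy 3x4 layout grid: 1 = window/door cell found by the detector.
detected = np.array([
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
])

scores = rotated_ensemble(detected, toy_scan)
completed = scores >= 0.5                    # illustrative threshold, not from the paper
mask = inpainting_mask(detected, completed)  # cells added by the completion step
```

Rotating the input lets a single one-directional model see the layout from all four sides, and the mask marks exactly the cells that the completion step added beyond the detector output.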

Notes

1. https://cityjson.org/specs/ (accessed: 2023/01/21 09:34:06).
2. https://github.com/SimonHensel/LSTM-Facade-Completion.

Author information

Authors and Affiliations

Institute for Pattern Recognition, Niederrhein University of Applied Sciences, Reinarzstrasse 49, Krefeld, Germany
Simon Hensel & Steffen Goebbels
Institute of Geodesy and Geoinformation Science, Technische Universität Berlin, Kaiserin-Augusta-Allee 104-106, Berlin, Germany
Martin Kada

Corresponding author

Correspondence to Simon Hensel.

Editor information

Editors and Affiliations

University of Porto, Porto, Portugal
A. Augusto de Sousa
Czech Technical University in Prague, Prague, Czech Republic
Vlastimil Havran
Mines ParisTech, Paris, France
Alexis Paljic
Davidson College, Davidson, NC, USA
Tabitha Peck
French Civil Aviation University (ENAC), Toulouse, France
Christophe Hurter
Monash University, Melbourne, Australia
Helen Purchase
University of Catania, Catania, Italy
Giovanni Maria Farinella
University of Barcelona, Barcelona, Spain
Petia Radeva
IRISA, University of Rennes 1, Rennes, France
Kadi Bouatouch

About this paper

Cite this paper

Hensel, S., Goebbels, S., Kada, M. (2023). Facade Layout Completion with Long Short-Term Memory Networks. In: de Sousa, A.A., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2021. Communications in Computer and Information Science, vol 1691. Springer, Cham. https://doi.org/10.1007/978-3-031-25477-2_2

DOI: https://doi.org/10.1007/978-3-031-25477-2_2
Published: 02 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25476-5
Online ISBN: 978-3-031-25477-2
eBook Packages: Computer Science, Computer Science (R0)
