Abstract
In a workflow for creating 3D city models, building facades can be reconstructed from oblique aerial images for which the extrinsic and intrinsic parameters are known. If the wall planes have already been determined, e.g., from airborne laser scanning point clouds, facade textures can be computed by applying a perspective transform. Given these texture images, doors and windows can be detected and then added to the 3D model. In this study, the “Scaled YOLOv4” neural network is applied to detect facade objects. However, because of occlusions and artifacts from perspective correction, in general not all windows and doors are detected. The pattern of facade objects therefore has to be continued automatically into occluded or distorted areas. To this end, we propose a new approach based on recurrent neural networks. In addition to applying the Multi-Dimensional Long Short-Term Memory network and the Quasi-Recurrent Neural Network, we use a novel architecture, the Rotated Multi-Dimensional Long Short-Term Memory network, which combines four two-dimensional Multi-Dimensional Long Short-Term Memory networks applied to rotated images. Independently of the 3D city model workflow, the three networks were also tested on the Graz50 dataset, on which the Rotated Multi-Dimensional Long Short-Term Memory network delivered better results than the other two networks. The facade texture regions in which windows and doors are added to the set of initially detected facade objects are likely to be occluded or distorted. These regions can therefore serve as automatically obtained inpainting masks, and inpainting should be applied to them before the 3D models are equipped with the textures.
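The two ideas at the end of the abstract, combining one network's outputs over four rotations of the input and deriving inpainting masks from newly added facade objects, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the callable `predict` (a stand-in for one trained layout-completion network), the averaging of the four outputs, and the threshold of 0.5 are all assumptions made for illustration, since the abstract does not state how the four networks are combined.

```python
import numpy as np


def rotated_ensemble(layout, predict):
    """Run `predict` on the four 90-degree rotations of `layout`.

    `predict` is a hypothetical stand-in for a trained network and must
    return an array of the same shape as its input. Each output is rotated
    back to the original orientation; averaging the four results is an
    assumption of this sketch, not a detail taken from the paper.
    """
    outputs = []
    for k in range(4):
        rotated = np.rot90(layout, k)            # rotate input by k * 90 degrees
        completed = predict(rotated)             # complete the rotated layout
        outputs.append(np.rot90(completed, -k))  # undo the rotation
    return np.mean(outputs, axis=0)


def inpainting_mask(detected, completed, threshold=0.5):
    """Mark cells that are facade objects after completion but not before.

    Both inputs are 2D arrays of object scores in [0, 1]; the threshold
    is a hypothetical choice.
    """
    return (completed >= threshold) & ~(detected >= threshold)


if __name__ == "__main__":
    detected = np.array([[1.0, 0.0],
                         [0.0, 0.0]])
    completed = np.array([[1.0, 0.9],
                          [0.2, 0.6]])
    print(inpainting_mask(detected, completed))
```

In this toy example the mask selects only the two cells in the right-hand column, because they exceed the threshold after completion but were not part of the initial detections.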
Notes
1. https://cityjson.org/specs/ (accessed: 2023/01/21 09:34:06).
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Hensel, S., Goebbels, S., Kada, M. (2023). Facade Layout Completion with Long Short-Term Memory Networks. In: de Sousa, A.A., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2021. Communications in Computer and Information Science, vol 1691. Springer, Cham. https://doi.org/10.1007/978-3-031-25477-2_2
DOI: https://doi.org/10.1007/978-3-031-25477-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25476-5
Online ISBN: 978-3-031-25477-2
eBook Packages: Computer Science, Computer Science (R0)