AtlantaNet: Inferring the 3D Indoor Layout from a Single $360^\circ$ Image Beyond the Manhattan World Assumption

Pintore, Giovanni; Agus, Marco; Gobbetti, Enrico

doi:10.1007/978-3-030-58598-3_26

Conference paper
First Online: 07 November 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12353)

Abstract

We introduce a novel end-to-end approach to predict a 3D room layout from a single panoramic image. Compared to recent state-of-the-art works, our method is not limited to Manhattan World environments, and can reconstruct rooms bounded by vertical walls that do not form right angles or are curved – i.e., Atlanta World models. In our approach, we project the original gravity-aligned panoramic image on two horizontal planes, one above and one below the camera. This representation encodes all the information needed to recover the Atlanta World 3D bounding surfaces of the room in the form of a 2D room footprint on the floor plan and a room height. To predict the 3D layout, we propose an encoder-decoder neural network architecture, leveraging Recurrent Neural Networks (RNNs) to capture long-range geometric patterns, and exploiting a customized training strategy based on domain-specific knowledge. The experimental results demonstrate that our method outperforms state-of-the-art solutions in prediction accuracy, in particular in cases of complex wall layouts or curved wall footprints.
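
The projection onto two horizontal planes described in the abstract can be illustrated with a small geometric sketch. This is not the authors' implementation: it assumes an equirectangular panorama, a camera at the origin with the z-axis pointing up, and hypothetical function names and plane heights chosen only for illustration. Each viewing ray is intersected with a plane at height `plane_z` (negative for a plane below the camera, positive for one above it).

```python
import numpy as np


def pixel_to_angles(u, v, width, height):
    """Map equirectangular pixel centres to (longitude, latitude) in radians.

    Assumed convention: longitude in [-pi, pi), latitude +pi/2 at the top row.
    """
    lon = (np.asarray(u, dtype=float) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.asarray(v, dtype=float) + 0.5) / height * np.pi
    return lon, lat


def project_to_horizontal_plane(lon, lat, plane_z):
    """Intersect viewing rays with the horizontal plane z = plane_z.

    Returns (x, y, valid). Rays that point away from the plane
    (or run parallel to it) are marked invalid and yield NaN coordinates.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    dz = np.sin(lat)                      # vertical component of the unit ray
    valid = dz * plane_z > 1e-9           # ray must head toward the plane
    t = np.where(valid, plane_z / np.where(valid, dz, 1.0), np.nan)
    x = t * np.cos(lat) * np.cos(lon)
    y = t * np.cos(lat) * np.sin(lon)
    return x, y, valid


# A ray looking 45 degrees downward hits a plane 1.6 units below the camera
# at a horizontal distance equal to that height.
x, y, valid = project_to_horizontal_plane(0.0, -np.pi / 4.0, plane_z=-1.6)
```

Under these assumptions, pixels below the horizon map onto the lower plane and pixels above it onto the upper plane, so the vertical walls of the room collapse onto a 2D footprint in each projection.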

Acknowledgments

This work has received funding from Sardinian Regional Authorities under projects VIGECLAB, AMAC, and TDM (POR FESR 2014-2020). We also acknowledge the contribution of the European Union’s H2020 research and innovation programme under grant agreement 813170 (EVOCATION).

Author information

Authors and Affiliations

Visual Computing, CRS4, Cagliari, Italy
Giovanni Pintore, Marco Agus & Enrico Gobbetti
College of Science and Engineering, HBKU, Doha, Qatar
Marco Agus

Corresponding author

Correspondence to Giovanni Pintore.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 15578 KB)

Cite this paper

Pintore, G., Agus, M., Gobbetti, E. (2020). AtlantaNet: Inferring the 3D Indoor Layout from a Single $360^\circ$ Image Beyond the Manhattan World Assumption. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_26

DOI: https://doi.org/10.1007/978-3-030-58598-3_26
Published: 07 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58597-6
Online ISBN: 978-3-030-58598-3
eBook Packages: Computer Science, Computer Science (R0)
