AtlantaNet: Inferring the 3D Indoor Layout from a Single \(360^\circ \) Image Beyond the Manhattan World Assumption

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12353)

Abstract

We introduce a novel end-to-end approach to predict a 3D room layout from a single panoramic image. Compared to recent state-of-the-art works, our method is not limited to Manhattan World environments, and can reconstruct rooms bounded by vertical walls that do not form right angles or are curved – i.e., Atlanta World models. In our approach, we project the original gravity-aligned panoramic image on two horizontal planes, one above and one below the camera. This representation encodes all the information needed to recover the Atlanta World 3D bounding surfaces of the room in the form of a 2D room footprint on the floor plan and a room height. To predict the 3D layout, we propose an encoder-decoder neural network architecture, leveraging Recurrent Neural Networks (RNNs) to capture long-range geometric patterns, and exploiting a customized training strategy based on domain-specific knowledge. The experimental results demonstrate that our method outperforms state-of-the-art solutions in prediction accuracy, in particular in cases of complex wall layouts or curved wall footprints.
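The projection onto two horizontal planes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an equirectangular panorama, a camera at the origin with the z axis aligned to gravity, and hypothetical helper names (`pixel_to_direction`, `project_to_horizontal_plane`) and plane heights chosen only for illustration.

```python
import math


def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit viewing direction.

    Assumed convention (not taken from the paper): z points up,
    longitude 0 is at the image centre, latitude +pi/2 is the top row.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.cos(lat) * math.cos(lon)
    z = math.sin(lat)
    return x, y, z


def project_to_horizontal_plane(direction, plane_z):
    """Intersect a ray from the camera (at the origin) with the plane z = plane_z.

    plane_z > 0 is a plane above the camera (ceiling side),
    plane_z < 0 is a plane below it (floor side).
    Returns the (X, Y) footprint coordinates, or None when the ray
    points away from the plane or is parallel to it.
    """
    x, y, z = direction
    if z * plane_z <= 0.0:
        return None
    t = plane_z / z  # ray parameter at which the plane is reached
    return x * t, y * t


# Hypothetical example: a pixel 45 degrees below the horizon, straight ahead,
# with the floor plane assumed to be 1.6 units below the camera.
d = pixel_to_direction(u=512, v=384, width=1024, height=512)
floor_point = project_to_horizontal_plane(d, plane_z=-1.6)
ceiling_point = project_to_horizontal_plane(d, plane_z=1.0)
```

Under these assumptions, pixels below the horizon land on the lower plane and pixels above it land on the upper plane, so the wall-floor and wall-ceiling boundaries each trace a 2D footprint. Comparing the two footprints is one way the room height could be related to the camera height.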



Acknowledgments

This work has received funding from Sardinian Regional Authorities under projects VIGECLAB, AMAC, and TDM (POR FESR 2014-2020). We also acknowledge the contribution of the European Union’s H2020 research and innovation programme under grant agreement 813170 (EVOCATION).

Author information

Corresponding author

Correspondence to Giovanni Pintore.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Pintore, G., Agus, M., Gobbetti, E. (2020). AtlantaNet: Inferring the 3D Indoor Layout from a Single \(360^\circ \) Image Beyond the Manhattan World Assumption. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_26

  • DOI: https://doi.org/10.1007/978-3-030-58598-3_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58597-6

  • Online ISBN: 978-3-030-58598-3

  • eBook Packages: Computer Science (R0)
