A Survey of 3D Indoor Scene Synthesis

Zhang, Song-Hai; Zhang, Shao-Kui; Liang, Yuan; Hall, Peter

doi:10.1007/s11390-019-1929-5

Survey
Published: 10 May 2019

Journal of Computer Science and Technology
Volume 34, pages 594–608 (2019)

Song-Hai Zhang¹,², Shao-Kui Zhang¹, Yuan Liang¹ & Peter Hall³

Abstract

Indoor scene synthesis has become a popular topic in recent years. Synthesizing functional and plausible indoor scenes is inherently difficult, since it requires considerable knowledge both to choose reasonable object categories and to arrange the objects appropriately. In this survey, we propose four criteria that group a wide range of 3D (three-dimensional) indoor scene synthesis techniques according to various aspects (specifically, four groups of categories). The survey also offers guidance by comprehensively comparing the techniques to show their effectiveness and drawbacks, and it discusses potential remaining problems.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Song-Hai Zhang, Shao-Kui Zhang & Yuan Liang
Beijing National Research Center for Information Science and Technology (BNRist), Beijing, 100084, China
Song-Hai Zhang
Department of Computer Science, University of Bath, Claverton Down, Bath, BA2 7AY, U.K.
Peter Hall

Corresponding author

Correspondence to Song-Hai Zhang.

Electronic supplementary material

ESM 1 (PDF 220 kb)

About this article

Cite this article

Zhang, SH., Zhang, SK., Liang, Y. et al. A Survey of 3D Indoor Scene Synthesis. J. Comput. Sci. Technol. 34, 594–608 (2019). https://doi.org/10.1007/s11390-019-1929-5

Received: 15 March 2019
Revised: 17 April 2019
Published: 10 May 2019
Issue Date: May 2019
DOI: https://doi.org/10.1007/s11390-019-1929-5
