ScenePhotographer: Object-Oriented Photography for Residential Scenes

Authors: Shao-Kui Zhang, Yong-Liang Yang, Song-Hai Zhang

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 7843 - 7851

https://doi.org/10.1145/3664647.3680942

Published: 28 October 2024

Abstract

Humans understand digital 3D scenes by observing them from reasonably placed virtual cameras. Selecting camera views is fundamental for 3D scene applications but is typically done manually. Existing view-selection methods assume regular or polygonal room shapes and do not focus on the objects in the scene, so the resulting views frame those objects poorly. This paper introduces ScenePhotographer, an object-oriented framework for automatic view selection in residential scenes. A learning-based method generates candidate object-oriented views by clustering objects into groups according to their functional and spatial relationships. We propose four criteria to evaluate the views and recommend the best batch: room information, visibility, composition balance, and line dynamics. Each criterion scores a view according to its corresponding photography rule. Experiments on various room types and layouts demonstrate that our method generates views that focus on coherent objects while preserving aesthetics, leading to more visually pleasing results.
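The two stages named in the abstract, grouping objects and then scoring candidate views, can be illustrated with a minimal Python sketch. It is not the authors' method. The abstract describes a learning-based grouping that uses functional and spatial relationships, whereas this sketch substitutes plain single-linkage agglomerative clustering on floor-plane distance. The names `SceneObject`, `cluster_objects`, `rank_views`, the `max_gap` threshold, and the unweighted mean over the four criteria are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class SceneObject:
    name: str
    position: tuple[float, float]  # floor-plane coordinates (assumed, in metres)


def cluster_objects(objects, max_gap=1.5):
    """Single-linkage agglomerative clustering on floor-plane distance.

    Repeatedly merges the two closest groups until the closest pair is
    farther apart than max_gap (a hypothetical threshold).
    """
    groups = [[obj] for obj in objects]

    def gap(a, b):
        # Single linkage: distance between the closest members of two groups.
        return min(dist(p.position, q.position) for p in a for q in b)

    while len(groups) > 1:
        pairs = [(gap(groups[i], groups[j]), i, j)
                 for i in range(len(groups))
                 for j in range(i + 1, len(groups))]
        best_gap, i, j = min(pairs)
        if best_gap > max_gap:
            break
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


# The four criteria listed in the abstract; how they are combined is assumed.
CRITERIA = ("room_information", "visibility", "composition_balance", "line_dynamics")


def rank_views(view_scores, batch_size=2):
    """Return the batch_size views with the highest mean criterion score."""
    def total(name):
        return sum(view_scores[name][c] for c in CRITERIA) / len(CRITERIA)
    return sorted(view_scores, key=total, reverse=True)[:batch_size]


scene = [
    SceneObject("bed", (0.0, 0.0)),
    SceneObject("nightstand", (1.0, 0.0)),
    SceneObject("desk", (5.0, 4.0)),
    SceneObject("chair", (5.0, 5.0)),
]
groups = cluster_objects(scene)
print([[obj.name for obj in group] for group in groups])
# → [['bed', 'nightstand'], ['desk', 'chair']]
```

Each resulting group would then be the subject of one or more candidate views, and `rank_views` picks the best batch once every candidate has a score per criterion. A real system would presumably replace both the distance-only grouping and the equal weighting with learned or tuned components.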

Index Terms

1. Computing methodologies
  1. Computer graphics

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Shuimu Tsinghua Scholar Program
Young Elite Scientists Sponsorship Program by BAST
Tsinghua University Student Research Training
Tsinghua-Tencent Joint Laboratory for Internet Innovation Technology
National Key Research and Development Program of China
National Natural Science Foundation of China
UKRI grant CAMERA
Postdoctoral Fellowship Program of CPSF
China Postdoctoral Science Foundation

Conference

MM '24

Sponsor: SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
