ABSTRACT
Automatic composition optimization is a vital technique for computational photography systems. Balance in composition is one of the agreed-upon principles of aesthetics and is commonly employed as a visual feature in computational aesthetics studies. It refers to an equilibrium of visual weights within a composition. Existing composition optimization and aesthetic quality assessment systems use the saliency map to represent balance. However, saliency map methods fail to account for high-level visual features that are important for compositional balance. Our work establishes a framework for evaluating the relationship between visual features and compositional balance. This provides a better understanding of compositional balance and helps improve composition optimization performance. We created a dataset from a human subject study, with photos representing the main balance concepts: symmetric balance, dynamic balance, and imbalance. We take the visual center given by human subjects as the dependent variable and the center of mass of each type of visual feature as a predictor variable. Using a linear regression model, we assess how much each type of visual feature contributes to the prediction of the visual center. Our findings show that adding high-level visual elements to saliency maps yields a statistically significant increase in prediction accuracy. Specifically, the extra information provided by human detection and dominant vanishing point detection is statistically significant for assessing balance in the composition.
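The regression setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `center_of_mass` and `fit_visual_center`, the normalization of coordinates to [0, 1], the fallback to the image center for an empty map, and the choice to fit one coordinate at a time are all assumptions made for illustration, and the data below is synthetic. The sketch covers only the least-squares fit, not the significance testing reported in the abstract.

```python
import numpy as np


def center_of_mass(weight_map):
    """Return the (x, y) center of mass of a non-negative 2D map, scaled to [0, 1]."""
    w = np.asarray(weight_map, dtype=float)
    h, width = w.shape
    total = w.sum()
    if total == 0:
        # Assumption: an empty feature map falls back to the image center.
        return 0.5, 0.5
    ys, xs = np.mgrid[0:h, 0:width]
    cx = (xs * w).sum() / total
    cy = (ys * w).sum() / total
    return cx / max(width - 1, 1), cy / max(h - 1, 1)


def fit_visual_center(feature_centers, human_centers):
    """Ordinary least squares for one coordinate.

    feature_centers: (n_images, n_features) centers of mass, one column per
        feature type (e.g. saliency, detected humans, vanishing point).
    human_centers: (n_images,) visual centers reported by human subjects.
    Returns [intercept, coefficient per feature type].
    """
    F = np.asarray(feature_centers, dtype=float)
    X = np.column_stack([np.ones(len(F)), F])
    coef, *_ = np.linalg.lstsq(X, np.asarray(human_centers, dtype=float), rcond=None)
    return coef


# Synthetic usage: two hypothetical feature types, x coordinate only.
rng = np.random.default_rng(0)
features_x = rng.random((20, 2))
human_x = 0.1 + features_x @ np.array([0.7, 0.3])
coefficients = fit_visual_center(features_x, human_x)
```

Comparing the fit with and without a given feature column is one way to see how much that feature type adds beyond saliency alone.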
Index Terms
- Beyond Saliency: Assessing Visual Balance with High-level Cues