ABSTRACT
Time-lapse videos visualize how dynamic scenes change over time, presenting striking sights with dramatic variations in color appearance and rapid movement. We propose an aesthetics-driven virtual time-lapse photography framework that explores the automatic generation of time-lapse videos in virtual worlds, with potential applications such as artistic creation and entertainment in virtual spaces. We first define a set of shooting parameters that parameterize the time-lapse photography process, and then propose image, video, and time-lapse aesthetic assessments to optimize these parameters, making the process autonomous and adaptive. We also build an interactive interface that visualizes the shooting process and lets users conduct virtual time-lapse photography by adjusting the shooting parameters to their own aesthetic preferences. Finally, we present a two-stream time-lapse aesthetic model and a time-lapse aesthetic dataset for evaluating the aesthetic quality of time-lapse videos. Experimental results demonstrate that our method automatically generates time-lapse videos comparable to those of professional photographers, and does so more efficiently.
Index Terms
- Aesthetics-Driven Virtual Time-Lapse Photography Generation