Abstract
A common way to view a 360° video on a 2D display is to crop and render a part of the video as a normal field-of-view (NFoV) video. While users can enjoy natural-looking NFoV videos with this approach, they must constantly adjust the viewing direction by hand so as not to miss interesting events in the video. In this paper, we propose an interactive and automatic navigation system for comfortable 360° video playback. Our system finds a virtual camera path that passes through the most salient areas of the video, generates an NFoV video based on the path, and plays it in an online manner. A user can interactively change the viewing direction while watching the video, and the system instantly updates the path to reflect the user's intention. To enable online processing, we design our system as two stages: an offline pre-processing step and an online 360° video navigation step. The pre-processing step computes optical flow and saliency scores for an input video. Based on these, the online navigation step computes an optimal camera path that reflects user interaction and plays an NFoV video in an online manner. For improved user experience, we also introduce optical flow-based camera path planning, saliency-aware path update, and adaptive control of the temporal window size. Our experimental results, including user studies, show that our system provides a more pleasant 360° video watching experience than existing approaches.
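The core idea described above, planning a camera path that balances per-frame saliency against temporal smoothness, can be sketched as follows. This is a minimal 1D illustration only: `plan_camera_path`, the energy weights, and the gradient-descent solver are assumptions for illustration, not the paper's actual optimizer, and the real system additionally handles latitude, optical flow, user interaction, and online windowing.

```python
import numpy as np

def plan_camera_path(saliency, smooth_weight=5.0, n_iters=200):
    """Plan a 1D longitude path over time from per-frame saliency.

    saliency: (T, W) array, saliency accumulated over latitude for each
    longitude bin of an equirectangular frame.
    Returns per-frame longitude indices (floats) that trade off being
    close to the most salient longitude against temporal smoothness.
    """
    T, W = saliency.shape
    # Data term target: the most salient longitude in each frame.
    target = saliency.argmax(axis=1).astype(float)
    # Unwrap across the 360° seam so smoothing is meaningful.
    target = np.unwrap(target * (2 * np.pi / W)) * (W / (2 * np.pi))
    path = target.copy()
    lr = 0.05  # small enough for the quadratic energy to converge
    for _ in range(n_iters):
        # Gradient of 0.5 * sum (path - target)^2.
        data_grad = path - target
        # Gradient of 0.5 * sum (path[i+1] - path[i])^2 (interior frames).
        smooth_grad = np.zeros_like(path)
        smooth_grad[1:-1] = 2 * path[1:-1] - path[:-2] - path[2:]
        path -= lr * (data_grad + smooth_weight * smooth_grad)
    return path % W
```

Because the energy is quadratic, an event that moves smoothly across the panorama yields a path that tracks it closely, while jittery saliency maxima are damped by the smoothness term; increasing `smooth_weight` trades responsiveness for stability, loosely mirroring the adaptive temporal window size mentioned in the abstract.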