DOI: 10.1145/3581783.3612257
research-article

A Blind Streaming System for Multi-client Online 6-DoF View Touring

Published: 27 October 2023

Abstract

Online 6-DoF view touring has become increasingly popular due to hardware advances and the recent pandemic. One way for content creators to support many 6-DoF clients is to transmit the 3D content to them, which risks content leakage. Another way is to render and stream novel views for each 6-DoF client, which incurs staggering computational and networking workloads. In this paper, we develop a blind streaming system that leverages cloud service providers between content creators and 6-DoF clients. Our system has two core design objectives: (i) to generate high-quality novel views for 6-DoF clients without retrieving 3D content from content creators, and (ii) to support many 6-DoF clients without overloading the content creators. We achieve these two goals in three steps. First, we design a source view request/response interface between cloud service providers and content creators for efficient communications. Second, we design novel view optimization algorithms for cloud service providers to intelligently select the minimal set of source views while considering the workload of content creators. Third, we employ scalable client-side view synthesis for 6-DoF clients with heterogeneous device capabilities and personalized 6-DoF client poses and preferences. Our evaluation results demonstrate the merits of our solution. Compared to the state of the art, our system: (i) improves the quality of synthesized novel views by 2.27 dB in PSNR and 12 in VMAF on average, and (ii) reduces bandwidth consumption by 94% on average. In fact, our solution approaches the performance of an unrealistic optimal solution with unlimited source views, with performance gaps as small as 0.75 dB in PSNR and 3.8 in VMAF.
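
To give a feel for the PSNR figures quoted in the abstract, the following is a minimal sketch of the standard PSNR definition, together with the mean-squared-error (MSE) reduction that a given dB gain implies. It is not the paper's evaluation code. The function names, the flattened-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Assumption: images are passed as flattened lists of 8-bit samples.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a PSNR gain of gain_db."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical 4-pixel example, purely illustrative.
    print(psnr([100, 120, 140, 160], [101, 119, 142, 158]))
    # MSE reduction implied by the 2.27 dB average gain quoted above.
    print(mse_ratio_for_gain(2.27))
```

Because PSNR is logarithmic in the MSE, a 2.27 dB gain corresponds to roughly a 1.69x lower mean squared error, and a 0.75 dB gap to roughly a 1.19x difference, regardless of the absolute PSNR level.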

Cited By

  • (2024) Optimally Planning Drone Trajectories to Capture 3D Gaussian Splatting Objects. MultiMedia Modeling, 171-185. DOI: 10.1007/978-981-96-2064-7_13. Online publication date: 28-Dec-2024

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computer graphics
  2. content privacy
  3. discrete optimization
  4. system design
  5. view synthesis

Qualifiers

  • Research-article

Funding Sources

  • National Science and Technology Council of Taiwan

Conference

MM '23
MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 59
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 25 Feb 2025
