A Blind Streaming System for Multi-client Online 6-DoF View Touring

Authors:

Sheng-Ming Tang,

Cheng-Hsin Hsu

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 9124 - 9133

https://doi.org/10.1145/3581783.3612257

Published: 27 October 2023

Abstract

Online 6-DoF view touring has become increasingly popular due to hardware advances and the recent pandemic. One way for content creators to support many 6-DoF clients is to transmit 3D content to them, which exposes the content to leakage. Another way is to render and stream novel views for the 6-DoF clients, which incurs staggering computational and networking workloads. In this paper, we develop a blind streaming system that leverages cloud service providers positioned between content creators and 6-DoF clients. Our system has two core design objectives: (i) to generate high-quality novel views for 6-DoF clients without retrieving 3D content from content creators, and (ii) to support many 6-DoF clients without overloading the content creators. We achieve these two goals in three steps. First, we design a source view request/response interface between cloud service providers and content creators for efficient communication. Second, we design novel view optimization algorithms that let cloud service providers intelligently select the minimal set of source views while considering the workload of content creators. Third, we employ scalable client-side view synthesis for 6-DoF clients with heterogeneous device capabilities and personalized poses and preferences. Our evaluation results demonstrate the merits of our solution. Compared to the state of the art, our system (i) improves the quality of synthesized novel views by 2.27 dB in PSNR and 12 in VMAF on average and (ii) reduces bandwidth consumption by 94% on average. In fact, our solution approaches the performance of an unrealistic optimal solution with unlimited source views, with performance gaps as small as 0.75 dB in PSNR and 3.8 in VMAF.
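The PSNR figures quoted in the abstract are on a logarithmic scale, so it may help to see what a gain in dB means in terms of pixel error. The following is a minimal sketch of the standard PSNR definition, assuming 8-bit pixel values (peak of 255); the function names and the toy pixel values are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
import math

def psnr(reference, synthesized, peak=255.0):
    """Standard PSNR in dB between two equally sized sequences of pixel values.

    Assumes 8-bit samples by default (peak=255); this is the textbook
    definition, not necessarily the exact pipeline used by the authors.
    """
    if len(reference) != len(synthesized) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)

if __name__ == "__main__":
    # Toy 4-pixel "images" (hypothetical values, not data from the paper).
    ground_truth = [52, 55, 61, 59]
    novel_view = [54, 55, 60, 57]
    print(f"PSNR of toy example: {psnr(ground_truth, novel_view):.2f} dB")
    print(f"MSE factor for a 2.27 dB gain: {mse_ratio_for_gain(2.27):.2f}")
    print(f"MSE factor for a 0.75 dB gap: {mse_ratio_for_gain(0.75):.2f}")
```

Under this definition, a 2.27 dB average gain corresponds to a mean squared error that is roughly 1.69 times lower, and a 0.75 dB gap to a factor of roughly 1.19. VMAF, by contrast, is a learned perceptual score and has no comparable closed form.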


Cited By

Wu C, Sun Y, Lee C, Hsu C. (2024). Optimally Planning Drone Trajectories to Capture 3D Gaussian Splatting Objects. MultiMedia Modeling, 171-185. Online publication date: 28-Dec-2024.
https://doi.org/10.1007/978-981-96-2064-7_13


Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN: 9798400701085

DOI: 10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE),
Tao Mei (HiDream.ai, China),
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy),
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia),
Pradeep K. Atrey (University at Albany, State University of New York, USA),
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Funding Sources

National Science and Technology Council of Taiwan

Conference

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa, ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
