research-article

FSVFG: Towards Immersive Full-Scene Volumetric Video Streaming with Adaptive Feature Grid

Authors:

Jiangchuan Liu,

Fang Dong

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 11089 - 11098

https://doi.org/10.1145/3664647.3680908

Published: 28 October 2024

Abstract

Full-scene volumetric videos offer truly immersive viewing experiences and have received increasing attention from both academia and industry. Their vast data volumes, however, make real-time adaptive streaming over today's bandwidth-limited Internet a significant challenge. Inspired by the advantages of neural fields, especially the feature grid method, we propose FSVFG, a novel full-scene volumetric video streaming system that uses feature grids as the representation of volumetric content. FSVFG employs an incremental training approach for feature grids and stores the features and the residuals between adjacent grids as frames. To support adaptive streaming, we examine the data structure and rendering process of feature grids and propose bandwidth adaptation mechanisms. These mechanisms use a coarse ray-marching pass to select the features and residuals to be sent, and achieve variable-bitrate streaming through Level-of-Detail (LoD) and residual filtering. Building on these mechanisms, FSVFG balances the transmission of features and residuals according to the available bandwidth. Our preliminary results demonstrate the effectiveness of FSVFG, showing that it improves visual quality and reduces the bandwidth requirements of full-scene volumetric video streaming.
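To make the idea of residual frames and residual filtering more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a feature grid can be modeled as a sparse mapping from voxel index to feature vector, that a residual is a per-voxel difference between adjacent grids, and that filtering keeps the largest-magnitude residuals that fit a byte budget. All names (`residual`, `filter_residuals`, `apply_residuals`) and the byte costs are hypothetical.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
Grid = Dict[Voxel, List[float]]

# Assumed wire costs, chosen only for illustration.
BYTES_PER_VALUE = 4   # one float32 per feature channel
BYTES_PER_INDEX = 12  # three int32 voxel coordinates


def residual(prev: Grid, curr: Grid) -> Grid:
    """Per-voxel feature difference between two adjacent grids."""
    out: Grid = {}
    for voxel, feat in curr.items():
        base = prev.get(voxel, [0.0] * len(feat))
        out[voxel] = [c - p for c, p in zip(feat, base)]
    return out


def filter_residuals(res: Grid, budget_bytes: int) -> Grid:
    """Keep the largest-magnitude residuals that fit in the byte budget."""
    ranked = sorted(
        res.items(),
        key=lambda kv: max((abs(x) for x in kv[1]), default=0.0),
        reverse=True,
    )
    kept: Grid = {}
    used = 0
    for voxel, delta in ranked:
        cost = BYTES_PER_INDEX + BYTES_PER_VALUE * len(delta)
        if used + cost > budget_bytes:
            break  # budget exhausted; smaller residuals are dropped
        kept[voxel] = delta
        used += cost
    return kept


def apply_residuals(prev: Grid, res: Grid) -> Grid:
    """Client side: rebuild the next grid from the previous one plus residuals."""
    out = {voxel: list(feat) for voxel, feat in prev.items()}
    for voxel, delta in res.items():
        base = out.get(voxel, [0.0] * len(delta))
        out[voxel] = [b + d for b, d in zip(base, delta)]
    return out


if __name__ == "__main__":
    prev = {(0, 0, 0): [1.0, 2.0], (0, 0, 1): [0.5, 0.5]}
    curr = {(0, 0, 0): [1.0, 2.5], (0, 0, 1): [0.5, 0.51], (1, 0, 0): [3.0, 0.0]}
    sent = filter_residuals(residual(prev, curr), budget_bytes=40)
    print(apply_residuals(prev, sent))
```

With a smaller budget, fewer residuals are sent and the reconstructed grid drifts further from the true one; this is the quality-versus-bitrate trade-off that the abstract describes in general terms. The sketch omits Level-of-Detail selection and the coarse ray-marching step entirely.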

Index Terms

1. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming
2. Networks
  1. Network protocols
    1. Application layer protocols

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

British Columbia Salmon Recovery and Innovation Fund
NSERC Discovery Grant
MITACS Accelerate Cluster Grant

Conference

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
