research-article

Hermes: Leveraging Implicit Inter-Frame Correlation for Bandwidth-Efficient Mobile Volumetric Video Streaming

Authors:

Huanhuan Zhang,

Chenghao Huang,

Huadong Ma

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 9185 - 9193

https://doi.org/10.1145/3581783.3613907

Published: 27 October 2023

Abstract

Volumetric videos offer viewers more immersive experiences, enabling a variety of applications. However, state-of-the-art streaming systems still need hundreds of Mbps, exceeding the common bandwidth capabilities of mobile devices. We identify a research gap in reusing redundant inter-frame information to reduce bandwidth consumption: existing inter-frame compression methods rely on so-called explicit correlation, i.e., redundancy at the same or adjacent locations in the previous frame, which does not hold for highly dynamic frames or dynamic viewports. This work introduces a new concept called implicit correlation, i.e., the consistency of topological structures, which exists stably in dynamic frames and helps reduce bandwidth consumption. We design Hermes, a mobile volumetric video streaming system consisting of an implicit correlation encoder that reduces bandwidth consumption and a hybrid streaming method that adapts to dynamic viewports. Experiments show that Hermes achieves a frame rate of 30+ FPS over daily networks on commodity smartphones, at least a 3.37x improvement over two baselines.
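To make the notion of explicit correlation concrete, the following is a minimal sketch, not the Hermes encoder. It assumes frames are point clouds quantized to a voxel grid and measures how many voxels of the current frame were also occupied at the same location in the previous frame. The function names `voxelize` and `explicit_overlap` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def voxelize(points, voxel_size=1.0):
    """Map 3D points to the set of integer voxel coordinates they occupy."""
    cells = np.floor(np.asarray(points, dtype=float) / voxel_size).astype(int)
    return {tuple(int(v) for v in c) for c in cells}

def explicit_overlap(prev_points, curr_points, voxel_size=1.0):
    """Fraction of current-frame voxels that were occupied at the same
    location in the previous frame (a simple proxy for explicit correlation)."""
    prev_vox = voxelize(prev_points, voxel_size)
    curr_vox = voxelize(curr_points, voxel_size)
    if not curr_vox:
        return 0.0
    return len(prev_vox & curr_vox) / len(curr_vox)

# Toy frame (assumption): a 4x4x4 block of points at voxel centers.
block = np.array(
    [[x, y, z] for x in range(4) for y in range(4) for z in range(4)],
    dtype=float,
) + 0.5

static_overlap = explicit_overlap(block, block)                     # object did not move
small_shift_overlap = explicit_overlap(block, block + [2.0, 0, 0])  # moved by 2 voxels
large_shift_overlap = explicit_overlap(block, block + [10.0, 0, 0]) # moved far away
```

In this toy setup the shape of the object is identical in all three frames, yet the same-location overlap falls as the object moves. This is the situation the abstract describes for highly dynamic frames, where a structural (implicit) notion of similarity would still apply.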

Cited By

Zhang H, Zhuo L, Li H, Zhou A, Wang C, Ma H, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). AraLive: Automatic Reward Adaption for Learning-based Live Video Streaming. Proceedings of the 32nd ACM International Conference on Multimedia, 11099-11108. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681499

Yin D, Shi J, Zhang M, Huang Z, Liu J, Dong F, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). FSVFG: Towards Immersive Full-Scene Volumetric Video Streaming with Adaptive Feature Grid. Proceedings of the 32nd ACM International Conference on Multimedia, 11089-11098. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3680908

Index Terms

Hermes: Leveraging Implicit Inter-Frame Correlation for Bandwidth-Efficient Mobile Volumetric Video Streaming
1. Information systems
  1. Information systems applications
    1. Spatial-temporal systems

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE),
Tao Mei (HiDream.ai, China),
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy),
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia),
Pradeep K. Atrey (University at Albany, State University of New York, USA),
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

China National Postdoctoral Program for Innovative Talents
National Natural Science Foundation of China
Innovation Research Group Project of NSFC
BUPT Excellent Ph.D. Students Foundation

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 2
Total Downloads: 401
Downloads (Last 12 months): 222
Downloads (Last 6 weeks): 13

Reflects downloads up to 02 Mar 2025
