DOI: 10.1145/3581783.3613907
research-article

Hermes: Leveraging Implicit Inter-Frame Correlation for Bandwidth-Efficient Mobile Volumetric Video Streaming

Published: 27 October 2023

Abstract

Volumetric videos offer viewers more immersive experiences and enable a variety of applications. However, state-of-the-art streaming systems still require hundreds of Mbps, which exceeds the bandwidth commonly available to mobile devices. We identify a research gap in reusing redundant inter-frame information to reduce bandwidth consumption: existing inter-frame compression methods rely on so-called explicit correlation, i.e., redundancy at the same or adjacent locations in the previous frame, which does not hold for highly dynamic frames or dynamic viewports. This work introduces a new concept called implicit correlation, i.e., the consistency of topological structures, which exists stably across dynamic frames and can be exploited to reduce bandwidth consumption. We design Hermes, a mobile volumetric video streaming system consisting of an implicit correlation encoder that reduces bandwidth consumption and a hybrid streaming method that adapts to dynamic viewports. Experiments show that Hermes achieves a frame rate of 30+ FPS over daily networks on commodity smartphones, an improvement of at least 3.37x over two baselines.
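
The distinction between the two kinds of correlation can be illustrated with a toy Python sketch. It is not the Hermes encoder, whose details are not given in this abstract. Every name here (`make_cube`, `same_location_overlap`, `neighbor_histogram`) is hypothetical, and the neighbor-count histogram is only an assumed stand-in for a structure-level statistic, not the paper's definition of implicit correlation. The sketch shows that same-location redundancy between two voxelized frames collapses once an object moves far enough, while a statistic of local structure stays unchanged.

```python
from itertools import product

def make_cube(side, origin=(0, 0, 0)):
    """Return the set of occupied voxel coordinates of a solid cube."""
    ox, oy, oz = origin
    return {(ox + x, oy + y, oz + z)
            for x, y, z in product(range(side), repeat=3)}

def same_location_overlap(prev, curr):
    """Fraction of voxels in `curr` that were also occupied in `prev`.

    A crude proxy for explicit (same-location) inter-frame redundancy.
    """
    if not curr:
        return 0.0
    return len(prev & curr) / len(curr)

def neighbor_histogram(voxels):
    """Histogram of 6-connected neighbor counts over all occupied voxels.

    Assumed stand-in for a structure-level statistic; it depends only on
    local shape, so it is unchanged by translating the object.
    """
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    hist = [0] * 7  # index = number of occupied neighbors (0..6)
    for x, y, z in voxels:
        count = sum((x + dx, y + dy, z + dz) in voxels
                    for dx, dy, dz in offsets)
        hist[count] += 1
    return hist

# Toy frames: the same 8x8x8 object, static, shifted slightly, shifted far.
frame0 = make_cube(8)
slow = make_cube(8, origin=(1, 0, 0))   # moved by 1 voxel
fast = make_cube(8, origin=(8, 0, 0))   # moved by its full width

overlap_slow = same_location_overlap(frame0, slow)
overlap_fast = same_location_overlap(frame0, fast)
structure_same = neighbor_histogram(frame0) == neighbor_histogram(fast)
```

In this toy setup the one-voxel shift keeps most same-location voxels in common, the full-width shift leaves none, and the structural histogram is identical in both cases. That is the intuition behind looking for correlation in structure, not in position.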


Cited By

  • (2024) AraLive: Automatic Reward Adaption for Learning-based Live Video Streaming. Proceedings of the 32nd ACM International Conference on Multimedia, 11099-11108. DOI: 10.1145/3664647.3681499. Online publication date: 28-Oct-2024
  • (2024) FSVFG: Towards Immersive Full-Scene Volumetric Video Streaming with Adaptive Feature Grid. Proceedings of the 32nd ACM International Conference on Multimedia, 11089-11098. DOI: 10.1145/3664647.3680908. Online publication date: 28-Oct-2024


    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. entropy encoding
    2. implicit correlation
    3. volumetric video streaming


    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 222
    • Downloads (Last 6 weeks): 13
    Reflects downloads up to 02 Mar 2025
