DOI: 10.1145/3638036.3640297
short-paper

Usage of Multiplane Image Information SEI Message for Distribution of Lightweight 3DoF+ Immersive Content with Conventional 2D Decoder

Published: 14 March 2024

Abstract

In this presentation, we describe a new SEI message under consideration for VSEI/H.274: the multiplane image information SEI message (MPII SEI) [1]. The motivation for designing the MPII SEI is to establish open, non-proprietary standards that enable distribution of lightweight 3DoF+ immersive consumer experiences encoded in a single bitstream compatible with already deployed devices (smartphones, laptops, etc.) equipped with a conventional 2D decoder.
Multiplane image (MPI) scene representation is commonly used for three-dimensional (3D) scene reconstruction [2][3][4]. MPI is a state-of-the-art method for novel view synthesis that can model challenging occlusions and reflections. It requires very little computation for 3D scene reconstruction, so it can be implemented on existing devices to enable immersive image and video experiences. An MPI representation stores fronto-parallel planes of a scene at a fixed set of discretely sampled depths (e.g., D layers) relative to the reference coordinate frame. Each plane stores texture (color components such as R, G, B or Y, Cb, Cr) and opacity/transparency.
The MPI representation is high-dimensional and volumetric, and thus cannot be distributed directly over existing video transmission infrastructure designed for 2D video. The multiplane image information SEI message is designed to help fill this gap. The high-dimensional texture and opacity layers are first spatially arranged in a D = K x M grid to form two constituent pictures, one for texture and one for opacity.
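The D = K x M arrangement described above can be sketched as a simple tiling step. This is a minimal illustration, not the normative process: the function names `tile_layers` and `untile_layers` are hypothetical, and the choice of K grid rows, M grid columns, and raster-scan layer order is an assumption, since the abstract does not specify the ordering.

```python
def tile_layers(layers, k, m):
    """Arrange D = k * m equally sized layers into one constituent picture.

    Each layer is a list of rows; each row is a list of samples.
    Assumption: k grid rows, m grid columns, layers placed in raster order.
    """
    if len(layers) != k * m:
        raise ValueError("expected exactly k * m layers")
    height = len(layers[0])
    picture = []
    for grid_row in range(k):
        row_layers = layers[grid_row * m:(grid_row + 1) * m]
        for y in range(height):
            # Concatenate row y of every layer that sits in this grid row.
            picture.append([s for layer in row_layers for s in layer[y]])
    return picture


def untile_layers(picture, k, m):
    """Inverse of tile_layers: recover the k * m layers from the picture."""
    height = len(picture) // k
    width = len(picture[0]) // m
    layers = []
    for grid_row in range(k):
        for grid_col in range(m):
            layers.append([
                picture[grid_row * height + y][grid_col * width:(grid_col + 1) * width]
                for y in range(height)
            ])
    return layers
```

The same tiling would be applied once to the texture layers and once to the opacity layers, which yields the two constituent pictures. A receiver that knows K, M, and the layer size can invert the step with `untile_layers`.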
The two constituent pictures are then packed using one of three options (spatial side-by-side, spatial top-and-bottom, or temporal interleaving) to form a conventional 2D picture sequence suitable for 2D video distribution infrastructure. The MPII SEI defines the minimal set of syntax elements needed to signal the packing arrangement and the layer depth information (either explicitly signaled for all layers, or implicitly calculated from signaled minimum and maximum values in the equal-distance case). A conventional receiver can decode the packed MPI constituent pictures and, using the information carried in the MPII SEI, restore the volumetric MPI representation for novel view synthesis.
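The three packing options and the implicit depth derivation can be illustrated with a short sketch. All function names here are hypothetical and do not correspond to MPII SEI syntax elements. Two choices are assumptions rather than statements from the abstract: texture is placed before opacity (left, top, or first in time), and "equal distance" is read as linear spacing in depth between the signaled minimum and maximum. The actual derivation in the SEI text may differ, for example by spacing in disparity.

```python
def pack_side_by_side(texture, opacity):
    """Place the opacity picture to the right of the texture picture."""
    return [t_row + o_row for t_row, o_row in zip(texture, opacity)]


def pack_top_and_bottom(texture, opacity):
    """Place the opacity picture below the texture picture."""
    return texture + opacity


def pack_temporal(texture_frames, opacity_frames):
    """Interleave frames in time: T0, A0, T1, A1, ..."""
    packed = []
    for t_frame, o_frame in zip(texture_frames, opacity_frames):
        packed.extend([t_frame, o_frame])
    return packed


def equal_distance_depths(d_min, d_max, num_layers):
    """Derive per-layer depths from min and max (assumed linear in depth)."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [d_min]
    step = (d_max - d_min) / (num_layers - 1)
    return [d_min + i * step for i in range(num_layers)]
```

Spatial packing doubles the picture width or height at the original frame rate, while temporal interleaving keeps the picture size and doubles the number of coded pictures per MPI frame.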
We also investigated practical operation points for packed MPI video with the MPII SEI. The recommended operation point is based on joint consideration of the following factors: 1) the resolution of the reference camera; 2) the number of layers used in the MPI representation; 3) the profile/level that the decoder on the receiver device can support. Based on experiments, we confirmed that, for typical camera resolutions from VGA (640x480) to FHD (1920x1080) at 30 fps, 16 MPI layers are compatible with common HEVC/VVC Main 10 decoders of levels 5.1 to 6.2. A reasonable transmission bandwidth for VGA is below 10 Mbps for broadcast quality.
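The interaction between camera resolution, layer count, and decoder level can be checked with some rough arithmetic. The sketch below is illustrative only. The abstract does not state the grid shape or packing used in the experiments, so the 4 x 4 grid and the packing choices are assumptions, as is the idea that temporal interleaving doubles the coded picture rate. The `HEVC_LIMITS` table lists maximum luma picture size and maximum luma sample rate as recalled from the HEVC level tables and should be verified against the specification. The check ignores bit rate, per-dimension, and buffer constraints.

```python
# (max luma picture size in samples, max luma sample rate in samples/s)
# Assumption: values recalled from the HEVC level limits; verify before use.
HEVC_LIMITS = {
    "5.1": (8_912_896, 534_773_760),
    "6": (35_651_584, 1_069_547_520),
    "6.1": (35_651_584, 2_139_095_040),
    "6.2": (35_651_584, 4_278_190_080),
}


def packed_format(width, height, fps, grid_rows, grid_cols, packing):
    """Return (pic_w, pic_h, coded_fps, luma_ps, luma_sr) for a packed MPI."""
    pic_w = width * grid_cols
    pic_h = height * grid_rows
    coded_fps = fps
    if packing == "side_by_side":
        pic_w *= 2
    elif packing == "top_and_bottom":
        pic_h *= 2
    elif packing == "temporal":
        coded_fps *= 2  # texture and opacity pictures alternate in time
    else:
        raise ValueError("unknown packing: " + packing)
    luma_ps = pic_w * pic_h
    return pic_w, pic_h, coded_fps, luma_ps, luma_ps * coded_fps


def fits_level(luma_ps, luma_sr, level):
    """Check picture size and sample rate only against the table above."""
    max_ps, max_sr = HEVC_LIMITS[level]
    return luma_ps <= max_ps and luma_sr <= max_sr


vga = packed_format(640, 480, 30, 4, 4, "temporal")
fhd = packed_format(1920, 1080, 30, 4, 4, "temporal")
```

Under these assumptions, 16 VGA layers with temporal interleaving give a 2560x1920 picture at 60 coded pictures per second, and 16 FHD layers give a 7680x4320 picture at the same rate. If the table values are right, these sit at the low and high ends of the level range quoted above.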
MPI generation is an active research topic in academia. Open-source tools [5][6][7] can be used, and MPI quality continues to improve with advances in AI-based MPI generation technologies. Novel view rendering from an MPI, in contrast, is a low-complexity, standard, and well-known algorithm that has been proven feasible on existing devices. We are also studying systems-level requirements to investigate how to enable MPI-based services in real-life applications.
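MPI rendering is commonly built on back-to-front "over" compositing of the layers, which is one reason it is cheap. The sketch below shows only that compositing step at the reference viewpoint. It assumes non-premultiplied color with opacity in [0, 1] and layers ordered from farthest to nearest. The per-plane warp needed to render a shifted viewpoint is omitted, and the function name `composite_over` is hypothetical.

```python
def composite_over(layers):
    """Composite MPI layers back to front with the 'over' operator.

    layers: list ordered from farthest to nearest. Each layer is a list of
    rows, and each row is a list of (r, g, b, alpha) tuples with alpha in
    [0, 1]. Returns a picture of (r, g, b) tuples.
    """
    height = len(layers[0])
    width = len(layers[0][0])
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                r, g, b, a = layer[y][x]
                pr, pg, pb = out[y][x]
                # The nearer layer covers the accumulated color by alpha.
                out[y][x] = (
                    r * a + pr * (1.0 - a),
                    g * a + pg * (1.0 - a),
                    b * a + pb * (1.0 - a),
                )
    return out


# An opaque red back plane behind a half-transparent blue front plane.
back = [[(1.0, 0.0, 0.0, 1.0)]]
front = [[(0.0, 0.0, 1.0, 0.5)]]
result = composite_over([back, front])
```

The work per output pixel is one multiply-add per color component per layer, so the cost grows linearly with the number of layers D.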

References

[1] Taoran Lu, Peng Yin, Oh Sejin, Sean McCarthy, Walt Husak, and Gary J. Sullivan. 2023. AHG9: On the Proposed Multiplane Image Information SEI Message. JVET meeting document JVET-AF0167, October 2023.
[2] Richard Tucker and Noah Snavely. 2020. Single-view View Synthesis with Multiplane Images. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.48550/arXiv.2205.11733
[3] Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. 2019. Local light field fusion: practical view synthesis with prescriptive sampling guidelines. ACM Trans. Graph. 38, 4, Article 29 (August 2019), 14 pages. https://doi.org/10.1145/3306346.3322980
[4] Yuxuan Han, Ruicheng Wang, and Jiaolong Yang. 2022. Single-View View Synthesis in the Wild with Learned Adaptive Multiplane Images. In ACM SIGGRAPH 2022 Conference Proceedings (Vancouver, BC, Canada) (SIGGRAPH '22). Association for Computing Machinery, New York, NY, USA, Article 14, 8 pages. https://doi.org/10.1145/3528233.3530755
[5] https://github.com/Fyusion/LLFF/
[6] https://github.com/yxuhan/AdaMPI
[7] https://github.com/vincentfung13/MINE

    Published In

    MHV '24: Proceedings of the 3rd Mile-High Video Conference
    February 2024
    150 pages
    ISBN:9798400704932
    DOI:10.1145/3638036
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Multiplane image
    2. SEI

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    MHV '24: Mile-High Video Conference
    February 11 - 14, 2024
    Denver, CO, USA
