Progressive Coding for Deep Learning based Point Cloud Attribute Compression

Authors:

Michael Rudolph,

Aron Riemenschneider,

Amr Rizk

MMVE '24: Proceedings of the 16th International Workshop on Immersive Mixed and Virtual Environment Systems

Pages 78 - 84

https://doi.org/10.1145/3652212.3652217

Published: 15 April 2024

Abstract

Progressive coding is a valuable technique for networked immersive media. As users approach objects in an immersive environment, progressive coding enables a gradual improvement of content quality. This reduces bandwidth consumption compared to non-progressive methods, which must replace a content representation entirely with a new, independent one.

In this work, we introduce an approach to progressively code point cloud attributes in a learned manner by compressing the quantization residuals of each preceding representation through a learned, lightweight transformation in the entropy bottleneck. Given the quantization residuals, this allows a single model, trained end to end, to progressively reduce quantization errors. In contrast to the state of the art, which conditions the compression on a fixed rate-distortion trade-off and therefore requires an ensemble of models to build an adaptive streaming system, our approach requires only a single model during compression and decompression. We present preliminary results of our method, showing bandwidth savings for the scenario of a user approaching an object and gradually transitioning from low- to high-quality representations.
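The idea of coding the quantization residual of each preceding representation can be illustrated with a minimal sketch. It is not the authors' method: the learned, lightweight transformation in the entropy bottleneck is replaced here by plain uniform scalar quantization with a hand-picked, decreasing list of step sizes, and entropy coding is omitted. The function names `quantize`, `progressive_encode`, and `progressive_decode` and all numeric values are hypothetical and chosen only for illustration.

```python
def quantize(values, step):
    """Uniform scalar quantization: snap each value to the nearest multiple of step."""
    return [round(v / step) * step for v in values]


def progressive_encode(latent, steps):
    """Build one refinement layer per step size (assumed, not learned).

    The first layer quantizes the latent itself. Every later layer quantizes
    the residual between the latent and the sum of all preceding layers.
    """
    layers = []
    reconstruction = [0.0] * len(latent)
    for step in steps:
        residual = [x - r for x, r in zip(latent, reconstruction)]
        layer = quantize(residual, step)
        layers.append(layer)
        reconstruction = [r + q for r, q in zip(reconstruction, layer)]
    return layers


def progressive_decode(layers, num_layers):
    """Reconstruct from the first num_layers layers by summing them."""
    reconstruction = [0.0] * len(layers[0])
    for layer in layers[:num_layers]:
        reconstruction = [r + q for r, q in zip(reconstruction, layer)]
    return reconstruction


if __name__ == "__main__":
    # Hypothetical latent values and step sizes, for illustration only.
    latent = [0.37, -1.62, 2.05, 0.0]
    steps = [1.0, 0.25, 0.0625]
    layers = progressive_encode(latent, steps)
    for k in range(1, len(steps) + 1):
        rec = progressive_decode(layers, k)
        max_err = max(abs(x - r) for x, r in zip(latent, rec))
        print(f"layers used: {k}, max abs error: {max_err:.4f}")
```

In this simplified setting, a receiver that already holds the first layers only needs the next residual layer to improve quality, rather than a new independent representation. The reconstruction error after each layer is bounded by half of that layer's step size.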

Index Terms

Progressive Coding for Deep Learning based Point Cloud Attribute Compression
1. Computing methodologies
  1. Computer graphics
    1. Shape modeling
      1. Point-based models
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming

Published In

MMVE '24: Proceedings of the 16th International Workshop on Immersive Mixed and Virtual Environment Systems

April 2024

101 pages

ISBN:9798400706189

DOI:10.1145/3652212

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

MMSys '24

Sponsor:

SIGMM

MMSys '24: ACM Multimedia Systems Conference 2024

April 15 - 18, 2024

Bari, Italy

Acceptance Rates

Overall Acceptance Rate 26 of 44 submissions, 59%
