research-article

ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision

Authors:

Ge Li

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 3741 - 3750

https://doi.org/10.1145/3664647.3681301

Published: 28 October 2024

Abstract

Point cloud data is pivotal in applications such as autonomous driving, virtual reality, and robotics. However, its substantial volume poses significant challenges for storage and transmission. Achieving a high compression ratio usually damages crucial semantic details severely, which makes it difficult to guarantee the accuracy of downstream tasks. To tackle this problem, we introduce the first Region of Interest (ROI)-guided Point Cloud Geometry Compression (RPCGC) method for human and machine vision. Our framework employs a dual-branch parallel structure: the base layer encodes and decodes a simplified version of the point cloud, and the enhancement layer refines this reconstruction by focusing on geometry details. The residual information of the enhancement layer is further refined by an ROI prediction network. This network generates a mask that is incorporated into the residuals and serves as a strong supervision signal. We also apply the mask in the Rate-Distortion (RD) optimization process, so that each point is weighted in the distortion calculation. Our loss function combines the RD loss with a detection loss to better guide point cloud encoding for machine vision. Experimental results demonstrate that RPCGC achieves exceptional compression performance and better detection accuracy (10% gain) than some learning-based compression methods at high bitrates on the ScanNet and SUN RGB-D datasets.
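The abstract describes two ideas that are easy to illustrate numerically: weighting each point's distortion by an ROI mask, and combining the RD loss with a detection loss. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the linear weighting rule `1 + (roi_weight - 1) * mask`, and the parameters `roi_weight`, `lam`, and `beta` are all hypothetical and chosen only for illustration; the abstract does not specify the exact formulas.

```python
import numpy as np


def roi_weighted_distortion(per_point_distortion, roi_mask, roi_weight=2.0):
    """Weighted mean of per-point distortion (hypothetical weighting rule).

    per_point_distortion: shape (N,), non-negative error for each point.
    roi_mask: shape (N,), values in [0, 1]; 1 means the point lies in an ROI.
    roi_weight: assumed relative importance of ROI points versus background.
    """
    d = np.asarray(per_point_distortion, dtype=np.float64)
    m = np.asarray(roi_mask, dtype=np.float64)
    if d.shape != m.shape:
        raise ValueError("distortion and mask must have the same shape")
    if np.any((m < 0.0) | (m > 1.0)):
        raise ValueError("mask values must lie in [0, 1]")
    # Background points get weight 1, ROI points get weight roi_weight.
    weights = 1.0 + (roi_weight - 1.0) * m
    return float(np.sum(weights * d) / np.sum(weights))


def total_loss(bits, num_points, distortion, detection_loss, lam=1.0, beta=1.0):
    """Assumed combination: rate + lam * distortion + beta * detection loss.

    The rate is expressed in bits per point; lam and beta are hypothetical
    trade-off weights, not values taken from the paper.
    """
    rate = bits / num_points
    return rate + lam * distortion + beta * detection_loss


# Four points: the last two are inside the ROI and carry all the error.
distortion = roi_weighted_distortion([0.0, 0.0, 1.0, 1.0], [0, 0, 1, 1], roi_weight=3.0)
loss = total_loss(bits=8, num_points=4, distortion=distortion,
                  detection_loss=0.5, lam=2.0, beta=1.0)
print(distortion, loss)
```

With an all-zero mask the weighted distortion reduces to the plain mean, so the mask only shifts emphasis toward ROI points rather than changing the scale of the loss.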

Index Terms

ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision
1. Theory of computation
  1. Design and analysis of algorithms
    1. Data structures design and analysis
      1. Data compression

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs: Jianfei Cai (Monash University, Australia), Mohan Kankanhalli (NUS, Singapore), Balakrishnan Prabhakaran (UT Dallas, USA), Susanne Boll (University of Oldenburg, Germany)

Program Chairs: Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia), Liang Zheng (Australian National University, Australia), Vivek K. Singh (Rutgers University, USA), Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands), Lexing Xie (Australian National University, Australia), Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Guangdong Province Pearl River Talent Program
Natural Science Foundation of China
CAAI-MindSpore Open Fund, developed on OpenI Community
Shenzhen Science and Technology Program
The Major Key Project of PCL
Guangdong Basic and Applied Basic Research Foundation

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
