DOI: 10.1145/3664647.3681301
research-article

ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision

Published: 28 October 2024

Abstract

Point cloud data is pivotal in applications such as autonomous driving, virtual reality, and robotics. However, its substantial volume poses significant challenges for storage and transmission. At high compression ratios, crucial semantic details are usually severely damaged, which makes it difficult to guarantee the accuracy of downstream tasks. To tackle this problem, we propose the first Region of Interest (ROI)-guided Point Cloud Geometry Compression (RPCGC) method for human and machine vision. Our framework employs a dual-branch parallel structure in which the base layer encodes and decodes a simplified version of the point cloud and the enhancement layer refines it by focusing on geometric details. The residual information of the enhancement layer is further refined by an ROI prediction network, which generates mask information that is incorporated into the residuals and serves as a strong supervision signal. We also apply this mask information in the Rate-Distortion (RD) optimization process, weighting each point in the distortion calculation. Our loss function combines an RD loss and a detection loss to better guide point cloud encoding for machine vision. Experimental results demonstrate that RPCGC achieves exceptional compression performance and better detection accuracy (10% gain) than some learning-based compression methods at high bitrates on the ScanNet and SUN RGB-D datasets.
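
The per-point weighting of the distortion term and the combination of RD loss with detection loss can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: the binary ROI mask, the weighted-mean form of the distortion, and the hypothetical parameters roi_weight, lam, and beta. It should not be read as the RPCGC implementation.

```python
import numpy as np


def roi_weighted_distortion(per_point_error, roi_mask, roi_weight=2.0):
    """Weighted mean of per-point errors.

    Assumption: points inside the ROI (mask == True) count roi_weight
    times as much as points outside it. The real weighting may differ.
    """
    per_point_error = np.asarray(per_point_error, dtype=float)
    weights = np.where(np.asarray(roi_mask, dtype=bool), roi_weight, 1.0)
    return float(np.sum(weights * per_point_error) / np.sum(weights))


def total_loss(rate, per_point_error, roi_mask, detection_loss,
               lam=1.0, beta=0.1, roi_weight=2.0):
    """Hypothetical combined objective: rate + lam * distortion + beta * detection.

    lam and beta are illustrative trade-off weights, not values from the paper.
    """
    distortion = roi_weighted_distortion(per_point_error, roi_mask, roi_weight)
    return rate + lam * distortion + beta * detection_loss


if __name__ == "__main__":
    errors = np.array([1.0, 3.0])    # e.g. squared error per reconstructed point
    mask = np.array([True, False])   # first point lies in a region of interest
    print(roi_weighted_distortion(errors, mask, roi_weight=3.0))
    print(total_loss(10.0, errors, mask, detection_loss=5.0,
                     lam=2.0, beta=0.1, roi_weight=3.0))
```

With a uniform mask the distortion reduces to the ordinary mean error, so in this sketch the ROI mask only shifts emphasis toward the marked points rather than changing the scale of the loss.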




    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. human and machine vision
    2. point cloud geometry compression


    Funding Sources

    • Guangdong Province Pearl River Talent Program
    • Natural Science Foundation of China
    • CAAI-MindSpore Open Fund, developed on OpenI Community
    • Shenzhen Science and Technology Program
    • The Major Key Project of PCL
    • Guangdong Basic and Applied Basic Research Foundation

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


