OctPCGC-Net: Learning Octree-Structured Context Entropy Model for Point Cloud Geometry Compression

Wang, Xinjie; Wang, Hanyun; Xu, Ke; Wan, Jianwei; Guo, Yulan

doi:10.1007/978-981-99-8432-9_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14426)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

In Point Cloud Geometry Compression (PCGC), an accurate context entropy model is necessary to reduce spatial redundancy. Octree-based auto-regressive context entropy models have great potential for exploiting large-scale context dependencies. However, large-scale context entropy models often suffer from over-concentrated attention maps and unstable training. To address these problems, we propose OctPCGC-Net, a novel deep learning framework for PCGC. Specifically, we introduce scaled cosine attention into a large-scale context entropy model to alleviate the over-concentrated attention maps caused by the self-attention mechanism, thereby improving the model's prediction accuracy. To improve training stability, we further introduce a residual post-normalization strategy that alleviates the accumulation of activation scores as the network deepens, making the activation scores of different layers smoother and more stable. Experimental results show that, compared with state-of-the-art large-scale auto-regressive entropy models, our method saves 6.3%, 8.7%, and 6.3% of the bitrate in terms of Bjøntegaard Delta Bit Rate (BDBR) on the benchmark datasets SemanticKITTI, 8iVFB, and Owlii, respectively. Our method also achieves higher reconstruction quality (D1 PSNR) and a smaller Chamfer distance (CD) at similar bits per point (BPP) on the SemanticKITTI dataset.
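
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulations: attention logits computed as cosine similarity divided by a temperature, and a residual block that normalizes the branch output before adding it back to the input. The function names, the fixed temperature `tau`, the omission of learnable projections and layer-norm gain/bias, and the toy input are all hypothetical simplifications, not the authors' implementation.

```python
import math

def l2_normalize(v, eps=1e-12):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

def softmax(row):
    # Numerically stable softmax over one row of logits.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_cosine_attention(q, k, v, tau=0.1):
    # Assumption: logits = cos(q_i, k_j) / tau, with tau clamped from below.
    # In a trained model tau would be learnable; here it is a fixed constant.
    tau = max(tau, 0.01)
    qn = [l2_normalize(x) for x in q]
    kn = [l2_normalize(x) for x in k]
    outputs, weights = [], []
    for qi in qn:
        logits = [sum(a * b for a, b in zip(qi, kj)) / tau for kj in kn]
        w = softmax(logits)
        weights.append(w)
        # Weighted sum of value vectors.
        outputs.append([sum(w[j] * v[j][d] for j in range(len(v)))
                        for d in range(len(v[0]))])
    return outputs, weights

def layer_norm(x, eps=1e-5):
    # Layer normalization without learnable gain/bias (simplification).
    mean = sum(x) / len(x)
    var = sum((a - mean) ** 2 for a in x) / len(x)
    return [(a - mean) / math.sqrt(var + eps) for a in x]

def res_post_norm_block(x, tau=0.1):
    # Residual post-normalization: normalize the branch output, then add
    # the residual, i.e. x + LN(Attn(x)) rather than x + Attn(LN(x)).
    attn_out, _ = scaled_cosine_attention(x, x, x, tau)
    return [[xi + ni for xi, ni in zip(row, layer_norm(branch))]
            for row, branch in zip(x, attn_out)]

# Toy context of three feature vectors (hypothetical values).
context = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [3.0, 1.0, 1.0]]
refined = res_post_norm_block(context)
```

Because the logits depend only on the angle between query and key, their magnitude is bounded by 1/tau regardless of feature scale, which is one way to keep a few large-norm tokens from dominating the attention map. Normalizing the branch before the residual addition likewise bounds how much each layer can add to the running activations.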

Author information

Authors and Affiliations

College of Electronic Science and Technology, National University of Defense Technology, Changsha, China
Xinjie Wang, Ke Xu, Jianwei Wan & Yulan Guo
School of Surveying and Mapping, Information Engineering University, Zhengzhou, China
Hanyun Wang

Corresponding author

Correspondence to Hanyun Wang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

About this paper

Cite this paper

Wang, X., Wang, H., Xu, K., Wan, J., Guo, Y. (2024). OctPCGC-Net: Learning Octree-Structured Context Entropy Model for Point Cloud Geometry Compression. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_28

DOI: https://doi.org/10.1007/978-981-99-8432-9_28
Published: 24 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8431-2
Online ISBN: 978-981-99-8432-9
eBook Packages: Computer Science, Computer Science (R0)
