Abstract
In Point Cloud Geometry Compression (PCGC), an accurate context entropy model is necessary to reduce spatial redundancy. Octree-based auto-regressive context entropy models have great potential for exploiting large-scale context dependencies. However, large-scale context entropy models often suffer from over-concentrated attention maps and unstable training. To address these problems, we propose OctPCGC-Net, a novel deep-learning-based network for PCGC. Specifically, we introduce scaled cosine attention into a large-scale context entropy model to alleviate the over-concentrated attention maps caused by the self-attention mechanism, thereby improving the model's prediction accuracy. To improve training stability, we further introduce a residual post-normalization strategy that alleviates the accumulation of activation values as the network deepens, making the activations of different layers smoother and more stable. Experimental results show that, compared with state-of-the-art large-scale auto-regressive entropy models, our method saves 6.3%, 8.7%, and 6.3% of the bitrate in terms of Bjøntegaard Delta Bit Rate (BDBR) on the benchmark datasets SemanticKITTI, 8iVFB, and Owlii, respectively. Our method also achieves higher reconstruction quality (D1 PSNR) and a smaller Chamfer distance (CD) at similar bits per point (BPP) on the SemanticKITTI dataset.
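The two mechanisms named in the abstract can be illustrated with a minimal sketch, assuming the formulation popularized by Swin Transformer V2: attention logits are the cosine similarity of query and key divided by a temperature, and the output of the attention branch is normalized before it is added back to the residual path. The function names, the fixed temperature `tau = 0.1`, the parameter-free layer norm, and the omission of any position bias or auto-regressive masking are illustrative assumptions, not the authors' implementation.

```python
import math


def l2_normalize(v, eps=1e-8):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def softmax(scores):
    # Numerically stable softmax over a list of logits.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_cosine_attention(queries, keys, values, tau=0.1):
    # Logits are cos(q, k) / tau, so they stay within [-1/tau, 1/tau]
    # regardless of the magnitude of the input features.
    qn = [l2_normalize(q) for q in queries]
    kn = [l2_normalize(k) for k in keys]
    dim = len(values[0])
    outputs, attn_maps = [], []
    for q in qn:
        logits = [sum(a * b for a, b in zip(q, k)) / tau for k in kn]
        weights = softmax(logits)
        attn_maps.append(weights)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs, attn_maps


def layer_norm(x, eps=1e-5):
    # Parameter-free layer normalization (no learned gain or bias; assumption).
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def res_post_norm_block(tokens, tau=0.1):
    # Residual post-normalization: x + LayerNorm(Attention(x)).
    # Normalizing the branch output bounds what each layer adds to the
    # residual stream, which limits activation growth with depth.
    attended, _ = scaled_cosine_attention(tokens, tokens, tokens, tau)
    return [
        [xi + ni for xi, ni in zip(x, layer_norm(a))]
        for x, a in zip(tokens, attended)
    ]


if __name__ == "__main__":
    # Hypothetical toy features standing in for octree node embeddings.
    tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
    _, attn = scaled_cosine_attention(tokens, tokens, tokens)
    for row in attn:
        print([round(w, 4) for w in row])
    print(res_post_norm_block(tokens))
```

Because the logits depend only on the angle between query and key, rescaling the input features leaves the attention map unchanged, which is the property that keeps a few high-magnitude pairs from dominating the softmax.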
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, X., Wang, H., Xu, K., Wan, J., Guo, Y. (2024). OctPCGC-Net: Learning Octree-Structured Context Entropy Model for Point Cloud Geometry Compression. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_28
DOI: https://doi.org/10.1007/978-981-99-8432-9_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8431-2
Online ISBN: 978-981-99-8432-9
eBook Packages: Computer Science, Computer Science (R0)