DOI: 10.1145/3704323.3704330 · Research Article · Open Access

Enhancing the Encoding Process in Point Cloud Completion

Published: 07 January 2025

Abstract

Point cloud completion aims to reconstruct a complete 3D shape from sparse or incomplete point clouds. Existing methods focus more on decoder design, generating more reasonable details from a latent code embedded by a relatively simple encoder with an aggregation operation. However, such simple encoding might ignore important points in a local region. In this work, we propose hierarchical self-distillation (HSD) as an alternative encoding plan to the conventional multi-scale grouping (MSG), in order to generate a more representative code for completion tasks. Different from commonly applied multi-scale grouping methods, our HSD approach learns the dissociation of high-level features among different encoding scales, which is particularly suitable for encoding the information of incomplete point clouds. Experiments show that, by incorporating a self-aware loop within the network’s knowledge path, our approach outperforms state-of-the-art learning-based methods by at least 1%.

1 Introduction

Figure 1:
Figure 1: Comparison between a vanilla encoder and our proposed HSD-based encoder. Traditionally, a set abstraction (SA) layer compresses the input point cloud into a latent code that represents the geometric shape. In our encoding module, multiple SAs from PointNet++ are utilized to formulate their corresponding shape codes, and the last SA serves as a guide to the preceding ones, with k1 < k2 < k3 to ensure that the last SA always has the largest perceptual field. The knowledge is thereafter propagated back in a self-supervised manner during the forward training process and is therefore dubbed as self-distillation. MLPs are employed to map the features to the same dimension (D4).
Studies on point clouds have emerged rapidly in recent years and fall primarily into two categories: reconstruction and understanding. PointNet [9] and PointNet++ [10] established the rudimentary foundation for point cloud understanding, while PCN [25] and PU-Net [24] formulated the pipeline for point cloud reconstruction (including completion and upsampling). Their common property is that the point cloud (either partial or complete) is always encoded into a latent code carrying high-level information before the subsequent understanding or reconstruction modules. These pioneering efforts primarily encode with simple aggregation operations, because encoding orderless point clouds requires order invariance, which is generally achieved by a symmetric aggregation function, e.g., max pooling, summation, or a global set abstraction layer. As such, the resulting information loss [2] inevitably prevents these methods from achieving better shape awareness or point recovery.
Generally, point cloud completion serves as a pre-processing step for downstream tasks, including autonomous driving, virtual reality (VR), augmented reality (AR), industrial manufacturing, and robotic gripping, which often depend heavily on the local geometric details of the recovered object. In autonomous driving, it enhances the perception systems of self-driving cars by accurately reconstructing occluded or incomplete 3D data, ensuring safer navigation. In VR and AR, it enables more realistic and immersive experiences by reconstructing detailed virtual environments. In healthcare, it aids in creating complete 3D models from partial scans, facilitating better diagnosis and treatment planning. Additionally, in robotics, it improves the ability of robots to understand and interact with their environment, enabling precise manipulation and obstacle avoidance. These applications demonstrate the significant potential of deep learning-based point cloud completion in advancing technology and improving efficiency and safety across various domains.
In deep learning-based point cloud processing algorithms, including both understanding [10, 16] and reconstruction [11, 20, 24, 25], multi-scale grouping (MSG) and k-nearest neighbors (kNN) grouping are widely adopted techniques for extracting features in Euclidean or topological spaces. Nevertheless, we argue that naively encoding a point cloud, especially a partial one, can result in the loss of high-level shape details. In real-world scanning, the quality of a point cloud varies remarkably; it can be sparse, discontinuous, or incomplete. This significantly increases the need for encoding a consistent and robust representation from point clouds with missing regions.
In this paper, we aim to explore the importance of the encoding step in the point cloud completion task by proposing a hierarchical structure as the encoder for preserving more geometrical details. It is worth mentioning that our scheme can be extended to point cloud tasks other than completion. Our intuition is that a simple encoding procedure, especially for a partial point cloud, is insufficient for recovering shape details. Specifically, to avoid a single point with a higher weight selected by the symmetric function dominating the representative role of a shape, we leverage multiple encoders that focus on different perceptual scales to form a more stable representation. This scheme, as demonstrated in our experiments, achieves better reconstruction and understanding performance when attached to state-of-the-art (SOTA) methods. More importantly, it can be adopted in more point cloud networks as a plug-and-play module. An overall illustration is given in Figure 1, where a vanilla encoder constructed with a set abstraction (SA) layer is compared with our hierarchical self-distillation (HSD) based encoder. Similar to MSG, the proposed HSD also encodes features based on a hierarchical structure. However, our method differentiates the features across layers instead of concatenating them.
Therefore, we summarize our contributions as follows:
We propose a hierarchical structure concentrating on encoding a more representative latent code for point clouds. This plug-and-play module can be integrated into most point cloud processing networks as an encoder.
Experiments on the PCN dataset indicate that the proposed encoding module readily elevates the baseline SOTA models, further improving the performance of the completion task.

2 Related Work

Figure 2:
Figure 2: The architecture of the proposed network. The encoder consists of cascaded aggregation modules, i.e., SA from PointNet++. The knowledge from the last aggregation module flows back, maximizing the mutual information among them. For simplicity of illustration, the MLPs responsible for mapping the aggregated features to the same length are not shown. The decoder is borrowed from SnowflakeNet directly, which is composed of three cascaded SPD modules.

2.1 Point Cloud Completion

2.1.1 Supervised.

As the first work on point cloud completion, PCN [25] tackles the problem by explicitly learning the mapping between the sparse input and the dense target in a coarse-to-fine manner. After that, a tremendous number of works followed this strategy. SnowflakeNet [20] implements a point-wise splitting module to learn the offsets between the upsampled points and the ground truth points, and a global attention module is leveraged to refine the points in each individual generation step. Similarly, PMP-Net [17] and PMP-Net++ [18] gradually move points along the shortest paths to the target, learning the geometric relationships between the source and target point clouds. Another branch in this category is based on implicitly generating the points. FoldingNet [23] and AtlasNet [5] propose folding 2D grids into various surfaces representing particular parts of a shape. TopNet [13] proposes a tree decoder to generate points representing various geometric structures. However, these works have focused heavily on decoder design, so the importance of the encoder has been seriously under-explored. For instance, SnowflakeNet applies its encoder in the same way as PointNet++ [10]. We argue that for partial or incomplete input, the encoder should be considered more carefully.

2.1.2 Unsupervised or Self-supervised.

Recently, the community has tended to complete sparse point clouds without ground-truth coordinates for supervision. ACL-SPC [6] proposed the first self-supervised scheme for point cloud completion. It employs a learnable generator to form a complete target, and a sampler is subsequently utilized to generate a set of synthetic partial point clouds as the input for further closed loops. Without an explicit viewpoint prior, VAPCNet [4] treats viewpoint representations as samples in contrastive learning, where a rotated viewpoint of the current scan is regarded as the positive sample and other novel viewpoints as negative ones. By dividing the partial point cloud into different partitions, P2C [3] first produces an output to match an unseen partition. Further constraints are applied, such as prediction re-encoding, to force the latent code to be consistent with the partial code. This finding inspires us to believe that the encoding of a partial point cloud can be insufficient for representing the latent code of its complete counterpart.

2.2 Point Cloud Encoding

Initially, PointNet [9] trivially encodes features with a global aggregation, which lacks local awareness of each point. Hierarchy-based feature aggregation was first proposed by PointNet++ [10] and DGCNN [16], i.e., multi-scale grouping (MSG) and kNN grouping, respectively, aiming at extracting more representative local features for classification and segmentation tasks. Such techniques have thereafter been commonly utilized in reconstruction tasks, e.g., PU-Net [24] for upsampling and SnowflakeNet [20] for completion. On another track, Point Transformer [27] has led a trend of applying self-attention to point clouds, such as PU-Transformer [11] for upsampling and PointAttN [7] for completion. It first encodes local features by the subtractive relations between a query point and its neighbors, and then utilizes a transformer to facilitate the point-wise correspondence between these localized features.
In addition to these foundational techniques widely adopted across various tasks, point cloud understanding methods generally place greater emphasis on encoding diverse geometric representations. CurveNet calculates hypothetical curves based on point-wise features [21]. Similarly, representations such as triangular and umbrella surfaces with orientations are introduced by RepSurf [12]. PointMLP provides evidence that simple residual MLPs can also be effective for encoding point clouds [8]. Kernel-based methods are also well studied. KPConv adopts extra kernel points to apply convolutions to local geometry [14]. PointConv treats convolution kernels as nonlinear functions that approximate the weight and density of the local coordinates [19].
In general, the task of point cloud completion necessitates a more meaningful embedding, given that encoding a partial point cloud can potentially result in less-determinant or even incorrect representations. Motivated by this, our objective is to design a comprehensive point cloud encoder tailored specifically for completion tasks.

3 Method

The encoding step in the forward procedure is commonly overlooked in traditional point cloud completion methods, which tend to encode with fundamental architectures such as PointNet++ [10] and DGCNN [16]. Thereafter, backpropagation gives rise to holistic optimization via the Chamfer distance (CD) loss, constraining the geometric features to a reasonable domain. In this work, we argue that such simple encoding can be further improved and is important for the point cloud completion task. Therefore, inspired by PointHSD [28, 29] in joint learning for simultaneous point cloud completion and understanding, we introduce the hierarchical self-distillation-based encoder during the forward step to enhance the self-recognition capacity in a self-supervised manner. The difference lies in the fact that PointHSD was proposed as a post-encoder to further regularize the optimization for the task of sparse point cloud completion and understanding, whereas our method applies HSD as a comprehensive encoder, an alternative to PointNet++-like encoders, to understand local details. The overall architecture is illustrated in Figure 2. Unlike PointHSD, which encodes information for both completion and understanding, our method applies HSD solely for the completion task, demonstrating its effectiveness across various point cloud processing methods. Compared with MSG, which concatenates features in an additive manner, our proposed HSD formulates features in a subtractive way.

3.1 Hierarchical Self-Distillation-based Encoder

Table 1: Performance on the PCN dataset in terms of L2-CD (× 1000).

Model | Average | Plane | Cabinet | Car | Chair | Lamp | Couch | Table | Boat
FoldingNet [23] | 14.31 | 9.49 | 15.80 | 12.61 | 15.55 | 16.41 | 15.97 | 13.65 | 14.99
TopNet [13] | 12.15 | 7.61 | 13.31 | 10.90 | 13.82 | 14.44 | 14.78 | 11.22 | 11.12
AtlasNet [5] | 10.85 | 6.37 | 11.94 | 10.10 | 12.06 | 12.37 | 12.99 | 10.33 | 10.61
PCN [25] | 9.64 | 5.50 | 22.70 | 10.63 | 8.70 | 11.00 | 11.34 | 11.68 | 8.59
GRNet [22] | 8.83 | 6.45 | 10.37 | 9.45 | 9.41 | 7.96 | 10.51 | 8.44 | 8.04
CDN [15] | 8.51 | 4.79 | 9.97 | 8.31 | 9.49 | 8.94 | 10.69 | 7.81 | 8.05
PMP-Net [17] | 8.73 | 5.65 | 11.24 | 9.64 | 9.51 | 6.95 | 10.83 | 8.72 | 7.25
NSFA [26] | 8.06 | 4.76 | 10.18 | 8.63 | 8.53 | 7.03 | 10.53 | 7.35 | 7.48
SnowflakeNet [20] | 7.21 | 4.29 | 9.16 | 8.08 | 7.89 | 6.07 | 9.23 | 6.55 | 6.40
w. MSG | 7.142 | 4.212 | 9.243 | 8.207 | 7.636 | 6.131 | 9.192 | 6.514 | 6.401
w. HSD | 7.139 | 4.189 | 9.242 | 8.144 | 7.701 | 6.102 | 9.193 | 6.530 | 6.396
Conventionally, various aggregation operations are leveraged to encode a global or local shape, e.g., pooling and set abstraction. Hence, these methods inevitably ignore particular low-level geometric features. This phenomenon can be further amplified, especially when the input is incomplete. To alleviate the feature degradation caused by aggregation, we borrow the idea of PointHSD [28, 29], which learns the distinction of different local features by maximizing the mutual information I(Zl;ZL) among them, where Zl and ZL denote the intermediate and deepest latent representations.
Such a subtractive operation is the opposite of the additive MSG encoding strategy used in PointNet++ [10]. However, both of them aim to formulate the representation that encodes the most local details. Specifically, we set the number of neighbors in each layer of the hierarchical encoder to be monotonically increasing, as illustrated in Figure 2, i.e., 8, 16, and 24, respectively. The physical intuition behind this is straightforward: for an incomplete point cloud with largely missing regions, a larger perceptual field corresponds to a richer representation of local neighbor information. Therefore, I(Zl;ZL) endows smoothness to the probabilistic distribution using information from the teacher (the last layer), functioning as a feature regularization of the students (the intermediate layers) to alleviate the overfitting problem of the entire model.
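For concreteness, the sketch below illustrates the kind of hierarchical encoder described above. It is a minimal PyTorch approximation, not the authors' implementation: the full PointNet++ set abstraction (farthest point sampling plus ball query, cascaded across layers) is simplified to parallel kNN-grouping branches over the raw points, and the MLP widths and feature dimension are illustrative assumptions.

```python
# Minimal sketch of a hierarchical encoder with growing neighborhoods (k1 < k2 < k3).
# NOT the paper's code: PointNet++ set abstraction is replaced by simplified kNN grouping.
import torch
import torch.nn as nn

def knn_group(xyz, k):
    """Group the k nearest neighbors of every point: (B, N, 3) -> (B, N, k, 3)."""
    dist = torch.cdist(xyz, xyz)                       # (B, N, N) pairwise distances
    idx = dist.topk(k, largest=False).indices          # (B, N, k) neighbor indices
    b = torch.arange(xyz.size(0), device=xyz.device).view(-1, 1, 1)
    return xyz[b, idx]                                 # (B, N, k, 3)

class LocalAggregation(nn.Module):
    """One simplified aggregation layer: kNN grouping + shared MLP + max pooling."""
    def __init__(self, k, out_dim):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, xyz):
        grouped = knn_group(xyz, self.k)               # (B, N, k, 3)
        grouped = grouped - xyz.unsqueeze(2)           # relative local coordinates
        feat = self.mlp(grouped).max(dim=2).values     # per-point feature (B, N, out_dim)
        return feat.max(dim=1).values                  # global latent code (B, out_dim)

class HSDEncoder(nn.Module):
    """Three aggregation branches with k = 8, 16, 24, each mapped to the same
    dimension so their codes can be compared; the last one acts as the teacher."""
    def __init__(self, ks=(8, 16, 24), dim=512):
        super().__init__()
        self.branches = nn.ModuleList([LocalAggregation(k, dim) for k in ks])

    def forward(self, xyz):
        return [branch(xyz) for branch in self.branches]   # one code per scale
```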
As the proposed HSD can be integrated into any hierarchy-based structure, it can be substituted for encoders in various point cloud completion methods. Specifically, we apply PointHSD to SnowflakeNet [20] to substitute for its set abstraction modules.
Although only geometric priors are available, it is still possible to represent an object with its latent code. Therefore, we propose to map the code produced by each aggregation operation at layer l to a probabilistic space via the softmax activation, formulating yl, such that the distribution yL of the deepest layer functions as a pseudo objective: yL serves as the supervision for the former distributions. Letting \(\mathcal{L}_\mathit{KL}\) denote the Kullback-Leibler (KL) divergence, the self-distillation loss is practically implemented as:
\begin{equation}\mathcal{L}_\mathit{dis} = \sum_{l=1}^{L-1} \mathcal{L}_\mathit{KL}(y_l, y_L), \tag{1}\end{equation}
where yl is the prediction and we set the total layer number L = 3. Note that in PointHSD [28], the predicted label distributions yl are also compared with the ground truth distribution ygt, which is unavailable in the completion-only task. Similar to PointHSD, knowledge can therefore flow back via Equation 1 during the forward training step to provide stronger supervision.
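As a hedged sketch (not the authors' code), Eq. 1 can be implemented roughly as follows, reusing the per-scale codes from the encoder sketch above. Treating the teacher distribution as a detached target and using a batch-mean reduction are assumptions on our part.

```python
# Sketch of the self-distillation loss in Eq. 1 over the per-scale latent codes.
import torch
import torch.nn.functional as F

def self_distillation_loss(codes):
    """codes: list of (B, D) latent codes ordered from shallow to deep;
    the deepest code (largest perceptual field) plays the teacher role."""
    teacher = F.softmax(codes[-1], dim=-1).detach()    # y_L, treated as a fixed target (assumption)
    loss = 0.0
    for code in codes[:-1]:                            # students y_1 ... y_{L-1}
        student_log = F.log_softmax(code, dim=-1)
        # one KL term of Eq. 1 between a student distribution and the teacher
        loss = loss + F.kl_div(student_log, teacher, reduction="batchmean")
    return loss
```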

3.2 Reconstruction

To validate the effectiveness of the proposed HSD-based encoder, we keep the decoders of the baseline models unchanged. In detail, three splitting-based deconvolution (SPD) modules are integrated to recover a point cloud from coarse to fine. SPD smoothly rearranges the generated points based upon each point in a coarse version, rather than simply shuffling the high-level representations like PU-Net [24] or grid composition like PCN [25]. In general, the network aims to reconstruct the full point cloud \(Q \in \mathbb {R}^N\) from the partial input \(P^{\prime } \in \mathbb {R}^{N^{\prime }}\), where N and N′ are the number of points in the ground truth and partial input, respectively. Specifically, the reconstruction loss is formulated as:
\begin{equation}\mathcal{L}_\mathit{CD}(P,Q) = \frac{1}{|P|} \sum_{p \in P} \min_{q \in Q} \|p-q\|_2^2 + \frac{1}{|Q|} \sum_{q \in Q} \min_{p \in P} \|q-p\|_2^2, \tag{2}\end{equation}
where \(\|\cdot \|_2\) denotes the L2 norm, i.e., Eq. 2 is the L2 version of CD. Here, we omit the subscripts in Eq. 2 for simplicity.
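A dense reference implementation of Eq. 2 could look like the sketch below; it is meant only to make the formula concrete, whereas practical pipelines typically rely on a dedicated CUDA Chamfer-distance kernel.

```python
# Sketch of the squared-L2 Chamfer distance in Eq. 2 (dense, for clarity only).
import torch

def chamfer_distance_l2(p, q):
    """p: (B, N, 3) predicted cloud, q: (B, M, 3) reference cloud."""
    dist = torch.cdist(p, q).pow(2)                 # (B, N, M) squared pairwise distances
    p_to_q = dist.min(dim=2).values.mean(dim=1)     # mean over p of nearest point in q
    q_to_p = dist.min(dim=1).values.mean(dim=1)     # mean over q of nearest point in p
    return (p_to_q + q_to_p).mean()                 # averaged over the batch
```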

3.3 Optimization

Facilitated by self-distillation (Sec. 3.1) and reconstruction (Sec. 3.2), the overall procedure can be optimized by:
\begin{equation}\mathcal{L} = \lambda \sum_{i=1}^{3} \mathcal{L}_\mathit{CD}(P_i,Q_i) + \mathcal{L}_\mathit{dis}, \quad P_i \subset P, \; Q_i \subset Q, \tag{3}\end{equation}
where λ is set to 1000 empirically; Pi is the completed prediction after the ith SPD in the decoder, matching the size of Qi; and Q3 = Q denotes the complete ground truth.
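Combining the pieces, the overall objective of Eq. 3 might be assembled as below, reusing the sketches from Sections 3.1 and 3.2. The helper names and the way the ground truth is subsampled to match each SPD stage are assumptions for illustration.

```python
# Sketch of the overall objective in Eq. 3 (lambda = 1000 as reported in the paper).
def total_loss(predictions, targets, codes, lam=1000.0):
    """predictions: the three completed clouds, one per SPD step (coarse to fine).
    targets: ground-truth clouds subsampled to the matching sizes, with Q_3 = Q.
    codes: per-scale latent codes produced by the HSD encoder."""
    cd = sum(chamfer_distance_l2(p, q) for p, q in zip(predictions, targets))
    return lam * cd + self_distillation_loss(codes)
```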

4 Experiment

Figure 3:
Figure 3: Qualitative comparison between the original SnowflakeNet and the one with our HSD. For these geometrically more complex shapes, our HSD further encodes the local details and achieves plausible completion results.

4.1 Dataset

Our data are based on a commonly used benchmark, namely PCN [25], a subset derived from ShapeNet [1]. The PCN dataset covers 8 object categories, consisting of 28,974 training and 1,200 testing samples in total. We follow the same split and preparation settings as previous works [17, 18, 20] for fair comparisons. There are 8 different views of each object for training and only 1 view for testing.

4.2 Implementation Detail

Our networks are trained on a server equipped with 10 Nvidia RTX 2080Ti GPUs unless otherwise stated. The batch sizes are 240 for training and 16 for testing. The upsampling ratio of the SnowflakeNet baseline is (1, 4, 8) and remains the same for our variant. Following the common practice of the community, we report the Chamfer distance (CD) as the evaluation metric, i.e., a lower value denotes better performance. Our source code is available at https://github.com/ky-zhou/PointHSD.
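For reference, the reported setup can be summarized as the following configuration sketch; the optimizer, learning rate, and training schedule are not stated above and are therefore left out.

```python
# Illustrative configuration distilled from Sec. 4.2; unspecified hyperparameters are omitted.
config = {
    "train_batch_size": 240,
    "test_batch_size": 16,
    "upsampling_ratios": (1, 4, 8),     # per SPD step, as in the SnowflakeNet baseline
    "lambda_cd": 1000.0,                # weight of the Chamfer term in Eq. 3
    "eval_metric": "chamfer_distance",  # lower is better
}
```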

4.3 Quantitative Results

Table 1 compares previous works with our variants based on SnowflakeNet. We directly quote the results reported in the existing works unless stated otherwise. We then conduct experiments for variants using MSG and HSD as substitutes for the encoder. It can be observed that our variant outperforms the baseline model on 5 out of the 8 individual object categories in terms of reconstruction error.
Furthermore, the MSG-based encoder outperforms the vanilla SnowflakeNet, indicating that both MSG and HSD improve the encoding capacity through the use of more perceptual fields or information dissociation based on scale. In this scenario, HSD behaves analogously to MSG. However, MSG concatenates features to a larger dimension, thereby increasing the GPU memory required for computation. In contrast, our method calculates the feature differences among multiple layers, which minimally increases the computational cost.
Meanwhile, the self-looping knowledge path facilitates the network’s capacity to transfer information from the deepest layer back to earlier ones, thereby functioning as a stronger regularization term (not only for the joint completion-and-classification and classification-only tasks [28, 29], but also for the completion-only task considered here). This mechanism is predicated on the notion that a larger perceptual field over an incomplete object inherently encompasses greater informational content.

4.4 Qualitative Visualization

In Figure 3, we present visualizations comparing the outcomes of the original SnowflakeNet and our proposed method. Specifically, we intentionally select two objects with intricate shapes from each category within the PCN dataset [25] to clearly illustrate the distinctions between the two methods. The results demonstrate the effectiveness of the proposed encoding strategy. For example, the gaps or holes present in airplanes and cabinets are more clearly revealed by the incorporation of HSD.

4.5 Ablation Study

Table 2: Performance on the PCN dataset in terms of L2-CD (× 1000) for one step of PMP-Net. The encoding strategy with HSD slightly outperforms the original setting; however, the improvement is marginal, indicating that HSD might be less suitable for such structures, where the encoding processes are isolated between the compression and recovery stages.

One step of PMP-Net | CD
w/o HSD | 12.620
w. HSD | 12.596
The ablation study is naturally available in Table 1, where the model SnowflakeNet can be regarded as the baseline. It can be clearly observed that the proposed HSD encoding module is able to encode more powerful representations than the baseline, resulting in improved reconstruction capacity in the decoding process. Meanwhile, HSD exhibits competitive performance when compared with MSG, underscoring its comparable efficacy in encoding for completion tasks. This parallel effectiveness positions HSD as a noteworthy alternative, emphasizing its potential in tasks related to point cloud completion.
Although SnowflakeNet and PMP-Net are both completion networks, they have fundamental structural differences. SnowflakeNet behaves as a single autoencoder consisting of an encoder and a decoder, such that the completed points are generated directly in one step. In contrast, PMP-Net is composed of three cascaded autoencoders that rearrange the points continuously over three steps. In such cases, HSD cannot be applied to the three encoders to keep the latent code consistent, as they encode significantly different shapes. Therefore, we evaluate one step of PMP-Net with the proposed HSD encoding strategy. The results on the PCN dataset are shown in Table 2. More discussion is provided in the following section.

4.6 Discussion

Figure 4:
Figure 4: Illustration of PMP-Net, where three PointNet++ based autoencoders are cascaded for gradual completion.
As discussed in Section 4.5, directly applying HSD to PMP-Net is inadequate. The large differences among low-level shapes lead to messy latent codes, which cannot be accurately perceived by the network; see Figure 4 for details. This suggests that the proposed HSD should be applied to hierarchy-based encoders rather than to cascaded encoding-decoding streams like PMP-Net. This observation also constrains the potential application of the proposed method to complicated network structures that encompass more than one encoder, since the dissociated information pattern may become entangled among the different encoding stages.
Moreover, the improvement in our classifier-free network is not as pronounced as that achieved by classifier-based methods  [28, 29], indicating that encoding incomplete point clouds remains a significant challenge in the field.
Finally, similar to MSG, HSD still requires aggregation to encode the hierarchical features into a latent code, which in principle suffers from information loss [2]. As a result, learning the information lost due to aggregation will be a key focus of our future work. The correspondence between points and the global code can be established if differentiable sampling is leveraged, which has the potential to benefit both seed generation and point upsampling. We plan to explore the information beyond the aggregated features in the future, which requires more comprehensive point-wise feature encoding.

5 Conclusion

In this paper, we emphasize the importance of encoding in the task of point cloud completion by substituting the vanilla feature extraction module with HSD. The applied HSD aims to transfer the rich knowledge accumulated in the deepest aggregation back to the former ones. Our experiments show that HSD can improve completion performance by merely boosting the hierarchy-based encoder; therefore, HSD is a promising alternative to MSG. We also demonstrate the enhancement of point cloud completion by simply improving the encoder, highlighting the potential of applying more advanced encoding strategies in this domain. In the future, instead of just learning the dissociation from the latent codes extracted at various scales, we plan to make the points sampled through encoding differentiable not only with respect to the encoded representation but also with respect to the input.

References

[1]
Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015).
[2]
Jiajing Chen, Burak Kakillioglu, Huantao Ren, and Senem Velipasalar. 2022. Why discard if you can recycle?: A recycling max pooling module for 3d point cloud analysis. In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition. 559–567.
[3]
Ruikai Cui, Shi Qiu, Saeed Anwar, Jiawei Liu, Chaoyue Xing, Jing Zhang, and Nick Barnes. 2023. P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 14351–14360.
[4]
Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo, Farid Boussaid, and Mohammed Bennamoun. 2023. VAPCNet: Viewpoint-Aware 3D Point Cloud Completion. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 12108–12118.
[5]
Thibault Groueix, Matthew Fisher, Vladimir G Kim, Bryan C Russell, and Mathieu Aubry. 2018. A papier-mâché approach to learning 3d surface generation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 216–224.
[6]
Sangmin Hong, Mohsen Yavartanoo, Reyhaneh Neshatavar, and Kyoung Mu Lee. 2023. ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 9435–9444.
[7]
Jun Wang, Ying Cui, Dongyan Guo, Junxia Li, Qingshan Liu, and Chunhua Shen. 2024. PointAttN: You Only Need Attention for Point Cloud Completion. In Proceedings of the AAAI Conference on Artificial Intelligence.
[8]
Xu Ma, Can Qin, Haoxuan You, Haoxi Ran, and Yun Fu. 2022. Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework. In International Conference on Learning Representations.
[9]
Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. 2017. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 652–660.
[10]
Charles R Qi, Li Yi, Hao Su, and Leonidas J Guibas. 2017. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Advances in neural information processing systems. 5099–5108.
[11]
Shi Qiu, Saeed Anwar, and Nick Barnes. 2022. Pu-transformer: Point cloud upsampling transformer. In Proceedings of the Asian Conference on Computer Vision. 2475–2493.
[12]
Haoxi Ran, Jun Liu, and Chengjie Wang. 2022. Surface representation for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18942–18952.
[13]
Lyne P Tchapmi, Vineet Kosaraju, Hamid Rezatofighi, Ian Reid, and Silvio Savarese. 2019. Topnet: Structural point cloud decoder. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 383–392.
[14]
Hugues Thomas, Charles R Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, and Leonidas J Guibas. 2019. Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF international conference on computer vision. 6411–6420.
[15]
Xiaogang Wang, Marcelo H Ang Jr, and Gim Hee Lee. 2020. Cascaded refinement network for point cloud completion. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 790–799.
[16]
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E Sarma, Michael M Bronstein, and Justin M Solomon. 2019. Dynamic graph cnn for learning on point clouds. ACM Transactions on Graphics (tog) 38, 5 (2019), 1–12.
[17]
Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, and Yu-Shen Liu. 2021. Pmp-net: Point cloud completion by learning multi-step point moving paths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7443–7452.
[18]
Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, and Yu-Shen Liu. 2022. PMP-Net++: Point cloud completion by transformer-enhanced multi-step point moving paths. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 1 (2022), 852–867.
[19]
Wenxuan Wu, Zhongang Qi, and Li Fuxin. 2019. Pointconv: Deep convolutional networks on 3d point clouds. In Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition. 9621–9630.
[20]
Peng Xiang, Xin Wen, Yu-Shen Liu, Yan-Pei Cao, Pengfei Wan, Wen Zheng, and Zhizhong Han. 2021. Snowflakenet: Point cloud completion by snowflake point deconvolution with skip-transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 5499–5509.
[21]
Tiange Xiang, Chaoyi Zhang, Yang Song, Jianhui Yu, and Weidong Cai. 2021. Walk in the cloud: Learning curves for point clouds shape analysis. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 915–924.
[22]
Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, and Wenxiu Sun. 2020. Grnet: Gridding residual network for dense point cloud completion. In European Conference on Computer Vision. Springer, 365–381.
[23]
Yaoqing Yang, Chen Feng, Yiru Shen, and Dong Tian. 2018. Foldingnet: Point cloud auto-encoder via deep grid deformation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 206–215.
[24]
Lequan Yu, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. 2018. Pu-net: Point cloud upsampling network. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2790–2799.
[25]
Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, and Martial Hebert. 2018. Pcn: Point completion network. In 2018 International Conference on 3D Vision (3DV). IEEE, 728–737.
[26]
Wenxiao Zhang, Qingan Yan, and Chunxia Xiao. 2020. Detail preserved point cloud completion via separated feature aggregation. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16. Springer, 512–528.
[27]
Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip HS Torr, and Vladlen Koltun. 2021. Point transformer. In Proceedings of the IEEE/CVF international conference on computer vision. 16259–16268.
[28]
Kaiyue Zhou, Ming Dong, Peiyuan Zhi, and Shengjin Wang. 2023. Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation. arXiv preprint arXiv:2312.16902 (2023).
[29]
Kaiyue Zhou, Ming Dong, Peiyuan Zhi, and Shengjin Wang. 2024. Cascaded Network with Hierarchical Self-Distillation for Sparse Point Cloud Classification. In 2024 IEEE International Conference on Multimedia and Expo (ICME). IEEE.

Published In

ICCPR '24: Proceedings of the 2024 13th International Conference on Computing and Pattern Recognition, October 2024, 448 pages. ISBN: 9798400717482. DOI: 10.1145/3704323.

Publisher: Association for Computing Machinery, New York, NY, United States.

Author Tags: point cloud completion; hierarchical self-distillation; encoding
