DOI: 10.1145/3581783.3612469
research-article

Graph Spectral Perturbation for 3D Point Cloud Contrastive Learning

Published: 27 October 2023

Abstract

3D point cloud contrastive learning has attracted increasing attention because of its efficient learning ability. By distinguishing positive from negative samples in the feature space, it learns effective point cloud feature representations without manual annotation. However, most point cloud contrastive learning methods construct contrastive samples by perturbing point clouds in data space or by introducing multi-modality/format data, which can make the intensity of the perturbation difficult to control or introduce interference from different modalities/formats. To this end, we propose a novel graph spectral perturbation based contrastive learning framework (GSPCon) for efficient and robust self-supervised 3D point cloud representation learning. It performs perturbations in the graph spectral domain to construct contrastive samples of the point cloud. Specifically, we first represent the point cloud as a k-nearest neighbors (KNN) graph and adaptively transform the coordinates of the points into the graph spectral domain using the graph Fourier transform (GFT). We then implement data augmentation in the graph spectral domain by perturbing the spectral representations. Finally, we generate the contrastive samples by applying the inverse graph Fourier transform (IGFT) to transform the augmented spectral representations back to point clouds. Experimental results show that our method achieves state-of-the-art performance on various downstream tasks. Source code is available at https://github.com/yh-han/GSPCon.git.
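
The three steps described in the abstract (KNN graph, GFT, spectral perturbation followed by IGFT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the linked repository is the reference. Several choices here are assumptions made only for illustration: the graph is an unweighted, symmetrized KNN graph, the Fourier basis comes from the combinatorial Laplacian L = D - W, and the perturbation is multiplicative Gaussian noise on the spectral coefficients. The function names and the parameters `k` and `sigma` are hypothetical.

```python
import numpy as np


def knn_graph_laplacian(points, k=8):
    """Combinatorial Laplacian L = D - W of an unweighted, symmetrized KNN graph.

    Assumption: unweighted edges; the paper may weight edges differently.
    """
    n = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist2, np.inf)            # a point is not its own neighbor
    nbrs = np.argsort(dist2, axis=1)[:, :k]    # indices of the k nearest points
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                     # symmetrize so L is symmetric
    return np.diag(W.sum(axis=1)) - W


def spectral_augment(points, k=8, sigma=0.05, rng=None):
    """Return a perturbed copy of `points` built in the graph spectral domain."""
    rng = np.random.default_rng() if rng is None else rng
    L = knn_graph_laplacian(points, k)
    _, U = np.linalg.eigh(L)                   # columns of U: graph Fourier basis
    coeffs = U.T @ points                      # GFT of the x, y, z coordinate signals
    # Assumption: multiplicative Gaussian noise; sigma sets the perturbation intensity.
    noise = sigma * rng.standard_normal(coeffs.shape)
    perturbed = coeffs * (1.0 + noise)
    return U @ perturbed                       # IGFT back to 3D coordinates


# Two augmented views of the same cloud would form a positive pair.
cloud = np.random.default_rng(0).standard_normal((64, 3))
view_a = spectral_augment(cloud, rng=np.random.default_rng(1))
view_b = spectral_augment(cloud, rng=np.random.default_rng(2))
```

Because `U` is orthogonal, setting `sigma=0` returns the input cloud up to floating-point error, so in this sketch `sigma` is the single knob controlling how far a view departs from the original. The dense eigendecomposition costs O(n^3) and is only practical for small clouds.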


Cited By

  • (2024) Diverse Consensuses Paired with Motion Estimation-Based Multi-Model Fitting. Proceedings of the 32nd ACM International Conference on Multimedia, 9281-9290. DOI: 10.1145/3664647.3681646. Online publication date: 28-Oct-2024
  • (2024) Masked Motion Prediction with Semantic Contrast for Point Cloud Sequence Learning. Computer Vision – ECCV 2024, 414-431. DOI: 10.1007/978-3-031-73116-7_24. Online publication date: 31-Oct-2024


    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN:9798400701085
    DOI:10.1145/3581783

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3D point clouds
    2. contrastive learning
    3. graph spectral perturbation
    4. self-supervised representation learning

    Qualifiers

    • Research-article

    Funding Sources

    • the National Science Fund of China

    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 119
    • Downloads (last 6 weeks): 5
    Reflects downloads up to 18 Feb 2025
