DOI: 10.1145/3581783.3612224
research-article

Exploring Dual Representations in Large-Scale Point Clouds: A Simple Weakly Supervised Semantic Segmentation Framework

Published: 27 October 2023

Abstract

Existing work shows that semantic segmentation of 3D point clouds loses only about 4% in performance even when just 1% of the points are randomly annotated, which inspires us to further explore how to achieve better results at lower cost. Scene point clouds provide position and color information, and the two are often used in tandem as the only input, yet little work has gone into segmentation that fuses information from these dual spaces. To optimize point cloud representations, we propose a novel framework, the dual representation query network (DRQNet). The proposed framework partitions the input point cloud into position and color spaces, using the separately extracted geometric structure and semantic context to create an internal supervisory mechanism that bridges the dual spaces and fuses their information. Adopting the sparsely annotated points as the query set, DRQNet provides guidance and perceptual information for multi-stage point clouds through random sampling. Moreover, to differentiate and enhance the features generated by local neighbourhoods within multiple perceptual fields, we design a representation selection module that identifies the contributions made by the position and color of each query point and weights them adaptively according to their reliability. The proposed DRQNet is robust in point cloud analysis and eliminates the effects of irregularities and disorder. Our method achieves significant performance gains on three mainstream benchmarks.
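
The two ideas named in the abstract, a randomly sampled query set of sparsely annotated points and an adaptive weighting of position and color features per query point, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how reliability is computed, so the reliability scores, the softmax weighting, and the function names `sample_query_indices` and `fuse_query_features` are all assumptions made for illustration.

```python
import math
import random


def sample_query_indices(num_points, ratio=0.01, seed=0):
    """Randomly pick the indices of the sparsely annotated query points.

    The 1% default mirrors the annotation ratio mentioned in the abstract;
    the sampling procedure itself is an assumption.
    """
    rng = random.Random(seed)
    k = max(1, int(num_points * ratio))
    return sorted(rng.sample(range(num_points), k))


def softmax2(a, b):
    """Numerically stable softmax over two scores."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    total = ea + eb
    return ea / total, eb / total


def fuse_query_features(pos_feat, col_feat, pos_score, col_score):
    """Blend the position and color features of one query point.

    pos_score and col_score are hypothetical reliability scores; a higher
    score gives that representation a larger weight in the fused feature.
    """
    w_pos, w_col = softmax2(pos_score, col_score)
    fused = [w_pos * p + w_col * c for p, c in zip(pos_feat, col_feat)]
    return fused, (w_pos, w_col)


if __name__ == "__main__":
    queries = sample_query_indices(num_points=1000, ratio=0.01)
    fused, weights = fuse_query_features([1.0, 0.0], [0.0, 1.0], 2.0, 0.5)
    print(len(queries), weights, fused)
```

In this sketch the two weights always sum to one, so a query point whose color is judged unreliable (for example under poor lighting) leans on its geometric feature instead, which is one plausible reading of "weight them adaptively according to reliability".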

              Published In

              MM '23: Proceedings of the 31st ACM International Conference on Multimedia
              October 2023
              9913 pages
              ISBN:9798400701085
              DOI:10.1145/3581783
              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Author Tags

              1. dual representation
              2. point cloud segmentation
              3. semantic query
              4. weakly supervised learning

              Conference

              MM '23
              MM '23: The 31st ACM International Conference on Multimedia
              October 29 - November 3, 2023
              Ottawa ON, Canada

              Acceptance Rates

              Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

              Bibliometrics

              Article Metrics

              • Downloads (Last 12 months): 88
              • Downloads (Last 6 weeks): 5
              Reflects downloads up to 05 Mar 2025

              Cited By

              • (2024) Gait Recognition in Large-scale Free Environment via Single LiDAR. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 380-389. DOI: 10.1145/3664647.3681166. Online publication date: 28-Oct-2024
              • (2024) Towards Practical Human Motion Prediction with LiDAR Point Clouds. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 7629-7638. DOI: 10.1145/3664647.3680720. Online publication date: 28-Oct-2024
              • (2024) BSTS: A Weakly-Supervised Method for Semantic Learning of 3D Point Clouds. IEEE Transactions on Circuits and Systems for Video Technology 34:11, pp. 11386-11399. DOI: 10.1109/TCSVT.2024.3420150. Online publication date: Nov-2024
              • (2024) Joint Semantic Segmentation using representations of LiDAR point clouds and camera images. Information Fusion 108:C. DOI: 10.1016/j.inffus.2024.102370. Online publication date: 17-Jul-2024
              • (2024) Neighborhood Feature Enhancement Flow Diffusion Model for Point Cloud Generation. Pattern Recognition, pp. 339-354. DOI: 10.1007/978-3-031-78389-0_23. Online publication date: 5-Dec-2024
