Joint Point Clouds Semantic and Instance Segmentation by Local Aggregation and Clustering

Authors:

Zongmin Li

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing

Article No.: 25, Pages 1 - 9

https://doi.org/10.1145/3604078.3604103

Published: 26 October 2023

Abstract

Existing methods fuse semantic and instance information directly, so the two kinds of features interfere with each other and semantic and instance edges cannot be identified accurately. To address these issues, we propose LAC-SIS, a novel local aggregation and clustering network for joint semantic and instance segmentation of point clouds. First, point cloud features are extracted by edge convolution, and geometric descriptors are generated in both Euclidean space and feature space to aggregate local features. The two geometric descriptors are then fused to capture local information from both spaces. Next, a semantic and instance feature fusion module selectively fuses semantic and instance information to produce the final semantic segmentation result and the instance embedding. Finally, the points are clustered in Euclidean space and feature space to obtain instance labels for all points. Segmentation results on the public datasets S3DIS and ScanNet V2 show that our model effectively improves the ability to distinguish point classes, accurately predicts semantic and instance labels, and identifies object edges clearly. Compared with other methods, our method achieves a significant improvement.
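The pipeline described in the abstract can be illustrated with a small sketch. This is not the LAC-SIS implementation: it contains no learned weights, it omits the semantic and instance fusion module entirely, and every function name and parameter (`knn`, `edge_descriptor`, `fuse`, `cluster`, `r_xyz`, `r_emb`) is a hypothetical choice made for illustration. It only shows the general idea of building neighbourhoods in two spaces, aggregating edge-style features over each neighbourhood, concatenating the two descriptors, and grouping points that are close in both coordinate and embedding space.

```python
import math


def knn(vectors, k):
    """Return, for each row, the indices of its k nearest other rows."""
    result = []
    for i, v in enumerate(vectors):
        dists = sorted(
            (math.dist(v, w), j) for j, w in enumerate(vectors) if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result


def edge_descriptor(features, neighbours):
    """Edge-style aggregation: channel-wise max over [f_i, f_j - f_i].

    Assumption: a real edge convolution applies a learned MLP to each
    edge feature before the max; that step is left out here.
    """
    out = []
    for i, nbrs in enumerate(neighbours):
        fi = features[i]
        edges = [
            list(fi) + [b - a for a, b in zip(fi, features[j])] for j in nbrs
        ]
        out.append([max(column) for column in zip(*edges)])
    return out


def fuse(desc_a, desc_b):
    """Fuse two per-point descriptors by concatenation (an assumption)."""
    return [a + b for a, b in zip(desc_a, desc_b)]


def cluster(xyz, emb, r_xyz, r_emb):
    """Region growing: link points that are near in BOTH spaces.

    r_xyz and r_emb are hypothetical distance thresholds.
    """
    labels = [-1] * len(xyz)
    current = 0
    for seed in range(len(xyz)):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(len(xyz)):
                if (
                    labels[j] == -1
                    and math.dist(xyz[i], xyz[j]) <= r_xyz
                    and math.dist(emb[i], emb[j]) <= r_emb
                ):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

A possible use, under the same assumptions: call `knn` once on the coordinates and once on the per-point features, pass each neighbour list to `edge_descriptor`, combine the two results with `fuse`, and finally run `cluster` on the coordinates and the instance embeddings to obtain one integer label per point.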

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding

Information

Published In

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing

May 2023

711 pages

ISBN: 9798400708237

DOI: 10.1145/3604078

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China
National Key R&D Program
Shandong Provincial Natural Science Foundation

Conference

ICDIP 2023: The 15th International Conference on Digital Image Processing

May 19 - 22, 2023

Nanjing, China
