ABSTRACT
3D point cloud processing is an important technical direction for autonomous driving, computer vision, and 3D mapping. However, the unordered and irregular nature of 3D point clouds makes them challenging to process. In recent years, the Transformer, a key technique in natural language processing, has been successfully applied to 2D image processing with excellent results, and research applying the Transformer to 3D point clouds has recently begun to appear. In this paper, we draw on the self-attention mechanism of the Transformer architecture and propose a U-shaped, Transformer-based network for 3D point cloud segmentation. We conduct semantic segmentation experiments on the Stanford Large-Scale 3D Indoor Spaces Dataset (S3DIS). The experiments show that our proposed network outperforms several existing semantic segmentation algorithms on common evaluation metrics.