DOI: 10.1145/3343031.3351076
research-article

Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving

Published: 15 October 2019

Abstract

Semantic understanding of 3D scenes is essential for autonomous driving. Although considerable effort has been devoted to semantic segmentation of dense point clouds, the great sparsity of 3D LiDAR data poses significant challenges in autonomous driving. In this paper, we address semantic segmentation of extremely sparse LiDAR point clouds, using the ground as a reference. In particular, we propose a ground-aware framework that effectively resolves the ambiguity caused by data sparsity. We employ a multi-section plane fitting approach to roughly extract ground points, which assist the segmentation of objects on the ground. Based on these roughly extracted ground points, our approach implicitly integrates ground information in a weakly supervised manner and exploits ground-aware features through a new ground-aware attention module. This module captures long-range dependencies between the ground and objects, which significantly facilitates the segmentation of small objects that consist of only a few points in extremely sparse point clouds. Extensive experiments on two large-scale LiDAR point cloud datasets for autonomous driving demonstrate that the proposed method achieves state-of-the-art performance both quantitatively and qualitatively.
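
The rough ground extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the sections are defined or which plane estimator is used, so everything below is an assumption: the point cloud is split into concentric range sections around the sensor, one plane is fitted per section with RANSAC, and near-vertical planes are rejected. The function names (`fit_plane_ransac`, `extract_ground`), the section edges, and the thresholds are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_plane_ransac(points, n_iters=100, dist_thresh=0.2, min_vertical=0.9, rng=None):
    """Fit a plane normal . p + d = 0 to (N, 3) points with RANSAC.

    Returns (normal, d) for the candidate with the most inliers, or None.
    Planes whose normal is far from the z axis are skipped (assumed heuristic).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    if len(points) < 3:
        return None
    best_count, best_model = 0, None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:  # degenerate (collinear) sample
            continue
        normal = normal / norm
        if abs(normal[2]) < min_vertical:  # reject wall-like planes
            continue
        d = -normal.dot(sample[0])
        count = np.count_nonzero(np.abs(points @ normal + d) < dist_thresh)
        if count > best_count:
            best_count, best_model = count, (normal, d)
    return best_model


def extract_ground(points, section_edges=(0.0, 10.0, 20.0, 40.0), dist_thresh=0.2, seed=0):
    """Return a boolean mask of rough ground points, one fitted plane per section.

    Sections are rings of horizontal distance from the sensor origin
    (an assumed partition; points beyond the last edge stay unlabelled).
    """
    rng = np.random.default_rng(seed)
    ranges = np.linalg.norm(points[:, :2], axis=1)
    mask = np.zeros(len(points), dtype=bool)
    for lo, hi in zip(section_edges[:-1], section_edges[1:]):
        idx = np.flatnonzero((ranges >= lo) & (ranges < hi))
        model = fit_plane_ransac(points[idx], dist_thresh=dist_thresh, rng=rng)
        if model is None:
            continue
        normal, d = model
        mask[idx] = np.abs(points[idx] @ normal + d) < dist_thresh
    return mask
```

Fitting a separate plane per section lets the estimate follow gentle slope changes that a single global plane would miss. The resulting mask is only a rough label, which matches the abstract's description of ground points being used as weak supervision rather than as ground truth.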

Information

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN:9781450368896
DOI:10.1145/3343031
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous driving
  2. point clouds
  3. semantic segmentation
  4. sparse lidar

Qualifiers

  • Research-article

Funding Sources

  • the National Key Research & Development Plan of China
  • the National Natural Science Foundation of China
  • the Fundamental Research Funds for the Central Universities

Conference

MM '19

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
