research-article

Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving

Authors:

Qingxiong Yang,

Xuejin Chen

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 971 - 979

https://doi.org/10.1145/3343031.3351076

Published: 15 October 2019

Abstract

Semantic understanding of 3D scenes is essential for autonomous driving. Although considerable effort has been devoted to semantic segmentation of dense point clouds, the extreme sparsity of 3D LiDAR data poses significant challenges in autonomous driving. In this paper, we address semantic segmentation of extremely sparse LiDAR point clouds, using the ground as a reference. In particular, we propose a ground-aware framework that resolves the ambiguity caused by data sparsity. We employ a multi-section plane fitting approach to roughly extract ground points, which assist the segmentation of objects on the ground. Based on the roughly extracted ground points, our approach implicitly integrates the ground information in a weakly supervised manner and exploits ground-aware features through a new ground-aware attention module. This module captures long-range dependencies between the ground and objects, which significantly facilitates the segmentation of small objects that consist of only a few points in extremely sparse point clouds. Extensive experiments on two large-scale LiDAR point cloud datasets for autonomous driving demonstrate that the proposed method achieves state-of-the-art performance both quantitatively and qualitatively.
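The abstract does not spell out the multi-section plane fitting step, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that points are split into concentric range sections around the sensor, that one plane is fitted per section with a basic RANSAC loop, and that the z axis points up. The function names (`fit_plane_ransac`, `extract_ground`), the section boundaries, and all thresholds are hypothetical choices made for illustration.

```python
import numpy as np


def fit_plane_ransac(points, n_iters=100, dist_thresh=0.2, min_up=0.9, rng=None):
    """Fit a near-horizontal plane n.p + d = 0 to an (N, 3) array with basic RANSAC.

    Returns (unit_normal, d), or None if no valid plane was found.
    All parameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    if len(points) < 3:
        return None
    best_count, best_model = -1, None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        normal = normal / norm
        if abs(normal[2]) < min_up:
            continue  # assumption: ground is roughly horizontal, z is up
        d = -normal @ sample[0]
        count = int(np.sum(np.abs(points @ normal + d) < dist_thresh))
        if count > best_count:
            best_count, best_model = count, (normal, d)
    return best_model


def extract_ground(points, section_edges, dist_thresh=0.2, seed=0):
    """Return a rough boolean ground mask for an (N, 3) point cloud.

    Points are grouped by horizontal distance from the sensor into the
    sections [edge_i, edge_{i+1}); one plane is fitted per section and
    points close to that plane are marked as ground.
    """
    rng = np.random.default_rng(seed)
    ranges = np.linalg.norm(points[:, :2], axis=1)
    ground = np.zeros(len(points), dtype=bool)
    for lo, hi in zip(section_edges[:-1], section_edges[1:]):
        idx = np.flatnonzero((ranges >= lo) & (ranges < hi))
        model = fit_plane_ransac(points[idx], dist_thresh=dist_thresh, rng=rng)
        if model is None:
            continue  # too few points, or no near-horizontal plane found
        normal, d = model
        ground[idx] = np.abs(points[idx] @ normal + d) < dist_thresh
    return ground
```

Calling `extract_ground(points, section_edges=[0.0, 10.0, 20.0, 50.0])` on an (N, 3) array returns one boolean per point. Fitting a separate plane per section lets the estimate follow gentle slope changes that a single global plane would miss, which is presumably the motivation for a multi-section approach.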



Index Terms

Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks


Information & Contributors

Information

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs:
Laurent Amsaleg (CNRS-IRISA, France)
Benoit Huet (EURECOM, France)
Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs:
Guillaume Gravier (CNRS-IRISA, France)
Hayley Hung (Delft University of Technology, Netherlands)
Chong-Wah Ngo (City University of Hong Kong, Hong Kong)
Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

the National Key Research & Development Plan of China
the National Natural Science Foundation of China
the Fundamental Research Funds for the Central Universities

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 18
Total Downloads: 500
Downloads (Last 12 months): 42
Downloads (Last 6 weeks): 3

Reflects downloads up to 20 Feb 2025

Citations

Cited By

Fang Y, Rao X, Gao X, Li W, Min Z, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). MTSNet: Joint Feature Adaptation and Enhancement for Text-Guided Multi-view Martian Terrain Segmentation. Proceedings of the 32nd ACM International Conference on Multimedia, 5092-5101. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681430
Mu X, Gao W, Yuan H, Wang S, Li G. (2024). Adaptive LPU Decision for Dynamic Point Cloud Compression. IEEE Signal Processing Letters 31, 2370-2374. Online publication date: 2024.
https://doi.org/10.1109/LSP.2024.3449226
Lytkin S, Badenko V, Fedotov A, Vinogradov K, Chervak A, Milanov Y, Zotov D. (2023). Saint Petersburg 3D: Creating a Large-Scale Hybrid Mobile LiDAR Point Cloud Dataset for Geospatial Applications. Remote Sensing 15:11, 2735. Online publication date: 24-May-2023.
https://doi.org/10.3390/rs15112735
Cai J, Huang W, You Y, Chen Z, Ren B, Liu H. (2023). SPSD: Semantics and Deep Reinforcement Learning Based Motion Planning for Supermarket Robot. IEICE Transactions on Information and Systems E106.D:5, 765-772. Online publication date: 1-May-2023.
https://doi.org/10.1587/transinf.2022DLP0057
Tong G, Shao Y, Peng H. (2023). Learning local contextual features for 3D point clouds semantic segmentation by attentive kernel convolution. The Visual Computer 40:2, 831-847. Online publication date: 19-Mar-2023.
https://doi.org/10.1007/s00371-023-02819-9
Pires M, Couto P, Santos A, Filipe V. (2022). Obstacle Detection for Autonomous Guided Vehicles through Point Cloud Clustering Using Depth Data. Machines 10:5, 332. Online publication date: 2-May-2022.
https://doi.org/10.3390/machines10050332
Wang H, Liu Y, Dong Z, Wang W, Magalhães J, del Bimbo A, Satoh S, Sebe N, Alameda-Pineda X, Jin Q, Oria V, Toni L. (2022). You Only Hypothesize Once. Proceedings of the 30th ACM International Conference on Multimedia, 1630-1641. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3503161.3548023
Liao L, Chen W, Xiao J, Wang Z, Lin C, Satoh S. (2022). Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion. IEEE Transactions on Image Processing 31, 3525-3540. Online publication date: 2022.
https://doi.org/10.1109/TIP.2022.3172208
Jiang T, Wang Y, Liu S, Cong Y, Dai L, Sun J. (2022). Local and Global Structure for Urban ALS Point Cloud Semantic Segmentation With Ground-Aware Attention. IEEE Transactions on Geoscience and Remote Sensing 60, 1-15. Online publication date: 2022.
https://doi.org/10.1109/TGRS.2022.3158362
Sakshi, Kukreja V. (2022). Image Segmentation Techniques: Statistical, Comprehensive, Semi-Automated Analysis and an Application Perspective Analysis of Mathematical Expressions. Archives of Computational Methods in Engineering 30:1, 457-495. Online publication date: 22-Aug-2022.
https://doi.org/10.1007/s11831-022-09805-9
