research-article

Domain Generalization-Aware Uncertainty Introspective Learning for 3D Point Clouds Segmentation

Authors:

Ronghua Shang

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 651 - 660

https://doi.org/10.1145/3664647.3681574

Published: 28 October 2024

Abstract

Domain generalization for 3D segmentation aims to learn models that handle point clouds from unknown distributions. Feature augmentation has proven effective for domain generalization. However, every point in a 3D segmentation scene carries uncertainty in the target domain, which degrades model generalization. This paper proposes Domain Generalization-Aware Uncertainty Introspective Learning (DGUIL), which comprises Potential Uncertainty Modeling (PUM) and Momentum Introspective Learning (MIL), to handle point uncertainty under domain shift. Specifically, PUM explores the underlying uncertainty of point cloud features and generates a different distribution for each point. It augments the point features over an adaptive range, which provides varied information for simulating the distribution of the target domain. MIL is then designed to learn a generalized feature representation under these uncertain distributions. It uses an uncertainty correlation representation to measure the prediction divergence of knowledge accumulation, and learns to judge and understand that divergence through an uncertainty introspection loss. Finally, extensive experiments verify the advantages of the proposed method over current state-of-the-art methods. The code will be available at https://github.com/ChicalH/DGUIL.
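To make the idea of per-point feature augmentation more concrete, the following is a minimal sketch of generic Gaussian feature perturbation. It is not the paper's PUM module, whose formulation the abstract does not give. The function name `perturb_point_features`, the `scale` parameter, and the use of per-channel standard deviation as the perturbation range are all assumptions made for illustration.

```python
import random
import statistics


def perturb_point_features(features, scale=0.5, seed=0):
    """Return a copy of `features` with Gaussian noise added to every point.

    features: list of per-point feature vectors (equal-length lists of floats).
    scale:    hypothetical knob that widens or narrows the perturbation range.
    seed:     seed for the private random generator, for reproducibility.
    """
    rng = random.Random(seed)
    num_channels = len(features[0])

    # Assumption: the noise range of a channel follows that channel's
    # standard deviation across points, so channels that vary more are
    # perturbed more. This is a simple stand-in for an "adaptive range".
    channel_std = [
        statistics.pstdev([point[c] for point in features])
        for c in range(num_channels)
    ]

    # Each point draws its own noise sample, so every point ends up with
    # a different perturbed feature vector.
    return [
        [value + rng.gauss(0.0, scale * channel_std[c])
         for c, value in enumerate(point)]
        for point in features
    ]


if __name__ == "__main__":
    toy_features = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
    for original, augmented in zip(toy_features, perturb_point_features(toy_features)):
        print(original, "->", augmented)
```

In this sketch a channel that is constant across points has zero standard deviation and is left unchanged, while the input list itself is never modified. A learned, per-point range, as the abstract suggests for PUM, would replace the fixed per-channel statistic used here.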

Index Terms

Domain Generalization-Aware Uncertainty Introspective Learning for 3D Point Clouds Segmentation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Image and video acquisition
        3D imaging
    2. Planning and scheduling
      1. Planning under uncertainty


Information & Contributors

Information

Published In


MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN: 9798400706868

DOI: 10.1145/3664647

General Chairs:
Jianfei Cai, Monash University, Australia
Mohan Kankanhalli, NUS, Singapore
Balakrishnan Prabhakaran, UT Dallas, USA
Susanne Boll, University of Oldenburg, Germany

Program Chairs:
Ramanathan Subramanian, University of Canberra & IIT Ropar, Australia
Liang Zheng, Australian National University, Australia
Vivek K. Singh, Rutgers University, USA
Pablo Cesar, Centrum Wiskunde & Informatica, Netherlands
Lexing Xie, Australian National University, Australia
Dong Xu, University of Hong Kong, Hong Kong

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 0
Total Downloads: 75
Downloads (Last 12 months): 75
Downloads (Last 6 weeks): 21

Reflects downloads up to 21 Jan 2025
