Study of LiDAR Segmentation and Model's Uncertainty using Transformer for Different Pre-trainings
Authors: Mohammed Hassoubah 1,2; Ibrahim Sobh 2 and Mohamed Elhelw 1

Affiliations: 1 Center for Informatics Science, Nile University, Egypt; 2 Valeo, Egypt

Keyword(s): Epistemic Uncertainty, LiDAR, Self-supervision Training, Semantic Segmentation, Transformer.

Abstract: For the task of semantic segmentation of 2D or 3D inputs, the Transformer architecture suffers from limited localization ability because it lacks low-level details. The Transformer also has to be pre-trained to perform well, and how best to pre-train it remains an open research question. In this work, the Transformer is integrated into the U-Net architecture as in (Chen et al., 2021). The new architecture is trained to perform semantic segmentation of 2D spherical images generated by projecting the 3D LiDAR point cloud. This integration captures local dependencies through the CNN backbone's processing of the input, followed by Transformer processing to capture long-range dependencies. To determine the best pre-training settings, multiple ablations were carried out on the network architecture, the self-training loss function and the self-training procedure. The results show that the integrated architecture with self-training improves mIoU by +1.75% over the U-Net architecture alone, even when the latter is also self-trained. Corrupting the input and self-training the network to reconstruct the original input improves mIoU by up to 2.9% over using a reconstruction plus contrastive training objective. Self-training the model improves mIoU by 0.48% over initialising with an ImageNet pre-trained model, even when the pre-trained model is also self-trained. Random initialisation of the Batch Normalisation layers improves mIoU by 2.66% over using self-trained parameters. Self-supervised training of the segmentation network also reduces the model's epistemic uncertainty. The integrated architecture with self-training outperforms SalsaNext (Cortinhal et al., 2020) (to our knowledge the best projection-based semantic segmentation network) by 5.53% mIoU on the SemanticKITTI (Behley et al., 2019) validation dataset with 2D input dimension 1024×64.
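The spherical projection mentioned in the abstract, which turns a 3D LiDAR point cloud into a 2D range image of dimension 1024×64, can be sketched as below. This is an illustration of the standard range-image projection commonly used with SemanticKITTI, not the paper's own code; the field-of-view values `fov_up_deg` and `fov_down_deg` are assumptions matching a typical 64-beam sensor.

```python
import numpy as np

def spherical_projection(points, H=64, W=1024, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an (H, W) range image.

    The vertical field of view (fov_up_deg, fov_down_deg) is an assumed
    value for a 64-beam sensor; adjust it to the actual LiDAR used.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = abs(fov_up) + abs(fov_down)

    yaw = np.arctan2(y, x)                           # horizontal angle in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))   # vertical angle

    # Normalise angles to [0, 1], then scale to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * W                # column index
    v = (1.0 - (pitch + abs(fov_down)) / fov) * H    # row index

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    # Keep the closest point per pixel: write far points first so that
    # nearer points overwrite them.
    order = np.argsort(depth)[::-1]
    range_image = np.zeros((H, W), dtype=np.float32)
    range_image[v[order], u[order]] = depth[order]
    return range_image
```

Each pixel of the resulting image stores the range (depth) of the closest point falling into it; in practice additional channels (x, y, z, intensity) are usually stacked the same way before feeding the CNN backbone.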

CC BY-NC-ND 4.0


Paper citation in several formats:
Hassoubah, M.; Sobh, I. and Elhelw, M. (2022). Study of LiDAR Segmentation and Model's Uncertainty using Transformer for Different Pre-trainings. In Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP; ISBN 978-989-758-555-5; ISSN 2184-4321, SciTePress, pages 1010-1019. DOI: 10.5220/0010969700003124

@conference{visapp22,
author={Mohammed Hassoubah and Ibrahim Sobh and Mohamed Elhelw},
title={Study of LiDAR Segmentation and Model's Uncertainty using Transformer for Different Pre-trainings},
booktitle={Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP},
year={2022},
pages={1010-1019},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010969700003124},
isbn={978-989-758-555-5},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - Volume 4: VISAPP
TI - Study of LiDAR Segmentation and Model's Uncertainty using Transformer for Different Pre-trainings
SN - 978-989-758-555-5
IS - 2184-4321
AU - Hassoubah, M.
AU - Sobh, I.
AU - Elhelw, M.
PY - 2022
SP - 1010
EP - 1019
DO - 10.5220/0010969700003124
PB - SciTePress