Bird's-Eye-View Semantic Segmentation With Two-Stream Compact Depth Transformation and Feature Rectification


Abstract:

Bird's-eye-view (BEV) perception has gained popularity because it provides a 3D world representation with scale consistency. Although existing camera-based solutions achieve excellent performance, the BEV positions assigned to features are still not accurate enough. In this article, a BEV semantic segmentation framework with two-stream compact depth transformation and feature rectification is proposed. Feature-map ensembling tends to favor two temporal frames separated by a long interval, whereas frames separated by a shorter interval are more beneficial to depth prediction; to resolve this conflict, a two-stream compact depth transformation is designed. Between the original temporal frames, we introduce an intermediate frame to decouple the joint depth estimation of the original frames. The local representations of the intermediate frame are matched with each original temporal frame separately to obtain stereo depth predictions, where compact cost volumes are built to significantly reduce memory usage while retaining high discriminability along the depth dimension. Further, virtual camera intrinsic parameters are derived to adapt the compact cost volume to various 2D data augmentations and improve generalization. On this basis, BEV feature maps are obtained via feature transformation. Because depth distribution errors affect the BEV feature map, a feature-rectified segmentation network is proposed to dynamically adjust the position offsets of input features via deformable convolution and semantic information-guided feature learning. As a result, a dense and accurate BEV semantic map is obtained. In addition, a self-supervised depth estimation teacher is adopted to provide extra supervision for the depth prediction of our segmentation framework. The effectiveness of the proposed method is verified on public datasets.
Published in: IEEE Transactions on Intelligent Vehicles (Volume: 8, Issue: 11, November 2023)
Page(s): 4546 - 4558
Date of Publication: 15 May 2023
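
The abstract mentions deriving virtual camera intrinsic parameters so that depth estimation stays consistent under 2D data augmentation. The paper's actual derivation is not reproduced on this page. As a rough illustration only, the sketch below uses the standard pinhole relation for a resize followed by a crop: the augmented image can be treated as if it came from a virtual camera whose focal lengths are scaled and whose principal point is scaled and shifted. All function names, parameter names, and numeric values are hypothetical, and half-pixel center conventions are ignored.

```python
# Illustrative sketch, not the paper's method: pinhole intrinsics of the
# "virtual" camera that would directly produce a resized-then-cropped image.
# All names and numbers below are hypothetical.

def project(point, fx, fy, cx, cy):
    """Project a 3D point (camera coordinates, z > 0) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def augment_pixel(uv, scale, crop_left, crop_top):
    """Apply the same resize (by `scale`) and crop offset to a pixel location."""
    u, v = uv
    return (u * scale - crop_left, v * scale - crop_top)


def virtual_intrinsics(fx, fy, cx, cy, scale, crop_left, crop_top):
    """Intrinsics after resizing by `scale`, then cropping at (crop_left, crop_top).

    Resizing scales focal lengths and principal point; cropping only shifts
    the principal point.
    """
    return (fx * scale, fy * scale,
            cx * scale - crop_left, cy * scale - crop_top)


# Hypothetical camera and augmentation parameters.
K = (1200.0, 1200.0, 800.0, 450.0)        # fx, fy, cx, cy
aug = (0.5, 32.0, 16.0)                   # scale, crop_left, crop_top
point = (2.0, -1.0, 10.0)                 # 3D point in camera coordinates

# Path 1: project with the original camera, then augment the pixel.
uv_augmented = augment_pixel(project(point, *K), *aug)

# Path 2: project directly with the virtual camera.
K_virtual = virtual_intrinsics(*K, *aug)
uv_virtual = project(point, *K_virtual)

print(uv_augmented, uv_virtual)
```

Both paths give the same pixel location, which is why geometry-dependent steps such as cost-volume construction can, under this simplified model, operate on augmented images as long as the adjusted intrinsics are used.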
