Abstract:
The sparsity of planar point clouds makes human detection based on 2-D LiDAR a challenging task. Recent deep-learning-based methods have made significant progress, but their unidimensional feature modeling hinders the flow of information across different axes (i.e., temporal, spatial, and batch). In addition, convolution operators limit the effective receptive field (ERF) of the model. To promote multidimensional interaction, we propose an omni-dimensional aggregation (ODA) module composed of a main aggregation stream and an auxiliary feature stream, which assists temporal-spatial joint encoding and implicitly models the potential relationships between samples. By cascading multiple ODA blocks, the semantic gaps between different features are gradually bridged. Furthermore, to improve network convergence and generalization, an adaptive focus mechanism (AFM) is designed to guide the model to focus its optimization on difficult samples rather than on simple or low-quality ones. Extensive experiments demonstrate the effectiveness of the proposed pipeline, which achieves 76.4% and 81.4% AP on the DROW and JRDB benchmark datasets, respectively, surpassing the existing state-of-the-art (SOTA) results. Inference speed is also competitive. We further validate the proposed model in real-world environments. Our code is available at https://github.com/ai-winter/Li2Former-v1.
Published in: IEEE Transactions on Instrumentation and Measurement (Volume: 73)