Abstract:
Deep learning-based object detection is a critical technology for autonomous driving, as it enables vehicles to perceive and react to their environment. However, most models depend heavily on annotated data, which can limit their ability to detect rare or unusual objects, known as corner cases. In this paper, we apply a state-of-the-art text-prompted object detection model to this problem. The model uses a text-image multimodal cross-attention mechanism to incorporate semantic information from natural language. As a result, it can detect objects with limited training data and has strong open-set detection and zero-shot generalization capabilities. We generate text prompts based on the prior distribution of objects and perform joint detection with images. Experimental results on the CODA dataset show that the proposed approach significantly outperforms baseline models in mean average precision (mAP) and mean average recall (mAR). Our work presents a new detection approach that can contribute to safer and more intelligent autonomous driving systems.
Date of Conference: 24-28 September 2023
Date Added to IEEE Xplore: 13 February 2024