Abstract:
The representation of objects as 2D bounding boxes in monocular RGB images restricts current computer vision systems to 2D object detection. It fails to provide crucial information such as the orientation of other vehicles, which is vital for autonomous driving. At the same time, real-time performance is essential for an approach to qualify for deployment in a production environment. To tackle this problem, we present an approach that predicts several key points selected from a virtual 3D bounding box around a vehicle instead of a plain 2D bounding box. These key points can be interpreted as a bounding shape. From this novel representation we can calculate the actual 3D bounding box of the corresponding object. Because the bounding shape can be integrated into any current state-of-the-art 2D object detector, both single-shot frameworks such as YOLO or SSD and two-stage detectors such as Faster R-CNN, with minimal computational overhead, the approach runs in real time while providing additional useful information for vehicle detection. We exemplify the extension of SSD to Bounding Shape SSD (BS3D) and evaluate our approach on the challenging KITTI dataset as well as the novel VIPER dataset.
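The abstract does not specify which key points of the virtual 3D bounding box are used as the bounding shape. As a rough illustration only, the following sketch (assuming KITTI-style box parameters, a pinhole camera model, and the hypothetical helper name project_box_corners) shows how the eight corners of a 3D box could be projected into the image to obtain such 2D key points.

import numpy as np

def project_box_corners(center, dims, yaw, K):
    """Project the 8 corners of a 3D bounding box into the image plane.

    center : (x, y, z) box centre in camera coordinates (metres)
    dims   : (h, w, l) box height, width, length (metres)
    yaw    : rotation around the camera's vertical (y) axis (radians)
    K      : 3x3 camera intrinsic matrix
    Returns an (8, 2) array of pixel coordinates, one candidate set
    of "bounding shape" key points.
    """
    h, w, l = dims
    # Corner offsets in the object frame (KITTI convention: origin at the box bottom centre).
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2.0
    y = np.array([ 0,  0,  0,  0, -h, -h, -h, -h], dtype=float)
    z = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    corners = np.stack([x, y, z])                      # (3, 8)

    # Rotate around the y axis and translate to the box centre.
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[ c, 0, s],
                  [ 0, 1, 0],
                  [-s, 0, c]])
    corners = R @ corners + np.asarray(center, dtype=float).reshape(3, 1)

    # Pinhole projection into pixel coordinates.
    uvw = K @ corners
    return (uvw[:2] / uvw[2]).T                        # (8, 2) pixel key points

In the paper's setting the relationship runs the other way: the detector regresses such 2D key points directly, and the 3D box is then recovered from them together with the camera calibration.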
Published in: 2019 IEEE Intelligent Vehicles Symposium (IV)
Date of Conference: 09-12 June 2019
Date Added to IEEE Xplore: 29 August 2019