Authors:
Dongfang Liu, Yaqin Wang and Eric T. Matson
Affiliation:
Computer and Information Technology, Purdue University, West Lafayette, IN, U.S.A.
Keyword(s):
Autonomous Driving, Video Object Detection, Pixel Feature Calibration, Instance Feature Calibration.
Abstract:
Object detection is a critical task for autonomous driving. Recent progress in deep learning research on object detection has contributed substantially to the development of autonomous driving. However, directly applying state-of-the-art still-image object detectors to video is problematic: object appearances in videos exhibit more variation, e.g., video defocus, motion blur, and truncation. Such variations occur far less frequently in still images and can compromise detection results. To address these problems, we build a fast and accurate deep learning framework, the motion-assisted feature calibration network (MFCN), for video detection. Our model leverages the temporal coherence of motion patterns in video features. It calibrates and aggregates features of detected objects from previous frames along their spatial changes to improve the feature representations of the current frame. The whole architecture is trained end-to-end, which boosts detection accuracy. Validation on the KITTI and ImageNet datasets shows that MFCN improves the baseline results by 10.3% and 9.9% mAP, respectively. Compared with other state-of-the-art models, MFCN achieves leading performance on the KITTI benchmark. These results indicate the effectiveness of the proposed model, which could benefit autonomous driving systems.
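The core idea of calibrating features from a previous frame along spatial changes before aggregating them with the current frame can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-neighbor warping, the exponentiated cosine-similarity weights, and the function names `warp` and `aggregate` are all illustrative assumptions, loosely following the flow-guided aggregation style common in video detection.

```python
import numpy as np

def warp(feat, flow):
    """Calibrate a previous-frame feature map (C, H, W) toward the current frame
    by resampling it along a per-pixel flow field (2, H, W).
    Nearest-neighbor sampling is used here purely for simplicity."""
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[0]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + flow[1]).astype(int), 0, W - 1)
    return feat[:, src_y, src_x]

def aggregate(cur, prev_warped):
    """Fuse current and calibrated previous features with per-pixel
    softmax weights derived from cosine similarity, so pixels where the
    warped feature agrees with the current one contribute more."""
    def cos(a, b):
        num = (a * b).sum(axis=0)
        den = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0) + 1e-8
        return num / den
    w_prev = np.exp(cos(cur, prev_warped))
    w_cur = np.exp(np.ones_like(w_prev))  # self-similarity of cur with itself is 1
    return (w_cur * cur + w_prev * prev_warped) / (w_cur + w_prev)

# Toy usage: with zero flow and identical frames, aggregation is the identity.
cur = np.random.rand(2, 4, 4)
flow = np.zeros((2, 4, 4))
out = aggregate(cur, warp(cur.copy(), flow))
```

In the full model, the warped features of degraded frames (blur, defocus) are down-weighted automatically wherever they disagree with the current frame, which is what makes the aggregation robust rather than a plain average.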