Abstract:
This paper introduces an end-to-end object detection hardware accelerator that directly processes RAW video signals to generate detection results, enabling a holistic approach to optimization. Unlike existing works that concentrate primarily on the back-end object detector, we exploit the redundancy present across multiple stages of the processing pipeline: the image signal processing (ISP) stage, the temporal correlation between consecutive frames, and the back-end detector. A prototype Deformable Parts Model (DPM)-based accelerator has been validated on the Altera TR5 field-programmable gate array (FPGA) platform. The accelerator processes high-resolution (1920 × 1080) video at 60 frames per second (FPS) with a 12-scale gradient pyramid while consuming only 130.9 KB of block memory. To optimize the search process for motion estimation, we adopt time-division multiplexing (TDM), which reduces both multiplexer usage and memory access. Compared with conventional methods that scan a full 1080p frame, the proposed head-based motion search hardware consumes only 6.82% of the processing cycles and uses merely 6.9 KB of block memory. Evaluation and comparison results demonstrate the effectiveness of the proposed system.
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers (Volume: 71, Issue: 11, November 2024)