Accelerating Video Object Detection by Exploiting Prior Object Locations

Ulker, Berk; Stuijk, Sander; Corporaal, Henk; Wijnhoven, Rob

doi:10.1007/978-3-031-06430-2_55


Conference paper
First Online: 17 May 2022


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13232)

Abstract

We provide a set of generic modifications that improve the execution efficiency of single-shot object detectors by exploiting prior object locations in video sequences. We propose a crop-based method to accelerate object detection: it dynamically generates crop regions from prior information and exploits scene sparsity, which enables focused use of computational resources. In contrast to prior work, crop regions are processed at smaller input resolutions to further reduce the computational load, and execution efficiency increases because the detector is not executed multiple times at full resolution. Data augmentations are used to train these lower-resolution networks so that their accuracy stays at the baseline level while inference time is reduced. Experiments on two public datasets, UA-DETRAC [13] and UAVDT [2], using the SSD-ML [19] object detection architecture with \(128\times 128\), \(64\times 64\) and \(32\times 32\) input resolutions show a maximum speedup by a factor of 1.7 on UA-DETRAC and 1.6 on UAVDT at the same level of accuracy as the base method. An extensive set of experiments demonstrates the speed-accuracy trade-off and shows that our method can achieve accuracy comparable to state-of-the-art methods at lower execution time.
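The idea of generating crop regions from prior object locations can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed pixel margin, and the merge-on-overlap strategy are assumptions made for illustration only. The sketch expands each box detected in the previous frame by a margin, clamps it to the frame, and merges overlapping regions, so that a detector would only need to process a few small crops (each resized to a low input resolution such as \(64\times 64\)) instead of the full frame.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def expand_box(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a box by `margin` pixels on each side, clamped to the frame."""
    x1, y1, x2, y2 = box
    return (max(0, x1 - margin), max(0, y1 - margin),
            min(width, x2 + margin), min(height, y2 + margin))


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def crop_regions(prior_boxes: List[Box], margin: int,
                 width: int, height: int) -> List[Box]:
    """Hypothetical helper: turn prior detections into disjoint crop regions."""
    regions: List[Box] = []
    for box in prior_boxes:
        current = expand_box(box, margin, width, height)
        merged = True
        while merged:  # keep merging until `current` touches no region
            merged = False
            for i, other in enumerate(regions):
                if overlaps(current, other):
                    current = union(current, regions.pop(i))
                    merged = True
                    break
        regions.append(current)
    return regions


def covered_fraction(regions: List[Box], width: int, height: int) -> float:
    """Fraction of the frame area covered by the (disjoint) crop regions."""
    area = sum((x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in regions)
    return area / (width * height)


# Assumed example: three prior detections in a 960x540 frame.
prior = [(100, 100, 150, 140), (140, 120, 200, 170), (800, 400, 900, 500)]
regions = crop_regions(prior, margin=16, width=960, height=540)
print(regions)
print(covered_fraction(regions, 960, 540))
```

In a sparse scene the crops cover only a small fraction of the frame, which is the property the abstract refers to as scene sparsity. How the margin is chosen, how new objects entering the scene are handled, and how crops are batched for the detector are design decisions this sketch leaves open.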

This work is funded by the NWO Perspectief program ZERO.


Notes

1. Experiments performed with PyTorch v1.8, CUDA and cuDNN 10.2.

References

1. Abdulla, W.: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow (2017). https://github.com/matterport/Mask_RCNN
2. Du, D., et al.: The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 370–386 (2018)
3. Li, C., Yang, T., Zhu, S., Chen, C., Guan, S.: Density map guided object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 190–191 (2020)
4. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
5. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
6. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
7. Ozge Unel, F., Ozkalayci, B.O., Cigla, C.: The power of tiling for small object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
8. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
9. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271 (2017)
10. Růžička, V., Franchetti, F.: Fast and accurate object detection in high resolution 4K and 8K video using GPUs. In: 2018 IEEE High Performance extreme Computing Conference (HPEC), pp. 1–7. IEEE (2018)
11. Sun, X., Wu, P., Hoi, S.C.: Face detection using deep learning: an improved faster RCNN approach. Neurocomputing 299, 42–50 (2018)
12. Wang, Y., Mao, K., Chen, T., Yin, Y., He, S., Chen, G.: Accelerating real-time object detection in high-resolution video surveillance. Concurr. Comput. Pract. Exp., e6307 (2021)
13. Wen, L., et al.: UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking. Comput. Vis. Image Underst. 193, 102907 (2020)
14. Xu, J., Li, Y., Wang, S.: AdaZoom: adaptive zoom network for multi-scale object detection in large scenes. arXiv preprint arXiv:2106.10409 (2021)
15. Yang, F., Fan, H., Chu, P., Blasch, E., Ling, H.: Clustered object detection in aerial images. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8311–8320 (2019)
16. Yilmaz, A., Javed, O., Shah, M.: Object tracking: a survey. ACM Comput. Surv. (CSUR) 38(4), 13-es (2006)
17. Zhang, J., Huang, J., Chen, X., Zhang, D.: How to fully exploit the abilities of aerial image detectors. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
18. Zhang, X., Izquierdo, E., Chandramouli, K.: Dense and small object detection in UAV vision based on cascade network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019)
19. Zwemer, M., Wijnhoven, R.G., et al.: SSD-ML: hierarchical object classification for traffic surveillance. In: 15th International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2020), pp. 250–259. SCITEPRESS-Science and Technology Publications, LDA (2020)


Author information

Authors and Affiliations

Eindhoven University of Technology, Eindhoven, Netherlands
Berk Ulker, Sander Stuijk & Henk Corporaal
ViNotion BV, Eindhoven, Netherlands
Rob Wijnhoven


Corresponding author

Correspondence to Berk Ulker.

Editor information

Editors and Affiliations

Boston University, Boston, MA, USA
Stan Sclaroff
National Research Council, Lecce, Italy
Cosimo Distante
National Research Council, Lecce, Italy
Marco Leo
University of Catania, Catania, Italy
Giovanni M. Farinella
Technische Universität München, Garching, Germany
Federico Tombari


About this paper

Cite this paper

Ulker, B., Stuijk, S., Corporaal, H., Wijnhoven, R. (2022). Accelerating Video Object Detection by Exploiting Prior Object Locations. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13232. Springer, Cham. https://doi.org/10.1007/978-3-031-06430-2_55


DOI: https://doi.org/10.1007/978-3-031-06430-2_55
Published: 17 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06429-6
Online ISBN: 978-3-031-06430-2
eBook Packages: Computer Science, Computer Science (R0)
