Constant-time monocular object detection using scene geometry

Nieto, Marcos; Ortega, Juan Diego; Leškovský, Peter; Senderos, Orti

doi:10.1007/s10044-017-0625-8

Theoretical Advances
Published: 31 May 2017

Volume 21, pages 1053–1066, (2018)
Pattern Analysis and Applications

Marcos Nieto (ORCID: orcid.org/0000-0001-9879-0992)¹, Juan Diego Ortega¹, Peter Leškovský¹ & Orti Senderos¹

358 Accesses
1 Citation

Abstract

This paper presents a structured approach for efficiently exploiting the perspective information of a scene to enhance object detection in monocular systems. It defines a finite grid of 3D positions on the dominant ground plane and computes occupancy maps from which object location estimates are extracted. The method works on top of any detection technique, whether pixel-wise (e.g. background subtraction) or region-wise (e.g. detection-by-classification), and can be linked to the proposed scheme with minimal fine-tuning. This flexibility allows the approach to be applied in a wide variety of applications and sectors, such as surveillance (e.g. person detection) or driver assistance systems (e.g. vehicle or pedestrian detection). Extensive results provide evidence of its excellent performance and its ease of use in combination with different image processing techniques.
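
As an illustration of the idea described in the abstract, the following minimal Python sketch scores a finite grid of ground-plane positions against a binary foreground mask. It is not the authors' implementation. The function names, the convention that the ground plane is Z = 0 with Z pointing up, the upright box model of an object, and the use of an integral image to make each cell's score a constant-time lookup are all assumptions made for this example.

```python
def project(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 camera matrix P (list of rows)."""
    xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * c for p, c in zip(row, xh)) for row in P)
    return u / w, v / w


def integral_image(mask):
    """Summed-area table of a 2D mask, padded with a leading row and column of zeros."""
    h, w = len(mask), len(mask[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += mask[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, r0, c0, r1, c1):
    """Sum of mask rows r0..r1-1 and columns c0..c1-1 using four table lookups."""
    return ii[r1][c1] - ii[r0][c1] - ii[r1][c0] + ii[r0][c0]


def occupancy_map(mask, P, grid, width, height):
    """Score each ground position (x, y) in `grid` by the foreground fraction
    inside the image box of an upright object of the given width and height.

    Assumption: the ground plane is Z = 0 and Z points up in world coordinates.
    """
    h, w = len(mask), len(mask[0])
    ii = integral_image(mask)
    scores = {}
    for (x, y) in grid:
        # Project the two foot corners and the head point of the object model.
        u0, v_foot = project(P, (x - width / 2.0, y, 0.0))
        u1, _ = project(P, (x + width / 2.0, y, 0.0))
        _, v_head = project(P, (x, y, height))
        c0, c1 = sorted((int(round(u0)), int(round(u1))))
        r0, r1 = sorted((int(round(v_head)), int(round(v_foot))))
        # Clip the box to the image bounds.
        c0, c1 = max(c0, 0), min(c1, w)
        r0, r1 = max(r0, 0), min(r1, h)
        area = (r1 - r0) * (c1 - c0)
        scores[(x, y)] = box_sum(ii, r0, c0, r1, c1) / area if area > 0 else 0.0
    return scores
```

Because the grid is fixed, the projected boxes could be precomputed once per camera, so the per-frame cost in this sketch depends on the number of grid cells rather than on the number of objects in the scene. Location estimates would then be taken from the high-scoring cells, for example by thresholding or non-maximum suppression, which is left out here.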

Acknowledgements

This work has been partially supported by the EU projects SAVASA (Grant Agreement 285621) and P-REACT (Grant Agreement 607881) under the 7th Framework Programme, and by the Basque Government under projects IAB of the ETORGAI framework and EFITRANS of the ETORTEK framework.

Author information

Authors and Affiliations

Vicomtech-IK4, Mikeletegi 57, P. Tecnológico, San Sebastián, Spain
Marcos Nieto, Juan Diego Ortega, Peter Leškovský & Orti Senderos

Corresponding author

Correspondence to Marcos Nieto.

About this article

Cite this article

Nieto, M., Ortega, J.D., Leškovský, P. et al. Constant-time monocular object detection using scene geometry. Pattern Anal Applic 21, 1053–1066 (2018). https://doi.org/10.1007/s10044-017-0625-8

Received: 26 May 2016
Accepted: 24 May 2017
Published: 31 May 2017
Issue Date: November 2018
DOI: https://doi.org/10.1007/s10044-017-0625-8
