A Spatio-temporal Approach for Multiple Object Detection in Videos Using Graphs and Probability Maps

Morimitsu, Henrique; Cesar, Roberto M.; Bloch, Isabelle

doi:10.1007/978-3-319-11755-3_47

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8815)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

This paper presents a novel framework for object detection in videos that considers both structural and temporal information. Detection begins by applying low-level feature extraction techniques to each frame of the video. Additional robustness is then obtained by exploiting the temporal stability of videos, using particle filters and probability maps that encode the expected location of each object. Finally, the structural information of the scene is described using graphs, which further improves the results. As a practical application, we evaluate our approach on two table tennis video databases: the UCF101 table tennis shots and an in-house one. The results indicate that the proposed approach is robust, showing a high hit rate on both databases.
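
The temporal step mentioned in the abstract (particle filters feeding a probability map of expected object locations) can be illustrated with a small sketch. This is not the authors' implementation: the random-walk motion model, the grid cell size, the multiplicative rescoring rule, and all function names (`predict`, `probability_map`, `rescore`) are assumptions made only for illustration.

```python
import random

def predict(particles, noise, width, height, rng):
    """Diffuse particles with Gaussian noise (an assumed random-walk motion
    model) and clamp them to the frame. Each particle is (x, y, weight)."""
    moved = []
    for x, y, w in particles:
        nx = min(max(x + rng.gauss(0.0, noise), 0.0), width - 1)
        ny = min(max(y + rng.gauss(0.0, noise), 0.0), height - 1)
        moved.append((nx, ny, w))
    return moved

def probability_map(particles, width, height, cell):
    """Accumulate particle weights into a coarse grid and normalize the grid
    so that its entries sum to 1."""
    cols = (width + cell - 1) // cell
    rows = (height + cell - 1) // cell
    grid = [[0.0] * cols for _ in range(rows)]
    total = 0.0
    for x, y, w in particles:
        grid[int(y) // cell][int(x) // cell] += w
        total += w
    if total > 0:
        grid = [[v / total for v in row] for row in grid]
    return grid

def rescore(detections, grid, cell):
    """Multiply each per-frame detector score by the map value at the
    detection centre. Detections are (x, y, score) inside the frame."""
    return [(x, y, s * grid[int(y) // cell][int(x) // cell])
            for x, y, s in detections]

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical object last seen near (50, 50) in a 160x120 frame.
    particles = [(50.0, 50.0, 1.0)] * 100
    particles = predict(particles, 2.0, 160, 120, rng)
    grid = probability_map(particles, 160, 120, 20)
    print(rescore([(50, 50, 0.6), (150, 110, 0.9)], grid, 20))
```

In this toy setup, a candidate far from the particle cloud is suppressed even when its raw detector score is higher, which is the kind of temporal consistency the abstract attributes to the probability maps. The graph-based structural step is not covered by this sketch.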

Author information

Authors and Affiliations

University of São Paulo, São Paulo, Brazil
Henrique Morimitsu & Roberto M. Cesar Jr.
Institut Mines Télécom, Télécom ParisTech, CNRS LTCI, Paris, France
Isabelle Bloch

Corresponding author

Correspondence to Henrique Morimitsu.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Porto, Porto, Portugal
Aurélio Campilho
Dept. of Electrical and Computer Eng., University of Waterloo, Waterloo, Ontario, Canada
Mohamed Kamel

About this paper

Cite this paper

Morimitsu, H., Cesar, R.M., Bloch, I. (2014). A Spatio-temporal Approach for Multiple Object Detection in Videos Using Graphs and Probability Maps. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol 8815. Springer, Cham. https://doi.org/10.1007/978-3-319-11755-3_47

DOI: https://doi.org/10.1007/978-3-319-11755-3_47
Published: 10 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11754-6
Online ISBN: 978-3-319-11755-3
eBook Packages: Computer Science, Computer Science (R0)
