A Spatio-temporal Approach for Multiple Object Detection in Videos Using Graphs and Probability Maps

  • Conference paper
Image Analysis and Recognition (ICIAR 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8815)

Abstract

This paper presents a novel framework for object detection in videos that considers both structural and temporal information. Detection begins by applying low-level feature extraction techniques to each frame of the video. Additional robustness is then obtained by exploiting the temporal stability of videos, using particle filters and probability maps that encode the expected location of each object. Lastly, the structural information of the scene is described using graphs, which further improves the results. As a practical application, we evaluate our approach on two databases of table tennis videos: the UCF101 table tennis shots and an in-house one. The observed results indicate that the proposed approach is robust, showing a high hit rate on both databases.
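
The temporal step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random-walk motion model, the Gaussian observation likelihood, and every name and parameter below (`predict`, `weigh`, `probability_map`, `track`, `motion_std`, `obs_std`) are assumptions chosen for illustration. The sketch propagates particles for a single object, weights them against a per-frame detection, and accumulates the weights into a grid that plays the role of a probability map of the object's expected location. The graph-based structural step is not covered.

```python
import math
import random


def predict(particles, motion_std, rng):
    """Diffuse each (x, y) particle with Gaussian noise (assumed random-walk motion)."""
    return [(x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
            for x, y in particles]


def weigh(particles, detection, obs_std):
    """Normalised Gaussian likelihood of each particle given a detected position."""
    dx, dy = detection
    weights = [math.exp(-((x - dx) ** 2 + (y - dy) ** 2) / (2.0 * obs_std ** 2))
               for x, y in particles]
    total = sum(weights)
    if total == 0.0:  # every likelihood underflowed: fall back to uniform weights
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def resample(particles, weights, rng):
    """Draw a new particle set with probability proportional to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def probability_map(particles, weights, width, height):
    """Accumulate particle weights into a height x width grid that sums to 1."""
    grid = [[0.0] * width for _ in range(height)]
    for (x, y), w in zip(particles, weights):
        col, row = int(round(x)), int(round(y))
        if 0 <= col < width and 0 <= row < height:  # drop particles outside the frame
            grid[row][col] += w
    total = sum(map(sum, grid))
    if total > 0.0:
        grid = [[v / total for v in row] for row in grid]
    return grid


def track(detections, width, height, n_particles=2000, seed=0):
    """Run the filter over one detection per frame; return the last map and estimate."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0, width - 1), rng.uniform(0, height - 1))
                 for _ in range(n_particles)]
    grid, estimate = None, None
    for detection in detections:
        particles = predict(particles, motion_std=1.5, rng=rng)
        weights = weigh(particles, detection, obs_std=2.0)
        grid = probability_map(particles, weights, width, height)
        # Weighted mean of the particles is the expected object position.
        estimate = (sum(w * x for (x, _), w in zip(particles, weights)),
                    sum(w * y for (_, y), w in zip(particles, weights)))
        particles = resample(particles, weights, rng)
    return grid, estimate


if __name__ == "__main__":
    grid, estimate = track([(10, 8), (12, 8), (14, 9)], width=40, height=20)
    print("expected position:", estimate)
```

In this sketch the estimate lags slightly behind a moving detection, because the prior carried over from earlier frames pulls it back. That temporal smoothing is the kind of stability the abstract appeals to, although the paper itself may realise it differently.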

Author information

Corresponding author

Correspondence to Henrique Morimitsu.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Morimitsu, H., Cesar, R.M., Bloch, I. (2014). A Spatio-temporal Approach for Multiple Object Detection in Videos Using Graphs and Probability Maps. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol 8815. Springer, Cham. https://doi.org/10.1007/978-3-319-11755-3_47

  • DOI: https://doi.org/10.1007/978-3-319-11755-3_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11754-6

  • Online ISBN: 978-3-319-11755-3

  • eBook Packages: Computer Science, Computer Science (R0)
