Online Bayesian Video Summarization and Linking

  • Conference paper
Image and Video Retrieval (CIVR 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2383)

Abstract

This paper presents an online Bayesian formulation for detecting and describing the most significant key-frames and shot boundaries of a video sequence. Visual information is encoded with a reduced number of degrees of freedom in order to provide robustness to noise, gradual transitions, flashes, camera motion and illumination changes. We present an online algorithm in which images are classified according to their appearance content (pixel values plus shape information) in order to obtain a structured representation from sequential information. This structured representation is presented on a grid whose nodes correspond to the locations of the representative images of the clusters. Since the estimation process simultaneously takes into account the clustering and the nodes’ locations in the representation space, key-frames are placed according to visual similarities among neighbors. This not only provides a powerful tool for video navigation but also offers an organization for subsequent higher-level analysis, such as identifying news items, interviews, etc.
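
To make the idea of online clustering onto a grid more concrete, the following is a minimal sketch only. It is not the Bayesian estimation procedure of the paper, whose details are not given in this abstract. It swaps in a simple self-organizing-map style update: each incoming frame, assumed to be already reduced to a short feature vector, is assigned to its nearest grid node, and that node and its grid neighbours are pulled towards the frame. A change of winning node is treated as a candidate shot boundary. All names (`make_grid`, `online_update`, `summarize`) and parameters (`lr`, `sigma`, the 2x2 grid) are hypothetical choices made for illustration.

```python
import math
import random


def make_grid(rows, cols, dim, seed=0):
    """Create one prototype vector per grid node, with values in [0, 1]."""
    rng = random.Random(seed)
    return {(r, c): [rng.uniform(0.0, 1.0) for _ in range(dim)]
            for r in range(rows) for c in range(cols)}


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_node(grid, x):
    """Grid node whose prototype is closest to frame features x."""
    return min(grid, key=lambda node: sq_dist(grid[node], x))


def online_update(grid, x, lr=0.5, sigma=1.0):
    """Assign x to its nearest node, then pull that node and its grid
    neighbours towards x (Gaussian neighbourhood on grid coordinates)."""
    winner = best_node(grid, x)
    for node, proto in grid.items():
        grid_d2 = (node[0] - winner[0]) ** 2 + (node[1] - winner[1]) ** 2
        h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
        grid[node] = [p + lr * h * (xi - p) for p, xi in zip(proto, x)]
    return winner


def summarize(frames, rows=2, cols=2, lr=0.5, sigma=1.0):
    """Process frames one at a time; return per-frame node labels,
    candidate shot boundaries, and one key-frame index per used node."""
    grid = make_grid(rows, cols, len(frames[0]))
    labels = [online_update(grid, x, lr, sigma) for x in frames]
    # A candidate boundary is any frame whose winning node differs
    # from that of the previous frame.
    boundaries = [t for t in range(1, len(labels))
                  if labels[t] != labels[t - 1]]
    # Key-frame of a node: its member frame closest to the final prototype.
    key_frames = {}
    for node in set(labels):
        members = [t for t, lab in enumerate(labels) if lab == node]
        key_frames[node] = min(
            members, key=lambda t: sq_dist(frames[t], grid[node]))
    return labels, boundaries, key_frames


# Toy input: five dark frames followed by five bright frames.
frames = [[0.0, 0.0, 0.0]] * 5 + [[1.0, 1.0, 1.0]] * 5
labels, boundaries, key_frames = summarize(frames)
print(boundaries, sorted(key_frames.values()))
```

Because neighbouring nodes are updated together, visually similar clusters tend to end up on adjacent grid positions, which is the property the abstract relies on for navigation. A real system would replace the toy three-component vectors with the reduced appearance representation described above.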

This work was supported by CICYT grant TEL99-1206-C02-02 and CERTAP Generalitat de Catalunya.


Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Orriols, X., Binefa, X. (2002). Online Bayesian Video Summarization and Linking. In: Lew, M.S., Sebe, N., Eakins, J.P. (eds) Image and Video Retrieval. CIVR 2002. Lecture Notes in Computer Science, vol 2383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45479-9_36

  • DOI: https://doi.org/10.1007/3-540-45479-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43899-1

  • Online ISBN: 978-3-540-45479-3

  • eBook Packages: Springer Book Archive
