Online Bayesian Video Summarization and Linking

Orriols, Xavier; Binefa, Xavier

doi:10.1007/3-540-45479-9_36

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2383)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

In this paper, an online Bayesian formulation is presented to detect and describe the most significant key-frames and shot boundaries of a video sequence. Visual information is encoded in terms of a reduced number of degrees of freedom in order to provide robustness to noise, gradual transitions, flashes, camera motion and illumination changes. We present an online algorithm in which images are classified according to their appearance content (pixel values plus shape information) in order to obtain a structured representation from sequential information. This structured representation is presented on a grid whose nodes correspond to the location of the representative image for each cluster. Since the estimation process simultaneously takes into account the clustering and the nodes' locations in the representation space, key-frames are placed according to their visual similarity to their neighbors. This not only provides a powerful tool for video navigation but also offers an organization for subsequent higher-level analysis, such as identifying news items, interviews, etc.
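
The abstract does not give the estimation equations, so the following is only a minimal sketch of the general idea: frames arrive one at a time, each is assigned to a node on a grid, and the winning node and its grid neighbors are moved toward the frame so that similar key-frames end up close together. It assumes a self-organizing-map-style update as a stand-in for the paper's Bayesian estimator, and it assumes each frame has already been reduced to a short feature vector. The class name `OnlineGridSummarizer`, its parameters (`lr`, `sigma`) and the "boundary when the winning node changes" rule are all hypothetical and not taken from the paper.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlineGridSummarizer:
    """Hypothetical SOM-style stand-in, NOT the paper's Bayesian estimator."""

    def __init__(self, rows, cols, dim, lr=0.3, sigma=1.0, seed=0):
        rng = random.Random(seed)
        # Grid coordinates of each node and one prototype vector per node.
        self.coords = [(r, c) for r in range(rows) for c in range(cols)]
        self.protos = [[rng.random() for _ in range(dim)] for _ in self.coords]
        self.lr = lr          # step size toward the incoming frame (assumed)
        self.sigma = sigma    # width of the grid neighborhood (assumed)
        self.keyframes = {}   # node index -> (distance, frame index)
        self.prev_winner = None

    def update(self, t, x):
        """Process frame t with feature vector x; return (winner, is_boundary)."""
        winner = min(range(len(self.protos)),
                     key=lambda k: sq_dist(self.protos[k], x))
        d = sq_dist(self.protos[winner], x)
        # Heuristic key-frame: the frame seen closest to the node's prototype.
        if winner not in self.keyframes or d < self.keyframes[winner][0]:
            self.keyframes[winner] = (d, t)
        # Move the winner and its grid neighbors toward the frame.
        wr, wc = self.coords[winner]
        for k, (r, c) in enumerate(self.coords):
            h = math.exp(-((r - wr) ** 2 + (c - wc) ** 2) / (2 * self.sigma ** 2))
            self.protos[k] = [p + self.lr * h * (xi - p)
                              for p, xi in zip(self.protos[k], x)]
        # Assumed rule: a change of winning node marks a candidate shot boundary.
        is_boundary = self.prev_winner is not None and winner != self.prev_winner
        self.prev_winner = winner
        return winner, is_boundary
```

A caller would feed frames in temporal order, for example `summarizer.update(t, features)` inside a loop over the sequence, and afterwards read `summarizer.keyframes` to get one representative frame index per occupied grid node. Because neighbors are updated together, adjacent nodes tend to hold visually similar key-frames, which is the property the abstract highlights for navigation.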

This work was supported by CICYT grant TEL99-1206-C02-02 and CERTAP Generalitat de Catalunya.

Author information

Authors and Affiliations

Centre de Visió per Computador (CVC), Dept. Informàtica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain
Xavier Orriols & Xavier Binefa

Editor information

Editors and Affiliations

LIACS Media Lab, Leiden University, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Michael S. Lew & Nicu Sebe
Institute for Image Data Research, University of Northumbria, Newcastle, NE1 8ST, UK
John P. Eakins

About this paper

Cite this paper

Orriols, X., Binefa, X. (2002). Online Bayesian Video Summarization and Linking. In: Lew, M.S., Sebe, N., Eakins, J.P. (eds) Image and Video Retrieval. CIVR 2002. Lecture Notes in Computer Science, vol 2383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45479-9_36

DOI: https://doi.org/10.1007/3-540-45479-9_36
Published: 04 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43899-1
Online ISBN: 978-3-540-45479-3
eBook Packages: Springer Book Archive
