Automatic Video Annotation and Retrieval Based on Bayesian Inference

Wang, Fangshi; Xu, De; Lu, Wei; Wu, Weixin

doi:10.1007/978-3-540-69423-6_28

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4351)

Abstract

Retrieving videos by keywords requires semantic knowledge of the videos. However, manual video annotation is costly and time-consuming. Most work reported in the literature annotates a video shot with either a single semantic concept or a fixed number of words. In this paper, we propose a new approach that automatically annotates a video shot with a variable number of semantic concepts and retrieves videos based on text queries. First, a simple but efficient method is presented to automatically extract a Semantic Candidate Set (SCS) for a video shot based on visual features. Then, the final annotation set is obtained from the SCS by Bayesian inference. Finally, a new method is proposed to rank the retrieved key frames according to the probabilities obtained during Bayesian inference. Experiments show that our method is useful for automatically annotating video shots and retrieving videos by keywords.
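The abstract does not give the details of the inference step, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each candidate concept in the SCS comes with a prior and two likelihoods of the observed visual features, applies Bayes' rule for a binary concept, keeps every concept whose posterior reaches a threshold (so the number of annotations varies per shot), and ranks retrieved key frames by that posterior. The function names, the threshold, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def posterior(prior: float, lik_present: float, lik_absent: float) -> float:
    """Bayes' rule for a binary concept: P(c | x) from P(c), P(x | c), P(x | not c)."""
    evidence = lik_present * prior + lik_absent * (1.0 - prior)
    return lik_present * prior / evidence if evidence > 0 else 0.0


def annotate(candidates: Dict[str, Tuple[float, float, float]],
             threshold: float = 0.5) -> Dict[str, float]:
    """Keep every candidate concept whose posterior reaches the threshold.

    The number of kept concepts is not fixed; it varies from shot to shot.
    """
    scored = {c: posterior(*params) for c, params in candidates.items()}
    return {c: p for c, p in scored.items() if p >= threshold}


def retrieve(annotations: Dict[str, Dict[str, float]],
             query: str) -> List[Tuple[str, float]]:
    """Return (shot id, probability) pairs for a one-word query, best first."""
    hits = [(shot, concepts[query])
            for shot, concepts in annotations.items() if query in concepts]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical candidate sets: concept -> (prior, P(x | c), P(x | not c)).
shots = {
    "shot_1": {"sky": (0.3, 0.9, 0.2), "water": (0.2, 0.6, 0.3), "car": (0.1, 0.1, 0.5)},
    "shot_2": {"sky": (0.3, 0.7, 0.4), "car": (0.1, 0.8, 0.05)},
}
annotations = {shot: annotate(cands, threshold=0.3) for shot, cands in shots.items()}
ranked = retrieve(annotations, "sky")
```

With these made-up inputs, the two shots end up with different annotation sets, and a query for "sky" returns shot_1 ahead of shot_2 because its posterior is higher.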

Author information

Authors and Affiliations

School of Computer & Information Technology, Beijing Jiaotong University
Fangshi Wang, De Xu & Weixin Wu
School of Software, Beijing Jiaotong University, Beijing, 100044, China
Fangshi Wang & Wei Lu

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, Block N4, Nanyang Avenue, 639798, Singapore
Tat-Jen Cham & Deepu Rajan
School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Jianfei Cai
IBM T.J. Watson Research Center, Yorktown Heights, P.O. Box 704, 10598, New York, USA
Chitra Dorai
National University of Singapore, 3 Science Dr, 117543, Singapore
Tat-Seng Chua
Center for Multimedia and Network Technology, School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Liang-Tien Chia

About this paper

Cite this paper

Wang, F., Xu, D., Lu, W., Wu, W. (2006). Automatic Video Annotation and Retrieval Based on Bayesian Inference. In: Cham, TJ., Cai, J., Dorai, C., Rajan, D., Chua, TS., Chia, LT. (eds) Advances in Multimedia Modeling. MMM 2007. Lecture Notes in Computer Science, vol 4351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69423-6_28

DOI: https://doi.org/10.1007/978-3-540-69423-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69421-2
Online ISBN: 978-3-540-69423-6
eBook Packages: Computer Science, Computer Science (R0)
