Movie Keyframe Retrieval Based on Cross-Media Correlation Detection and Context Model

Jin, Yukang; Lu, Tong; Su, Feng

doi:10.1007/978-3-642-31087-4_82

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7345)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

In this paper, we propose a novel cross-media correlation detection method for movie keyframe retrieval. We first compute the temporal saliency of the video and audio streams of a movie separately, then locate the resonance regions, in which the saliency changes of the two modalities are strongly correlated. Next, starting from the resonance regions, we propagate the similarity of visual and auditory characteristics through neighboring movie regions based on a temporal movie context model, segmenting the movie into a sequence of coherent parts from which keyframes are extracted. Experimental results on actual movie clips show that, compared with single-modality algorithms, our method improves retrieval completeness and precision by efficiently exploiting the context and the correlations between complementary modalities.
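
The abstract gives no formulas, so the following is only a minimal sketch of the resonance-detection idea, not the authors' method. It assumes each stream has already been reduced to one scalar saliency value per time step, measures "strong correlation" with a sliding-window Pearson coefficient on the frame-to-frame saliency changes, and merges overlapping windows that pass a threshold. The function names, the window length, and the threshold are all hypothetical choices made for illustration.

```python
from math import sqrt


def diffs(signal):
    """First-order temporal changes of a saliency sequence."""
    return [b - a for a, b in zip(signal, signal[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def resonance_regions(visual, audio, window=4, threshold=0.8):
    """Return merged (start, end) ranges, end exclusive, indexed over the
    change sequences, where the windowed correlation reaches the threshold.

    Assumption: `window` and `threshold` are illustrative values only.
    """
    dv, da = diffs(visual), diffs(audio)
    n = min(len(dv), len(da))
    regions = []
    for start in range(0, n - window + 1):
        r = pearson(dv[start:start + window], da[start:start + window])
        if r >= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Calling `resonance_regions(visual, audio)` on two per-frame saliency lists returns index ranges that could serve as seeds for a later propagation step; anti-correlated or constant signals produce no regions under this definition.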

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210093, China
Yukang Jin, Tong Lu & Feng Su
Jiangyin Institute of Information Technology of Nanjing University, China
Tong Lu

Editor information

Editors and Affiliations

School of Software, Dalian University of Technology, Dalian, China
He Jiang
Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Boulevard, 02125-3393, Boston, MA, USA
Wei Ding
Department of Computer Science, Texas State University San Marcos, 601 University Drive, 78666-4616, San Marcos, TX, USA
Moonis Ali
Department of Computer Science, University of Vermont, Burlington, VT, USA
Xindong Wu

About this paper

Cite this paper

Jin, Y., Lu, T., Su, F. (2012). Movie Keyframe Retrieval Based on Cross-Media Correlation Detection and Context Model. In: Jiang, H., Ding, W., Ali, M., Wu, X. (eds) Advanced Research in Applied Artificial Intelligence. IEA/AIE 2012. Lecture Notes in Computer Science, vol 7345. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31087-4_82

DOI: https://doi.org/10.1007/978-3-642-31087-4_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31086-7
Online ISBN: 978-3-642-31087-4
eBook Packages: Computer Science, Computer Science (R0)
