Video Semantic Concept Detection Using Multi-modality Subspace Correlation Propagation

Liu, Yanan; Wu, Fei

doi:10.1007/978-3-540-69423-6_51

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4351)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

The interaction and integration of multiple modalities in video, such as visual, audio and textual data, are the essence of video content analysis. Although each single modality conveys some limited semantics on its own, video semantics are fully manifested only through the interaction and integration of all modalities. A great deal of research has focused on using multi-modality features to better understand video semantics. In this paper, we propose a new approach to detecting semantic concepts in video that applies SimFusion and Locality Preserving Projections (LPP) to the temporally sequenced, associated, co-occurring multimodal media data in video. SimFusion is an effective algorithm for reinforcing or propagating similarity relations between multiple modalities. LPP is a dimensionality reduction method that combines the strengths of linear and nonlinear techniques. Our experiments show that employing these two key techniques improves the performance of video semantic concept detection.
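
The similarity propagation idea behind SimFusion can be illustrated with a small sketch. It assumes the commonly stated update rule, in which a similarity matrix S is repeatedly replaced by L S Lᵀ for a row-normalized relationship matrix L, starting from the identity. The toy `relations` matrix, the function names, the iteration count, and the step that resets the diagonal to 1 are all assumptions made for illustration; the abstract does not specify how the paper builds its relationship matrix or how it combines the result with LPP.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows are left unchanged."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total for v in row] if total else list(row))
    return out


def simfusion(relations, iterations=10):
    """Propagate similarity with S <- L S L^T, starting from the identity."""
    n = len(relations)
    l_urm = row_normalize(relations)
    l_urm_t = transpose(l_urm)
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        s = matmul(matmul(l_urm, s), l_urm_t)
        for i in range(n):
            # Assumption of this sketch: an object stays fully similar to itself.
            s[i][i] = 1.0
    return s


# Hypothetical toy data: objects 0 and 1 are visual shots, 2 and 3 are audio
# segments. Both shots co-occur with audio segment 2; segment 3 is isolated.
relations = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

s = simfusion(relations)
print("similarity(shot 0, shot 1):", s[0][1])
print("similarity(shot 0, audio 3):", s[0][3])
```

In this toy case the two shots have no direct relation, yet they acquire a nonzero similarity because both are related to the same audio segment, while the isolated segment stays dissimilar to everything else. This is the cross-modal reinforcement effect the abstract attributes to SimFusion.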

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, Block N4, Nanyang Avenue, 639798, Singapore
Tat-Jen Cham, Deepu Rajan
School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Jianfei Cai
IBM T.J. Watson Research Center, Yorktown Heights, P.O. Box 704, 10598, New York, USA
Chitra Dorai
National University of Singapore, 3 Science Dr, 117543, Singapore
Tat-Seng Chua
Center for Multimedia and Network Technology, School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Liang-Tien Chia

Cite this paper

Liu, Y., Wu, F. (2006). Video Semantic Concept Detection Using Multi-modality Subspace Correlation Propagation. In: Cham, TJ., Cai, J., Dorai, C., Rajan, D., Chua, TS., Chia, LT. (eds) Advances in Multimedia Modeling. MMM 2007. Lecture Notes in Computer Science, vol 4351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69423-6_51

DOI: https://doi.org/10.1007/978-3-540-69423-6_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69421-2
Online ISBN: 978-3-540-69423-6
eBook Packages: Computer Science, Computer Science (R0)
