A Hybrid Framework for Detecting the Semantics of Concepts and Context

Naphade, Milind R.; Smith, John R.

doi:10.1007/3-540-45113-7_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2728)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

Semantic understanding of multimedia content requires models for the semantics of concepts, context and structure. We propose a hybrid framework that can combine discriminant or generative models for concepts with generative models for structure and context. Using the TREC Video 2002 benchmark corpus, we show that robust models can be built for several diverse visual semantic concepts. We use a novel factor graph framework to model inter-conceptual context for 12 semantic concepts of the corpus. Using the sum-product algorithm [1] for approximate or exact inference in these factor graph multinets, we attempt to correct errors made during isolated concept detection by enforcing high-level constraints. Enforcing this probabilistic context model significantly improves overall detection performance: the global multinet yields an improvement of 22% over the baseline concept detection, and its factored approximation yields an improvement of 18%. This improvement is achieved without using any additional training data or separate annotations.
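
To make the idea of correcting isolated detections with inter-conceptual context concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary concept variables, treats each detector's soft output as a local evidence factor, and uses a single global factor over all concept configurations, so that exact sum-product inference reduces to enumerating the configurations. The function name `global_context_marginals`, the two toy concepts, and every number in the example are hypothetical and chosen only for illustration.

```python
import itertools

import numpy as np


def global_context_marginals(local_probs, joint_prior):
    """Exact sum-product on a star-shaped factor graph.

    The graph has one global factor (joint_prior, defined over all binary
    concept configurations) and one local evidence factor per concept.

    local_probs: length-N sequence; entry i is the isolated detector's soft
                 score for concept i being present (an assumed evidence model).
    joint_prior: dict mapping each N-tuple of 0/1 to a nonnegative weight.
    Returns an array of length N with P(concept i present | all evidence).
    """
    local_probs = np.asarray(local_probs, dtype=float)
    n = local_probs.size
    numer = np.zeros(n)
    total = 0.0
    for config in itertools.product((0, 1), repeat=n):
        x = np.array(config)
        # Product of the local evidence messages entering the global factor.
        evidence = np.prod(np.where(x == 1, local_probs, 1.0 - local_probs))
        weight = joint_prior.get(config, 0.0) * evidence
        total += weight
        # Accumulate the mass of configurations in which each concept is on.
        numer += weight * x
    if total == 0.0:
        raise ValueError("evidence is incompatible with the joint prior")
    return numer / total


# Toy example (invented numbers): concepts are ("outdoors", "sky").
# The prior encodes that sky almost never appears without outdoors.
toy_prior = {(0, 0): 0.40, (1, 0): 0.30, (0, 1): 0.01, (1, 1): 0.29}
isolated_scores = [0.4, 0.9]  # weak "outdoors" detection, confident "sky"
posterior = global_context_marginals(isolated_scores, toy_prior)
print(dict(zip(("outdoors", "sky"), posterior)))
```

In this toy setting the confident "sky" detection pulls the "outdoors" score above its isolated value, which is the kind of context-driven correction the abstract describes. Enumeration costs 2^N evaluations, so it is only practical for a handful of concepts; a factored approximation would replace the single global factor with several smaller ones.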

References

[1] F. Kschischang, B. Frey, and H. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
[2] M. Naphade, T. Kristjansson, B. Frey, and T. S. Huang, “Probabilistic multimedia objects (multijects): A novel approach to indexing and retrieval in multimedia systems,” in Proceedings of IEEE International Conference on Image Processing, Chicago, IL, Oct. 1998, vol. 3, pp. 536–540.
[3] S.F. Chang, W. Chen, and H. Sundaram, “Semantic visual templates-linking features to semantics,” in Proceedings of IEEE International Conference on Image Processing, Chicago, IL, Oct. 1998, vol. 3, pp. 531–535.
[4] Milind R. Naphade, Igor Kozintsev, and Thomas S. Huang, “A factor graph framework for semantic video indexing,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 1, pp. 40–52, Jan 2002.
[5] M. Naphade and J. Smith, “The role of classifiers in multimedia content management,” in SPIE Storage and Retrieval for Media Databases, San Jose, CA, Jan 2003, vol. 5021.
[6] M. Naphade, C. Lin, A. Natsev, B. Tseng, and J. Smith, “A framework for moderate vocabulary visual semantic concept detection,” submitted to IEEE ICME 2003.
[7] M. Naphade, S. Basu, J. Smith, C. Lin, and B. Tseng, “Modeling semantic concepts to support query by keywords in video,” in IEEE International Conference on Image Processing, Rochester, NY, Sep 2002.
[8] Vladimir Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[9] “TREC Video Retrieval,” 2002, National Institute of Standards and Technology, http://www-nlpir.nist.gov/projects/trecvid/.
[10] W.H. Adams, A. Amir, C. Dorai, S. Ghoshal, G. Iyengar, A. Jaimes, C. Lang, C. Y. Lin, M. R. Naphade, A. Natsev, C. Neti, H. J. Nock, H. Permutter, R. Singh, S. Srinivasan, J.R. Smith, B. L. Tseng, A.T. Varadaraju, and D. Zhang, “IBM research TREC-2002 video retrieval system,” in Proc. Text Retrieval Conference (TREC), Gaithersburg, MD, Nov 2002.
[11] S. Srinivasan, D. Ponceleon, A. Amir, and D. Petkovic, “What is that video anyway? In search of better browsing,” in Proceedings of IEEE International Conference on Multimedia and Expo, New York, July 2000, pp. 388–392.
[12] M. R. Naphade, R. Wang, and T. S. Huang, “Classifying motion picture soundtrack for video indexing,” in IEEE International Conference on Multimedia and Expo, Tokyo, Japan, August 2001.

Author information

Authors and Affiliations

Pervasive Media Management Group, IBM Thomas J. Watson Research Center, Hawthorne, NY, 10532
Milind R. Naphade & John R. Smith

Editor information

Editors and Affiliations

LIACS Media Lab, Leiden University, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Erwin M. Bakker & Michael S. Lew
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, 405 N. Mathews Avenue, Urbana, IL, 61801, USA
Thomas S. Huang
University of Amsterdam, Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands
Nicu Sebe
Siemens Corporate Research, 755 College Road East, Princeton, NJ, 08540, USA
Xiang Sean Zhou

About this paper

Cite this paper

Naphade, M.R., Smith, J.R. (2003). A Hybrid Framework for Detecting the Semantics of Concepts and Context. In: Bakker, E.M., Lew, M.S., Huang, T.S., Sebe, N., Zhou, X.S. (eds) Image and Video Retrieval. CIVR 2003. Lecture Notes in Computer Science, vol 2728. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45113-7_20

DOI: https://doi.org/10.1007/3-540-45113-7_20
Published: 24 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40634-1
Online ISBN: 978-3-540-45113-6
eBook Packages: Springer Book Archive
