Improving Video Concept Detection Using Spatio-Temporal Correlation

  • Conference paper
Advances in Multimedia Information Processing - PCM 2010 (PCM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6297)

Abstract

Graph-based semi-supervised learning approaches have proven effective and efficient at addressing the insufficiency of labeled training data in many real-world application areas, such as video concept detection. However, the pair-wise similarity metric between samples, a significant factor in these algorithms, has not been fully investigated. Specifically, existing approaches estimate the pair-wise similarity between two samples from the spatial property of video data, while the temporal property, an essential characteristic of video data, is not embedded in the similarity measure. Accordingly, this paper proposes a novel framework for video concept detection called Joint Spatio-Temporal Correlation Learning (JSTCL). The framework takes both the spatial and temporal properties of video data into account simultaneously to improve the computation of pair-wise similarity. We apply the proposed framework to video concept detection and report superior performance compared to key existing approaches on the benchmark TRECVID data set.
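
The abstract does not give the JSTCL formulation, so the following is only a minimal sketch of the general idea it describes: a graph-based semi-supervised learner whose pair-wise similarity combines a spatial (feature-based) term with a temporal term. Everything here is an assumption made for illustration, not the authors' method: the function names `joint_affinity` and `propagate_labels`, the product of two Gaussian kernels, the bandwidths `sigma_s` and `sigma_t`, the toy data, and the use of a standard closed-form label-propagation step with parameter `alpha`.

```python
import numpy as np


def joint_affinity(features, timestamps, sigma_s=1.0, sigma_t=2.0):
    """Hypothetical joint similarity between video shots.

    Assumption: the similarity is the product of a Gaussian kernel on
    feature distance (spatial term) and a Gaussian kernel on the time
    gap between shots (temporal term).
    """
    # Squared Euclidean distance between every pair of feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    d_spatial = np.sum(diff ** 2, axis=-1)
    # Squared time gap between every pair of shots.
    d_temporal = (timestamps[:, None] - timestamps[None, :]) ** 2
    w = np.exp(-d_spatial / (2 * sigma_s ** 2)) * np.exp(-d_temporal / (2 * sigma_t ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops in the graph
    return w


def propagate_labels(w, y, alpha=0.9):
    """Closed-form graph label propagation: F = (I - alpha * S)^-1 Y,
    where S is the symmetrically normalized affinity matrix."""
    degree = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = w.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * s, y)


# Toy data (invented): six consecutive shots forming two visual groups.
features = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
    [3.0, 3.0], [3.1, 3.0], [3.2, 3.1],
])
timestamps = np.arange(6, dtype=float)

# Only the first and last shots are labeled (+1 / -1); the rest are unlabeled (0).
labels = np.zeros(6)
labels[0] = 1.0
labels[5] = -1.0

w = joint_affinity(features, timestamps)
scores = propagate_labels(w, labels)
```

In this sketch the sign of each entry of `scores` is the predicted concept label for the corresponding shot. Setting `sigma_t` very large makes the temporal kernel nearly constant, which recovers a purely spatial similarity for comparison.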

This work is supported by the Research Program of Nanjing University of Posts and Telecommunications under No. NY209018 and No. NY209020.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhu, S., Liang, Z., Liu, Y. (2010). Improving Video Concept Detection Using Spatio-Temporal Correlation. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_5

  • DOI: https://doi.org/10.1007/978-3-642-15702-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15701-1

  • Online ISBN: 978-3-642-15702-8

  • eBook Packages: Computer Science, Computer Science (R0)