
Exploring temporal consistency for video analysis and retrieval

Published: 26 October 2006

Abstract

Temporal consistency is ubiquitous in video data, where temporally adjacent video shots usually share similar visual and semantic content. This paper presents a thorough study of temporal consistency defined with respect to semantic concepts and query topics using quantitative measures, and discusses its implications for video analysis and retrieval tasks. We further show that, in interactive settings, using temporal consistency leads to considerable improvement in the performance of semantic concept detection and retrieval of video data. Specifically, an active learning method with a temporal sampling strategy is proposed for building classifiers of semantic concepts, and a temporal reranking method is proposed for improving the efficiency of interactive video search. Both methods outperform existing methods by considerable margins on the TRECVID dataset.
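The abstract does not spell out the reranking procedure, so the following is only a minimal sketch of the general idea: once a user marks a shot as relevant, its temporal neighbors are moved up among the shots still awaiting judgment. The function name `temporal_rerank`, the integer shot positions, the `window` and `boost` parameters, and the 1/distance decay are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Dict, List, Set


def temporal_rerank(
    ranked: List[int],
    relevant: Set[int],
    scores: Dict[int, float],
    window: int = 2,
    boost: float = 0.5,
) -> List[int]:
    """Re-order unjudged shots so neighbors of relevant shots move up.

    Shots are identified by their position on the video timeline
    (a hypothetical convention). `window`, `boost`, and the 1/distance
    decay are illustrative choices, not values from the paper.
    """
    adjusted = dict(scores)
    for shot in relevant:
        for offset in range(-window, window + 1):
            neighbor = shot + offset
            if offset == 0 or neighbor not in adjusted:
                continue
            # Closer neighbors receive a larger share of the boost.
            adjusted[neighbor] += boost / abs(offset)
    # Only shots the user has not judged yet are re-ranked.
    pending = [s for s in ranked if s not in relevant]
    return sorted(pending, key=lambda s: adjusted[s], reverse=True)


if __name__ == "__main__":
    scores = {10: 0.9, 42: 0.8, 11: 0.35, 77: 0.6, 43: 0.2}
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Shot 10 was judged relevant, so its neighbor 11 is promoted.
    print(temporal_rerank(ranked, relevant={10}, scores=scores))
```

In this toy run, shot 11 starts with a low score but sits next to the relevant shot 10, so it overtakes shot 42 in the list shown to the user next.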

Published In

MIR '06: Proceedings of the 8th ACM international workshop on Multimedia information retrieval
October 2006
344 pages
ISBN:1595934952
DOI:10.1145/1178677

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. interactive search
  3. semantic concept detection
  4. temporal consistency
  5. video retrieval

Qualifiers

  • Article

Conference

MM06: The 14th ACM International Conference on Multimedia 2006
October 26 - 27, 2006
Santa Barbara, California, USA


View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media