A Cooperative Learning Scheme for Interactive Video Search

Journal of Signal Processing Systems

Abstract

The main idea of interactive search is to gradually improve the search quality of a retrieval system through user interaction. Although a large amount of work has been done in the past, most existing approaches require labeling effort to update the query model. Unfortunately, labeling a large number of training examples is time-consuming and tedious. We aim to develop a novel text-driven cooperative learning scheme that offers users a natural way to query and significantly alleviates the burden on users without compromising search performance. Starting with an advanced text-driven video search engine, a multi-view cooperative training strategy is proposed for learning a refined ranking function from feedback data. The main merit of the proposed framework is its ability to mine training samples automatically from the previous answer set and to implicitly combine multiple modalities to learn users’ query intent effectively. Evaluation on the TRECVID’06 video corpus shows that the proposed scheme, with few training seeds, achieves performance comparable to classic interactive schemes.
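As a rough illustration of the general idea only, and not the authors' implementation, the following sketch shows how a multi-view cooperative (co-training style) loop might mine pseudo-labeled samples from an initial text ranking and use them to rerank. Everything here is an assumption made for illustration: the function names, the nearest-centroid scorer standing in for whatever learner the paper uses, the choice of top-ranked and bottom-ranked shots as seeds, and the two generic feature views.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def co_train_rerank(text_ranking, view_a, view_b, n_seeds=2, rounds=2):
    """Rerank shots with a toy two-view co-training loop (illustrative only).

    text_ranking: shot ids ordered by the initial text search, best first.
    view_a, view_b: dicts mapping shot id -> feature vector, one per modality.
    """
    # Assumption: the top of the text ranking supplies pseudo-positive seeds
    # and the bottom supplies pseudo-negative seeds, so no manual labels.
    pos = set(text_ranking[:n_seeds])
    neg = set(text_ranking[-n_seeds:])

    def scorer(view):
        # Hypothetical learner: higher score means closer to the positive
        # centroid than to the negative centroid in this view.
        pc = centroid([view[s] for s in pos])
        nc = centroid([view[s] for s in neg])
        return lambda s: math.dist(view[s], nc) - math.dist(view[s], pc)

    for _ in range(rounds):
        for view in (view_a, view_b):
            unlabeled = [s for s in text_ranking if s not in pos and s not in neg]
            if len(unlabeled) < 2:
                break
            ranked = sorted(unlabeled, key=scorer(view))
            # Each view hands its most confident predictions to the shared
            # training pool, which the other view then learns from.
            neg.add(ranked[0])
            pos.add(ranked[-1])

    # Combine both views by summing their scores for the final ranking.
    score_a, score_b = scorer(view_a), scorer(view_b)
    return sorted(text_ranking, key=lambda s: score_a(s) + score_b(s), reverse=True)
```

In this sketch the two views cooperate only through the shared pseudo-labeled pool, and the final ranking fuses them by a plain score sum; a real system would likely use stronger learners and a more careful fusion rule.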

Acknowledgments

This work was supported in part by the National Science Foundation of China (No. 60602030, No. 90604032), the 973 Program (No. 2006B30314), the 863 Program (No. 2007AA01Z175), PCSIRT (No. IRT0707), and the Specialized Research Foundation of BJTU (No. 2005SM013, No. 2005SZ005).

Author information

Correspondence to Shikui Wei.

Cite this article

Wei, S., Zhao, Y., Zhu, Z. et al. A Cooperative Learning Scheme for Interactive Video Search. J Sign Process Syst Sign Image Video Technol 59, 189–199 (2010). https://doi.org/10.1007/s11265-008-0287-2

  • DOI: https://doi.org/10.1007/s11265-008-0287-2
