ABSTRACT
In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries with a few examples. The other involves learning models for semantic concepts, also with a few examples. In our view these two tasks are identical, differing only in the number of examples available for training. Having adopted this unified view, we apply identical techniques to both problems and evaluate performance on the NIST TRECVID benchmark evaluation data [15]. We propose a combination hypothesis over two complementary classes of techniques: a nearest neighbor model that uses only positive examples, and a discriminative support vector machine model that uses both positive and negative examples. For queries, where negative examples are rarely provided to seed the search, we create pseudo-negative samples. We then combine the ranked lists that the two methods produce over the test database into a final ranked list of retrieved multimedia items. We evaluate this approach for rare concept modeling and query topic modeling using the NIST TRECVID video corpus. In both tasks we find that applying the combination hypothesis across both modeling techniques and a variety of features improves performance over any of the baseline models and improves robustness with respect to the choice of training examples and visual features. In particular, we observe an improvement of 6% for rare concept detection and 17% for the search task.
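The pipeline summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration only: pseudo-negatives are drawn uniformly at random from the test database, a linear hinge-loss classifier trained by subgradient descent stands in for the support vector machine, and the two ranked lists are fused by averaging normalized ranks. All function names (`nn_scores`, `train_linear_hinge`, `fuse`, `retrieve`) are hypothetical.

```python
import random


def nn_scores(positives, items):
    """Score each item by the negated squared distance to its nearest positive example."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [-min(sqdist(item, p) for p in positives) for item in items]


def train_linear_hinge(pos, neg, epochs=200, lr=0.1, lam=0.01):
    """Assumption: a linear hinge-loss classifier (a simple stand-in for an SVM),
    trained by per-sample subgradient descent with L2 shrinkage on the weights."""
    w = [0.0] * len(pos[0])
    b = 0.0
    data = [(x, 1.0) for x in pos] + [(x, -1.0) for x in neg]
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]  # regularization step
            if margin < 1.0:  # sample violates the margin: move toward it
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def linear_scores(model, items):
    """Signed score w.x + b of each item under the linear model."""
    w, b = model
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in items]


def ranks(scores):
    """Map scores to normalized ranks in [0, 1], where 1.0 is the best-scoring item."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    out = [0.0] * n
    for r, i in enumerate(order):
        out[i] = 1.0 - r / (n - 1) if n > 1 else 1.0
    return out


def fuse(score_lists):
    """Assumption: combine ranked lists by averaging their normalized ranks."""
    ranked = [ranks(s) for s in score_lists]
    return [sum(col) / len(col) for col in zip(*ranked)]


def retrieve(positives, database, n_pseudo_neg=3, seed=0):
    """Return database indices ordered from most to least relevant."""
    rng = random.Random(seed)
    # Assumption: random database items are mostly irrelevant, so they serve
    # as pseudo-negative samples when the query supplies only positives.
    pseudo_neg = rng.sample(database, n_pseudo_neg)
    model = train_linear_hinge(positives, pseudo_neg)
    fused = fuse([nn_scores(positives, database),
                  linear_scores(model, database)])
    return sorted(range(len(database)), key=lambda i: fused[i], reverse=True)
```

Calling `retrieve([(1.0, 1.0)], database)` with a list of equal-length feature tuples returns a ranking over the database. Fusing ranks rather than raw scores avoids having to calibrate the distance-based and margin-based scores against each other, which is one plausible reason to combine at the ranked-list level.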
- TREC Video Retrieval. National Institute of Standards and Technology, http://www-nlpir.nist.gov/projects/trecvid/.
- K. Chakrabarti, K. Porkaew, and S. Mehrotra. Efficient query refinement in multimedia databases. In Proc. 16th Intl. Conf. on Data Engineering (ICDE'00), San Diego, CA, Feb. 28--Mar. 3, 2000.
- T. S. Chua, S.-Y. Neo, K.-Y. Li, G. Wang, R. Shi, M. Zhao, and H. Xu. TRECVID 2004 search and feature extraction task by NUSPRIS. In TRECVID 2004 Workshop, Gaithersburg, MD, Nov. 2004.
- A. Gupta, T. E. Weymouth, and R. Jain. Semantic queries with pictures: the VIMSYS model. In Intl. Conf. on Very Large Databases (VLDB), pages 69--70, Sep. 1991.
- A. Hauptmann and M. Christel. Successful approaches in the TREC video retrieval evaluations. In ACM Multimedia, New York, NY, Nov. 2004.
- Y. Ishikawa, R. Subramanya, and C. Faloutsos. MindReader: Querying databases through multiple examples. In Proc. of the 24th Intl. Conference on Very Large Databases (VLDB'98), pages 218--227, 1998.
- L. Kennedy, A. Natsev, and S.-F. Chang. Automatic discovery of query-class-dependent models for multimodal search. In ACM Multimedia 2005, Singapore, Nov. 2005.
- C. Lin, B. Tseng, and J. Smith. Video collaborative annotation forum: Establishing ground-truth labels on large multimedia datasets. In Proc. Text Retrieval Conference (TREC), Gaithersburg, MD, Nov. 2003.
- M. Naphade, J. Smith, and F. Souvannavong. On the detection of semantic concepts at TRECVID. In ACM Multimedia, New York, NY, Nov. 2004.
- M. R. Naphade, S. Basu, J. Smith, C. Y. Lin, and B. Tseng. Modeling semantic concepts to support query by keywords in video. In Proc. IEEE Intl. Conference on Image Processing (ICIP'02), Rochester, NY, Sep. 2002.
- S. Nepal and M. V. Ramakrishna. Single feature query by multi examples in image databases. In Proc. SPIE Photonics East Intl. Symposium on Voice, Data and Communications, volume 4210, pages 424--435, 2000.
- K. Porkaew, S. Mehrotra, M. Ortega, and K. Chakrabarti. Similarity search using multiple examples in MARS. In Intl. Conf. on Visual Information Systems (VISUAL'99), pages 68--75, 1999.
- Y. Rui, T. S. Huang, M. Ortega, and S. Mehrotra. Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Trans. on Circuits and Systems for Video Technology, 8:644--656, Sep. 1998.
- R. Singh and R. Kothari. Relevance feedback algorithm based on learning from labeled and unlabeled data. In IEEE ICME 2003, Baltimore, MD, July 2003.
- A. Smeaton, P. Over, and W. Kraaij. TRECVID: Evaluating the effectiveness of information retrieval tasks on digital video. In ACM Multimedia, New York, NY, Nov. 2004.
- D. M. J. Tax. One-Class Classification: Concept-Learning in the Absence of Counter-Examples. PhD thesis, Delft University of Technology, June 2001.
- S. Tong and E. Chang. Support vector machine active learning for image retrieval. In Proc. ACM Intl. Conf. on Multimedia, Oct. 2001.
- V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.
- J. Wang and J. Li. Learning-based linguistic indexing of pictures with 2-D MHMMs. In ACM Intl. Conf. Multimedia (ACMMM), Juan-les-Pins, France, Dec. 2002.
- T. Westerveld and A. P. de Vries. Multimedia retrieval using multiple examples. In CIVR, pages 344--352, 2004.
- R. Yan and A. Hauptmann. Negative pseudo-relevance feedback in content based video retrieval. In ACM Multimedia, Berkeley, CA, Nov. 2003.
- R. Yan, J. Yang, and A. Hauptmann. Learning query class-dependent weights in automatic video retrieval. In ACM Multimedia 2004, New York, NY, Oct. 2004.
Index Terms
- Learning the semantics of multimedia queries and concepts from a small number of examples