DOI: 10.1145/1101149.1101288

Learning the semantics of multimedia queries and concepts from a small number of examples

Published: 6 November 2005

ABSTRACT

In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries with a few examples. The other involves learning models for semantic concepts, also with a few examples. In our view, these two tasks are identical, the only difference being the number of examples available for training. Once we adopt this unified view, we apply identical techniques to both problems and evaluate the performance using the NIST TRECVID benchmark evaluation data [15]. We propose a combination hypothesis of two complementary classes of techniques: a nearest neighbor model using only positive examples and a discriminative support vector machine model using both positive and negative examples. In the case of queries, where negative examples are rarely provided to seed the search, we create pseudo-negative samples. We then combine the ranked lists generated by evaluating the test database with both methods to create a final ranked list of retrieved multimedia items. We evaluate this approach for rare concept and query topic modeling using the NIST TRECVID video corpus. In both tasks we find that applying the combination hypothesis across both modeling techniques and a variety of features results in enhanced performance over any of the baseline models, as well as in improved robustness with respect to training examples and visual features. In particular, we observe an improvement of 6% for rare concept detection and 17% for the search task.
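
The abstract describes the combination hypothesis only at a high level; the sketch below illustrates one plausible reading of it and is not the authors' code. It scores database items with a nearest-neighbor model built from the positive examples alone, trains an SVM against randomly sampled pseudo-negatives, and fuses the two ranked lists by averaging min-max normalized scores. The feature representation, the RBF kernel, the equal fusion weight, and the pseudo-negative sample size are all assumptions.

```python
# Minimal sketch of a positive-only NN scorer + SVM with pseudo-negatives,
# fused by averaging normalized scores. Illustrative only; feature extraction,
# kernel, fusion weight, and sampling rate are assumptions, not the paper's settings.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import minmax_scale

rng = np.random.default_rng(0)

def nn_scores(positives, database):
    """Score each database item by its distance to the closest positive example."""
    d = np.linalg.norm(database[:, None, :] - positives[None, :, :], axis=2)
    return -d.min(axis=1)  # smaller distance -> higher score

def svm_scores(positives, database, n_pseudo_neg=100):
    """Train an SVM on the positives against pseudo-negatives sampled at random."""
    idx = rng.choice(len(database), size=n_pseudo_neg, replace=False)
    X = np.vstack([positives, database[idx]])
    y = np.concatenate([np.ones(len(positives)), np.zeros(n_pseudo_neg)])
    clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
    return clf.decision_function(database)  # higher -> more likely relevant

def combined_ranking(positives, database, w=0.5):
    """Fuse the two ranked lists via a weighted average of min-max normalized scores."""
    s = (w * minmax_scale(nn_scores(positives, database))
         + (1 - w) * minmax_scale(svm_scores(positives, database)))
    return np.argsort(-s)  # database indices, best first

# Toy usage: 5 positive query examples against a database of 1000 feature vectors.
positives = rng.normal(size=(5, 64))
database = rng.normal(size=(1000, 64))
print(combined_ranking(positives, database)[:10])
```

Score-level averaging is only one way to merge the two ranked lists; rank-based fusion would work in the same place and avoids the score normalization step.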

References

  1. TREC Video Retrieval. National Institute of Standards and Technology, http://www-nlpir.nist.gov/projects/trecvid/.
  2. K. Chakrabarti, K. Porkaew, and S. Mehrotra. Efficient query refinement in multimedia databases. In Proc. 16th Intl. Conf. on Data Engineering (ICDE'00), San Diego, CA, Feb. 28--Mar. 3, 2000.
  3. T. S. Chua, S.-Y. Neo, K.-Y. Li, G. Wang, R. Shi, M. Zhao, and H. Xu. TRECVID 2004 search and feature extraction task by NUSPRIS. In TRECVID 2004 Workshop, Gaithersburg, MD, Nov. 2004.
  4. A. Gupta, T. E. Weymouth, and R. Jain. Semantic queries with pictures: the VIMSYS model. In Intl. Conf. on Very Large Databases (VLDB), pages 69--70, Sep. 1991.
  5. A. Hauptmann and M. Christel. Successful approaches in the TREC video retrieval evaluations. In ACM Multimedia, New York, NY, Nov. 2004.
  6. Y. Ishikawa, R. Subramanya, and C. Faloutsos. MindReader: Querying databases through multiple examples. In Proc. of the 24th Intl. Conference on Very Large Databases (VLDB'98), pages 218--227, 1998.
  7. L. Kennedy, A. Natsev, and S.-F. Chang. Automatic discovery of query-class-dependent models for multimodal search. In ACM Multimedia 2005, Singapore, Nov. 2005.
  8. C. Lin, B. Tseng, and J. Smith. Video collaborative annotation forum: Establishing ground-truth labels on large multimedia datasets. In Proc. Text Retrieval Conference (TREC), Gaithersburg, MD, Nov. 2003.
  9. M. Naphade, J. Smith, and F. Souvannavong. On the detection of semantic concepts at TRECVID. In ACM Multimedia, New York, NY, Nov. 2004.
  10. M. R. Naphade, S. Basu, J. Smith, C. Y. Lin, and B. Tseng. Modeling semantic concepts to support query by keywords in video. In Proc. IEEE Intl. Conference on Image Processing (ICIP'02), Rochester, NY, Sep. 2002.
  11. S. Nepal and M. V. Ramakrishna. Single feature query by multi examples in image databases. In Proc. SPIE Photonics East Intl. Symposium on Voice, Data and Communications, volume 4210, pages 424--435, 2000.
  12. K. Porkaew, S. Mehrotra, M. Ortega, and K. Chakrabarti. Similarity search using multiple examples in MARS. In Intl. Conf. on Visual Information Systems (VISUAL'99), pages 68--75, 1999.
  13. Y. Rui, T. S. Huang, M. Ortega, and S. Mehrotra. Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Trans. on Circuits and Systems for Video Technology, 8:644--656, Sep. 1998.
  14. R. Singh and R. Kothari. Relevance feedback algorithm based on learning from labeled and unlabeled data. In IEEE ICME 2003, Baltimore, MD, July 2003.
  15. A. Smeaton, P. Over, and W. Kraaij. TRECVID: Evaluating the effectiveness of information retrieval tasks on digital video. In ACM Multimedia, New York, NY, Nov. 2004.
  16. D. M. J. Tax. One-Class Classification: Concept-Learning in the Absence of Counter-Examples. PhD thesis, Delft University of Technology, June 2001.
  17. S. Tong and E. Chang. Support vector machine active learning for image retrieval. In Proc. ACM Intl. Conf. on Multimedia, Oct. 2001.
  18. V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.
  19. J. Wang and J. Li. Learning-based linguistic indexing of pictures with 2-D MHMMs. In ACM Intl. Conf. Multimedia (ACMMM), Juan-les-Pins, France, Dec. 2002.
  20. T. Westerveld and A. P. de Vries. Multimedia retrieval using multiple examples. In CIVR, pages 344--352, 2004.
  21. R. Yan and A. Hauptmann. Negative pseudo-relevance feedback in content based video retrieval. In ACM Multimedia, Berkeley, CA, Nov. 2003.
  22. R. Yan, J. Yang, and A. Hauptmann. Learning query class-dependent weights in automatic video retrieval. In ACM Multimedia 2004, New York, NY, Oct. 2004.

Published in
MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia, November 2005, 1110 pages. ISBN 1595930442. DOI: 10.1145/1101149. Copyright © 2005 ACM.


Publisher: Association for Computing Machinery, New York, NY, United States.
