DOI: 10.1145/2461466.2461478
research-article

Fisher kernel based relevance feedback for multimodal video retrieval

Published: 16 April 2013

ABSTRACT

This paper proposes a novel approach to relevance feedback based on the Fisher Kernel representation in the context of multimodal video retrieval. The Fisher Kernel representation describes a set of features as the derivative of the log-likelihood with respect to the parameters of the generative probability distribution that models the feature distribution. In the context of relevance feedback, instead of learning the generative probability distribution over all features of the data, we learn it only over the top retrieved results. Hence, during relevance feedback we create a new Fisher Kernel representation based on the most relevant examples. In addition, we propose to use the Fisher Kernel to capture temporal information by cutting a video into smaller segments, extracting a feature vector from each segment, and representing the resulting feature set using the Fisher Kernel representation. We evaluate our method on the MediaEval 2012 Video Genre Tagging Task, a large dataset containing 15,000 videos in 26 categories, totalling up to 2,000 hours of footage. Results show that our method significantly improves on existing state-of-the-art relevance feedback techniques. Furthermore, we show significant improvements from using the Fisher Kernel to capture temporal information, and we demonstrate that Fisher Kernels are well suited for this task.
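The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the generative model is a diagonal-covariance Gaussian mixture, that only the gradient with respect to the component means is kept, and that the power and L2 normalisation commonly applied to Fisher vectors is used. The abstract confirms none of these choices. The function names (`fit_diag_gmm`, `posteriors`, `fisher_vector`), the synthetic data, and all sizes are hypothetical.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Soft assignment of each row of X to each Gaussian component."""
    diff = X[:, None, :] - means[None, :, :]                      # (n, k, d)
    log_p = -0.5 * (diff ** 2 / variances
                    + np.log(2 * np.pi * variances)).sum(axis=2)  # (n, k)
    log_p += np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)                     # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def fit_diag_gmm(X, k, iters=20, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not tuned)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        resp = posteriors(X, weights, means, variances)           # E-step
        nk = resp.sum(axis=0) + 1e-10                             # M-step
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

def fisher_vector(X, weights, means, variances):
    """Gradient of the average log-likelihood of X w.r.t. the GMM means."""
    resp = posteriors(X, weights, means, variances)               # (n, k)
    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(variances)
    grad = (resp[:, :, None] * diff).sum(axis=0)
    grad /= len(X) * np.sqrt(weights)[:, None]
    fv = grad.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                        # power normalisation
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                          # L2 normalisation

# Hypothetical data: 5 top-ranked videos, each cut into 30 segments,
# with one 4-dimensional descriptor per segment.
rng = np.random.default_rng(0)
top_videos = [rng.normal(size=(30, 4)) for _ in range(5)]

# Relevance feedback step: learn the generative model on the top results only.
gmm = fit_diag_gmm(np.vstack(top_videos), k=3)

# Each video (a set of segment descriptors) becomes one fixed-length vector.
fvs = np.array([fisher_vector(v, *gmm) for v in top_videos])
print(fvs.shape)  # (5, 12)
```

Each video maps to a vector of length `k * d` regardless of how many segments it has, so videos of different durations become directly comparable. How these vectors are then used to re-rank results (for example, with a classifier trained on the feedback) is not specified in the abstract and is left out of the sketch.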


          • Published in

            ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
            April 2013
            362 pages
            ISBN: 9781450320337
            DOI: 10.1145/2461466

            Copyright © 2013 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States




            Acceptance Rates

            ICMR '13 Paper Acceptance Rate: 38 of 96 submissions, 40%
            Overall Acceptance Rate: 254 of 830 submissions, 31%
