research-article

Fisher kernel based relevance feedback for multimodal video retrieval

Authors: Ionut Mironica, Bogdan Ionescu, Jasper Uijlings, Nicu Sebe

ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Pages 65 - 72

https://doi.org/10.1145/2461466.2461478

Published: 16 April 2013

Abstract

This paper proposes a novel approach to relevance feedback based on the Fisher Kernel representation in the context of multimodal video retrieval. The Fisher Kernel representation describes a set of features as the gradient of the log-likelihood with respect to the parameters of the generative probability distribution that models the feature distribution. In the context of relevance feedback, instead of learning the generative probability distribution over all features of the data, we learn it only over the top retrieved results. Hence, during relevance feedback we create a new Fisher Kernel representation based on the most relevant examples. In addition, we propose to use the Fisher Kernel to capture temporal information by cutting a video into smaller segments, extracting a feature vector from each segment, and representing the resulting feature set with the Fisher Kernel representation. We evaluate our method on the MediaEval 2012 Video Genre Tagging Task, a large dataset containing 15,000 videos in 26 categories, totalling up to 2,000 hours of footage. Results show that our method significantly improves on existing state-of-the-art relevance feedback techniques. Furthermore, we show significant improvements from using the Fisher Kernel to capture temporal information, and we demonstrate that Fisher Kernels are well suited for this task.
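The representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal-covariance Gaussian mixture model as the generative distribution, uses the common formulation in which gradients are taken with respect to the component means and standard deviations, and fits the mixture with a few plain EM iterations. The function names (`posteriors`, `fit_diag_gmm`, `fisher_vector`), the number of components, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Soft assignment gamma[t, k] of each feature vector to each GMM component."""
    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]
    log_p = -0.5 * (np.sum(diff ** 2, axis=2)
                    + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :])
    log_p += np.log(weights)[None, :]
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilise before exponentiating
    gamma = np.exp(log_p)
    return gamma / gamma.sum(axis=1, keepdims=True)

def fit_diag_gmm(X, k, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not tuned)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        gamma = posteriors(X, weights, means, variances)      # E-step
        nk = gamma.sum(axis=0) + 1e-10                        # M-step
        weights = nk / nk.sum()
        means = (gamma.T @ X) / nk[:, None]
        variances = np.maximum((gamma.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def fisher_vector(X, weights, means, variances):
    """Gradient of the average log-likelihood of X w.r.t. means and std devs."""
    n = X.shape[0]
    gamma = posteriors(X, weights, means, variances)
    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]
    g_mu = np.einsum("tk,tkd->kd", gamma, diff) / (n * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum("tk,tkd->kd", gamma, diff ** 2 - 1.0)
               / (n * np.sqrt(2.0 * weights)[:, None]))
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])

# Hypothetical data: 5 top-ranked videos, each cut into 12 segments,
# with one 4-dimensional descriptor per segment.
rng = np.random.default_rng(0)
top_videos = [rng.normal(size=(12, 4)) for _ in range(5)]

# Relevance-feedback step: learn the mixture only from the top retrieved results.
gmm = fit_diag_gmm(np.vstack(top_videos), k=2)

# Each video (a set of segment descriptors) becomes one fixed-length vector.
signatures = np.array([fisher_vector(v, *gmm) for v in top_videos])
print(signatures.shape)  # (5, 16): 2 components x 4 dimensions x 2 gradient blocks
```

In this sketch every video, whatever its number of segments, is mapped to a vector of the same length, so the vectors can be handed to any standard ranking or classification step. How the paper re-ranks results from these vectors is not stated in the abstract and is left out here.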

Index Terms

Fisher kernel based relevance feedback for multimodal video retrieval
1. Information systems
  1. Information retrieval
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Abstraction

Published In

ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

April 2013

362 pages

ISBN:9781450320337

DOI:10.1145/2461466

General Chairs:
Ramesh Jain (University of California, Irvine, USA)
Balakrishnan Prabhakaran (University of Texas at Dallas, USA)

Program Chairs:
Marcel Worring (University of Amsterdam, The Netherlands)
John Smith (IBM Research, New York, USA)
Tat-Seng Chua (National University of Singapore)

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMR'13: International Conference on Multimedia Retrieval

April 16 - 20, 2013

Dallas, Texas, USA

Sponsor: SIGMM

Acceptance Rates

ICMR '13 Paper Acceptance Rate 38 of 96 submissions, 40%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total citations: 9
Total downloads: 191 (last 12 months: 1; last 6 weeks: 0)
Reflects downloads up to 16 Feb 2025

Cited By

Wang Z, Yang J, Guo B, Zhang X (2019). Security Model of Internet of Things Based on Binary Wavelet and Sparse Neural Network. International Journal of Mobile Computing and Multimedia Communications, 10:1, 1-17. Online publication date: 1-Jan-2019.
https://dl.acm.org/doi/10.4018/IJMCMC.2019010101
Xiong W, Bogdanov P, Zheleva M (2019). Robust and Efficient Modulation Recognition Based on Local Sequential IQ Features. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, 1612-1620. Online publication date: 29-Apr-2019.
https://dl.acm.org/doi/10.1109/INFOCOM.2019.8737397
Hong R, He J, Zhang H, Chua T, Hanjalic A, Snoek C, Worring M, Bulterman D, Huet B, Kelliher A, Kompatsiaris Y, Li J (2016). Mental Visual Indexing. Proceedings of the 24th ACM international conference on Multimedia, 621-625. Online publication date: 1-Oct-2016.
https://dl.acm.org/doi/10.1145/2964284.2967296
(2016). On interactive learning-to-rank for IR. Neurocomputing, 208:C, 3-24. Online publication date: 5-Oct-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2016.03.084
Bdiri T, Bouguila N, Ziou D (2016). A statistical framework for online learning using adjustable model selection criteria. Engineering Applications of Artificial Intelligence, 49:C, 19-42. Online publication date: 1-Mar-2016.
https://dl.acm.org/doi/10.1016/j.engappai.2015.10.011
Mironică I, Ionescu B, Uijlings J, Sebe N (2016). Fisher Kernel Temporal Variation-based Relevance Feedback for video retrieval. Computer Vision and Image Understanding, 143:C, 38-51. Online publication date: 1-Feb-2016.
https://dl.acm.org/doi/10.1016/j.cviu.2015.10.005
Beecks C, Uysal M, Hermanns J, Seidl T, Bailey J, Moffat A, Aggarwal C, de Rijke M, Kumar R, Murdock V, Sellis T, Yu J (2015). Gradient-based Signatures for Efficient Similarity Search in Large-scale Multimedia Databases. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, 1241-1250. Online publication date: 17-Oct-2015.
https://dl.acm.org/doi/10.1145/2806416.2806459
Jiang L, Mitamura T, Yu S, Hauptmann A, Kankanhalli M, Rueger S, Manmatha R, Jose J, van Rijsbergen K (2014). Zero-Example Event Search using MultiModal Pseudo Relevance Feedback. Proceedings of International Conference on Multimedia Retrieval, 297-304. Online publication date: 1-Apr-2014.
https://dl.acm.org/doi/10.1145/2578726.2578764
Rostamzadeh N, Zen G, Mironică I, Uijlings J, Sebe N (2013). Daily Living Activities Recognition via Efficient High and Low Level Cues Combination and Fisher Kernel Representation. Image Analysis and Processing – ICIAP 2013, 431-441. Online publication date: 2013.
https://doi.org/10.1007/978-3-642-41181-6_44
