DOI: 10.1145/1148170.1148228
Article

Probabilistic latent query analysis for combining multiple retrieval sources

Published: 06 August 2006

Abstract

Combining the output of multiple retrieval sources over the same document collection is important for retrieval tasks such as multimedia retrieval, web retrieval, and meta-search. To merge retrieval sources adaptively according to query topic, we propose a series of new approaches called probabilistic latent query analysis (pLQA), which associate distinct combination weights with latent classes underlying the query space. Compared with previous query-independent and query-class-based combination methods, the proposed approaches can discover latent query classes automatically without prior human knowledge, assign a query to a mixture of query classes, and determine the number of query classes under a model selection principle. Experimental results on two retrieval tasks, multimedia retrieval and meta-search, demonstrate that the proposed methods can uncover sensible latent classes from training data and achieve considerable performance gains.
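
The idea of assigning one query to a mixture of latent classes, each with its own combination weights, can be illustrated with a small sketch. The abstract does not state the functional form of the model, so the code below assumes a logistic combination of source scores within each class, mixed by per-query class probabilities. The function name `combined_relevance`, the weights, the class probabilities, and the source scores are all hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def combined_relevance(class_probs, class_weights, source_scores):
    """Mixture score: sum_k P(z_k | q) * sigmoid(w_k . f(d, q)).

    class_probs   -- assumed P(z_k | q) for each latent query class k
    class_weights -- one weight vector per class, one weight per source
    source_scores -- scores f_i(d, q) that each source gives the document
    """
    if len(class_probs) != len(class_weights):
        raise ValueError("need one weight vector per latent class")
    total = 0.0
    for p_k, w_k in zip(class_probs, class_weights):
        if len(w_k) != len(source_scores):
            raise ValueError("need one weight per retrieval source")
        logit = sum(w * s for w, s in zip(w_k, source_scores))
        total += p_k * sigmoid(logit)
    return total


# Hypothetical setup: two latent classes over three retrieval sources.
# Class 0 trusts source 0 most; class 1 trusts source 2 most.
class_weights = [[2.0, 0.5, 0.0], [0.0, 0.5, 2.0]]
class_probs = [0.7, 0.3]  # this query leans toward class 0

doc_a = [0.9, 0.2, 0.1]  # scored highly by source 0
doc_b = [0.1, 0.2, 0.9]  # scored highly by source 2

score_a = combined_relevance(class_probs, class_weights, doc_a)
score_b = combined_relevance(class_probs, class_weights, doc_b)
print(score_a > score_b)  # doc_a ranks first for a class-0-leaning query
```

Because the class probabilities depend on the query, the same two documents can swap ranks for a query that leans toward the other class, which is the adaptive behavior the abstract describes. A query-independent method corresponds to the special case of a single class.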

    Published In

    SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
    August 2006
    768 pages
    ISBN:1595933697
    DOI:10.1145/1148170
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. combination
    2. query class
    3. retrieval
    4. statistical learning

    Qualifiers

    • Article

    Conference

    SIGIR06: The 29th Annual International SIGIR Conference
    August 6 - 11, 2006
    Seattle, Washington, USA

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (Last 12 months): 2
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 17 Feb 2025

    Cited By

    • (2015) Evaluation of semi-supervised learning method on action recognition. Multimedia Tools and Applications 74:2 (523-542). DOI: 10.1007/s11042-014-1936-z. Online publication date: 1-Jan-2015
    • (2014) Multimedia search reranking. ACM Computing Surveys 46:3 (1-38). DOI: 10.1145/2536798. Online publication date: 1-Jan-2014
    • (2013) Probabilistic latent class models for predicting student performance. Proceedings of the 22nd ACM international conference on Information & Knowledge Management (1513-1516). DOI: 10.1145/2505515.2507832. Online publication date: 27-Oct-2013
    • (2013) Query-Document-Dependent Fusion. IEEE Transactions on Multimedia 15:8 (1830-1842). DOI: 10.1109/TMM.2013.2280437. Online publication date: 1-Dec-2013
    • (2013) Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis. IEEE Transactions on Multimedia 15:3 (572-581). DOI: 10.1109/TMM.2012.2234731. Online publication date: 1-Apr-2013
    • (2012) Parallel Lasso for Large-Scale Video Concept Detection. IEEE Transactions on Multimedia 14:1 (55-65). DOI: 10.1109/TMM.2011.2174781. Online publication date: 1-Feb-2012
    • (2012) Forecasting user visits for online display advertising. Information Retrieval 16:3 (369-390). DOI: 10.1007/s10791-012-9201-4. Online publication date: 30-May-2012
    • (2012) A latent variable ranking model for content-based retrieval. Proceedings of the 34th European conference on Advances in Information Retrieval (340-351). DOI: 10.1007/978-3-642-28997-2_29. Online publication date: 1-Apr-2012
    • (2012) Multi-modal solution for unconstrained news story retrieval. Proceedings of the 18th international conference on Advances in Multimedia Modeling (186-195). DOI: 10.1007/978-3-642-27355-1_19. Online publication date: 4-Jan-2012
    • (2011) Document dependent fusion in multimodal music retrieval. Proceedings of the 19th ACM international conference on Multimedia (1105-1108). DOI: 10.1145/2072298.2071949. Online publication date: 28-Nov-2011
