Abstract
This paper proposes a new document–query similarity measure for Probabilistic Latent Semantic Indexing (PLSI) that allows queries to be handled without folding-in. We compare this similarity to Fisher kernels, the state-of-the-art approach for PLSI, on a corpus of more than 1M word occurrences drawn from TREC–AP.
The work described here was supported by the Swiss National Science Foundation doctoral grants #200021-111817 and #200020-119745.
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chappelier, JC., Eckard, E. (2009). An Ad Hoc Information Retrieval Perspective on PLSI through Language Model Identification. In: Azzopardi, L., et al. Advances in Information Retrieval Theory. ICTIR 2009. Lecture Notes in Computer Science, vol 5766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04417-5_36
DOI: https://doi.org/10.1007/978-3-642-04417-5_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04416-8
Online ISBN: 978-3-642-04417-5
eBook Packages: Computer Science (R0)