Abstract
This article describes a large-scale empirical evaluation across different types of English text collections. We ran about 140,000 experiments and analyzed the results at the system component level to find out whether we can select configurations that perform reliably on specific types of corpora. To our surprise, we observed that one specific set of configuration parameters achieved 95% of the optimal average MAP across all collections. We conclude that this configuration could be used as a baseline reference for evaluating new IR approaches on English text corpora.
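To make the "95% of the optimal average MAP" figure concrete, the following minimal sketch shows one way such a ratio could be computed. It rests on assumptions the abstract does not confirm: the optimum is taken to be the mean of the best MAP reached on each collection by any configuration, and the configuration names, collection names, and scores are hypothetical values chosen for illustration, not results from the paper.

```python
from statistics import mean

# Hypothetical MAP scores: configuration -> {collection: MAP}.
# Illustrative numbers only; none of them come from the paper.
map_scores = {
    "config_a": {"news": 0.30, "web": 0.20, "patents": 0.10},
    "config_b": {"news": 0.28, "web": 0.24, "patents": 0.14},
    "config_c": {"news": 0.20, "web": 0.25, "patents": 0.15},
}


def optimal_average_map(scores):
    """Mean over collections of the best MAP any configuration reaches there.

    Assumption: this is one plausible reading of "optimal average MAP".
    """
    collections = next(iter(scores.values())).keys()
    return mean(max(cfg[c] for cfg in scores.values()) for c in collections)


def fraction_of_optimum(scores):
    """For each configuration, its average MAP divided by the optimum."""
    optimum = optimal_average_map(scores)
    return {
        name: mean(per_collection.values()) / optimum
        for name, per_collection in scores.items()
    }


fractions = fraction_of_optimum(map_scores)
# The single configuration that comes closest to the per-collection optimum.
best = max(fractions, key=fractions.get)
print(best, round(fractions[best], 3))
```

Under this reading, a configuration that is never the best on any single collection can still come closest to the optimum on average, which is the kind of robust baseline the abstract argues for.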
Keywords
- Information Retrieval System
- Ranking Algorithm
- Feedback Model
- Information Retrieval Evaluation
- Porter Stemmer
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kürsten, J., Eibl, M. (2011). A Large-Scale System Evaluation on Component-Level. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_69
DOI: https://doi.org/10.1007/978-3-642-20161-5_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science, Computer Science (R0)