Abstract
This paper describes the MIRACLE approach to WebCLEF. A set of independent indexes was constructed for each top-level domain of the EuroGOV collection. Each index contains one kind of information extracted from the documents, such as the URL, title, keywords, detected named entities, or HTML headers. These indexes are queried to obtain partial document rankings, which are then combined using various relative weights to test the value of each index. The final aim is to identify which index (or combination of indexes) is most relevant for a retrieval task, avoiding the construction of a full-text index.
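The combination of partial rankings with relative weights can be illustrated with a minimal sketch. The abstract does not state the merging formula, so the weighted sum of min-max normalised scores used here is an assumption, and the function name `combine_rankings`, the index names, the scores, and the weights are all hypothetical.

```python
from collections import defaultdict


def combine_rankings(partial_rankings, weights):
    """Merge per-index rankings into one list (assumed weighted linear sum).

    partial_rankings: {index_name: {doc_id: score}}
    weights: {index_name: relative weight}; missing indexes count as 0.
    """
    combined = defaultdict(float)
    for index_name, scores in partial_rankings.items():
        if not scores:
            continue
        weight = weights.get(index_name, 0.0)
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        for doc_id, score in scores.items():
            # Min-max normalise so scores from different indexes are comparable.
            normalised = (score - lo) / span if span else 1.0
            combined[doc_id] += weight * normalised
    # Highest combined score first; ties are broken by doc_id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical partial rankings from a title index and a URL index.
partial = {
    "title": {"doc1": 2.0, "doc2": 1.0},
    "url": {"doc2": 4.0, "doc3": 2.0},
}
ranking = combine_rankings(partial, {"title": 0.7, "url": 0.3})
url_only = combine_rankings(partial, {"title": 0.0, "url": 1.0})
```

Setting a weight to zero, as in `url_only`, removes that index from the merge, which is one simple way to compare the contribution of individual indexes against their combinations.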
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martínez-González, Á., Martínez-Fernández, J.L., de Pablo-Sánchez, C., Villena-Román, J. (2006). MIRACLE at WebCLEF 2005: Combining Web Specific and Linguistic Information. In: Peters, C., et al. Accessing Multilingual Information Repositories. CLEF 2005. Lecture Notes in Computer Science, vol 4022. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11878773_95
DOI: https://doi.org/10.1007/11878773_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45697-1
Online ISBN: 978-3-540-45700-8
eBook Packages: Computer Science (R0)