Abstract
For UFRGS’s participation in the TEL task at CLEF 2008, our aim was to assess the validity of using association rule mining algorithms to find mappings between concepts in a Cross-Language Information Retrieval scenario. Our approach requires a sample of parallel documents to serve as the basis for generating the association rules. The experimental results show that the performance of our approach is not statistically different from the monolingual baseline in terms of mean average precision (MAP). This indicates that association rules can be used effectively to map concepts between languages. We also tested a modification to BM25 that aims to increase the weight of rare terms. The results show that the modified version performed better, and the improvement in MAP on our monolingual runs was statistically significant.
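To make the idea of mapping concepts through association rules concrete, the following is a minimal sketch and not the authors' implementation. It assumes each parallel document pair is treated as one transaction and keeps a rule "source term → target term" when its support (the fraction of pairs containing both terms) and its confidence (the fraction of pairs containing the source term that also contain the target term) reach a threshold. The function names `mine_rules` and `map_query`, the thresholds, and the toy English/Spanish data are all hypothetical.

```python
from collections import Counter
from itertools import product


def mine_rules(parallel_docs, min_support=0.5, min_confidence=1.0):
    """Mine 'source term -> target term' rules from aligned document pairs.

    Illustrative only: whitespace tokenization, single-term rules, and the
    default thresholds are assumptions, not details taken from the paper.
    """
    n = len(parallel_docs)
    source_counts = Counter()  # number of pairs whose source side contains a term
    pair_counts = Counter()    # number of pairs containing both terms of (source, target)
    for source_text, target_text in parallel_docs:
        source_terms = set(source_text.lower().split())
        target_terms = set(target_text.lower().split())
        source_counts.update(source_terms)
        pair_counts.update(product(source_terms, target_terms))

    rules = {}
    for (source, target), count in pair_counts.items():
        support = count / n
        confidence = count / source_counts[source]
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(source, []).append((target, confidence))
    # Highest-confidence target first; ties broken alphabetically.
    for targets in rules.values():
        targets.sort(key=lambda item: (-item[1], item[0]))
    return rules


def map_query(query, rules):
    """Replace each query term by its best-ranked target; keep unmapped terms."""
    return [rules[t][0][0] if t in rules else t for t in query.lower().split()]


# Invented toy sample of parallel (English, Spanish) documents.
sample = [
    ("digital library", "biblioteca digital"),
    ("digital map", "mapa digital"),
    ("old library", "biblioteca antigua"),
    ("old map", "mapa antiguo"),
]
rules = mine_rules(sample)
mapped = map_query("digital library", rules)
```

In this toy sample, "old" co-occurs with four different Spanish terms once each, so none of its candidate rules reaches the support threshold and the term is left unmapped. A real system would need a far larger parallel sample, plus stemming and stop-word removal before mining.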
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pinto Geraldo, A., Moreira, V.P. (2009). UFRGS@CLEF2008: Using Association Rules for Cross-Language Information Retrieval. In: Peters, C., et al. Evaluating Systems for Multilingual and Multimodal Information Access. CLEF 2008. Lecture Notes in Computer Science, vol 5706. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04447-2_7
DOI: https://doi.org/10.1007/978-3-642-04447-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04446-5
Online ISBN: 978-3-642-04447-2
eBook Packages: Computer Science (R0)