Abstract
The UTACLIR system of the University of Tampere takes a dictionary-based approach to cross-language information retrieval (CLIR). Its central idea is to recognize distinct types of source keys and to process each type accordingly. The linguistic resources and processing steps used by the framework include morphological analysis or stemming in indexing, normalization of topic words, stop word removal, splitting of compounds, translation with bilingual dictionaries, handling of non-translated words, phrase composition of compounds in the target language, and construction of structured queries. UTACLIR was shown to perform consistently across different language pairs. The greatest differences in performance are due to the translation dictionary used.
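The key-by-key processing described in the abstract can be illustrated with a minimal sketch. It is not UTACLIR's implementation. The stop list, the toy English-to-Finnish dictionary, the function names, and the InQuery-style `#syn`/`#sum` operators are all assumptions chosen for illustration, and lowercasing stands in for real morphological normalization.

```python
# Hypothetical toy resources; a real system would use full stop lists
# and bilingual dictionaries.
STOP_WORDS = {"the", "of", "in", "and"}
BILINGUAL_DICT = {
    "power": ["voima", "teho"],
    "plant": ["laitos", "kasvi"],
}


def split_compound(word, lexicon):
    """Split a word into two known components, or return it unsplit."""
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if head in lexicon and tail in lexicon:
            return [head, tail]
    return [word]


def translate_query(topic_words):
    """Map source keys to groups of target-language alternatives."""
    groups = []
    for raw in topic_words:
        word = raw.lower()  # stand-in for morphological normalization
        if word in STOP_WORDS:
            continue  # stop word removal
        if word in BILINGUAL_DICT:
            groups.append(list(BILINGUAL_DICT[word]))
            continue
        parts = split_compound(word, BILINGUAL_DICT)
        if len(parts) > 1:
            # Translate each compound component separately.
            groups.extend(list(BILINGUAL_DICT[p]) for p in parts)
        else:
            # Non-translated word (e.g. a proper name): pass through as-is.
            groups.append([word])
    return groups


def build_structured_query(groups):
    """Wrap the alternatives for one source key in a synonym group."""
    rendered = []
    for group in groups:
        if len(group) > 1:
            rendered.append("#syn(" + " ".join(group) + ")")
        else:
            rendered.append(group[0])
    return "#sum(" + " ".join(rendered) + ")"


if __name__ == "__main__":
    print(build_structured_query(translate_query(["The", "powerplant", "Tampere"])))
```

Grouping the translations of one source key keeps a key with many dictionary equivalents from outweighing a key with few, which is the usual motivation for structured queries in dictionary-based CLIR.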
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Airio, E., Keskustalo, H., Hedlund, T., Pirkola, A. (2003). UTACLIR @ CLEF 2002 — Bilingual and Multilingual Runs with a Unified Process. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Advances in Cross-Language Information Retrieval. CLEF 2002. Lecture Notes in Computer Science, vol 2785. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45237-9_7
DOI: https://doi.org/10.1007/978-3-540-45237-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40830-7
Online ISBN: 978-3-540-45237-9
eBook Packages: Springer Book Archive