UTACLIR @ CLEF 2002 — Bilingual and Multilingual Runs with a Unified Process

Airio, Eija; Keskustalo, Heikki; Hedlund, Turid; Pirkola, Ari

doi:10.1007/978-3-540-45237-9_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2785)

Included in the following conference series:

Workshop of the Cross-Language Evaluation Forum for European Languages

Abstract

The UTACLIR system of the University of Tampere uses a dictionary-based CLIR approach. The idea of UTACLIR is to recognize distinct source key types and process each type accordingly. The linguistic resources and processing steps used by the framework include morphological analysis or stemming in indexing, normalization of topic words, stop word removal, splitting of compounds, translation using bilingual dictionaries, handling of non-translated words, phrase composition of compounds in the target language, and construction of structured queries. UTACLIR was shown to perform consistently across different language pairs. The greatest differences in performance are due to the translation dictionary used.
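The processing chain listed in the abstract can be illustrated with a minimal sketch. This is not the UTACLIR implementation: the word lists, the tiny English-to-Finnish dictionary, the function names, and the InQuery-style `#syn`/`#sum` operator notation are all assumptions chosen for illustration, and phrase composition of compounds in the target language is left out.

```python
from typing import Dict, List

# Toy resources; every entry is invented for illustration only.
STOP_WORDS = {"the", "of", "in", "and"}
LEMMAS = {"effects": "effect", "dictionaries": "dictionary"}
COMPOUND_PARTS = {"stopword": ["stop", "word"]}
DICTIONARY: Dict[str, List[str]] = {
    "effect": ["vaikutus", "teho"],
    "dictionary": ["sanakirja"],
    "stop": ["pysäyttää", "seis"],
    "word": ["sana"],
}


def normalize(word: str) -> str:
    """Lowercase the word and map it to a base form when one is known."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def translate_key(key: str) -> List[str]:
    """Return target-language equivalents for one source key."""
    if key in DICTIONARY:
        return DICTIONARY[key]
    if key in COMPOUND_PARTS:
        # Split the compound and translate each component separately.
        translations: List[str] = []
        for part in COMPOUND_PARTS[key]:
            translations.extend(DICTIONARY.get(part, [part]))
        return translations
    # Non-translated word (for example a proper name): pass through unchanged.
    return [key]


def build_structured_query(topic: str) -> str:
    """Group the translations of each source key into one synonym set."""
    groups = []
    for token in topic.split():
        key = normalize(token)
        if key in STOP_WORDS:
            continue
        groups.append("#syn(" + " ".join(translate_key(key)) + ")")
    return "#sum(" + " ".join(groups) + ")"


print(build_structured_query("Effects of dictionaries in Tampere"))
```

Grouping all translations of one source key into a single synonym set keeps a key with many dictionary equivalents from outweighing a key with only one, which is the usual motivation for structuring a translated query.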

Author information

Authors and Affiliations

Department of Information Studies, University of Tampere, Finland
Eija Airio, Heikki Keskustalo, Turid Hedlund & Ari Pirkola

Editor information

Editors and Affiliations

Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche (ISTI-CNR), Via G. Moruzzi 1, 56124, Pisa, Italy
Carol Peters
Eurospider Information Technology AG, Schaffhauserstr. 18, 8006, Zürich, Switzerland
Martin Braschler
Universidad Nacional de Educación a Distancia Lenguajes y Sístemas Informáticos, Ciudad Universitaria, 28040, Madrid, Spain
Julio Gonzalo
Informationszentrum Sozialwissenschaften, Arbeitsgemeinschaft Sozialwissenschaftlicher Institute e.V. (IZ), Lennéstr. 30, 53113, Bonn, Germany
Michael Kluck

About this paper

Cite this paper

Airio, E., Keskustalo, H., Hedlund, T., Pirkola, A. (2003). UTACLIR @ CLEF 2002 — Bilingual and Multilingual Runs with a Unified Process. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Advances in Cross-Language Information Retrieval. CLEF 2002. Lecture Notes in Computer Science, vol 2785. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45237-9_7

DOI: https://doi.org/10.1007/978-3-540-45237-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40830-7
Online ISBN: 978-3-540-45237-9
eBook Packages: Springer Book Archive
