Exploiting Thesauri and Hierarchical Categories in Cross-Language Information Retrieval

  • Conference paper
Text, Speech and Dialogue (TSD 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2448)

Abstract

As Internet resources become accessible worldwide, the need to develop efficient methods for information retrieval across languages becomes paramount. In this paper, we focus on query expansion techniques to improve the effectiveness of information retrieval. A combination of dictionary-based translation and statistics-based disambiguation is indispensable for overcoming translation ambiguity. We propose a model that uses multiple sources for query reformulation and expansion to select expansion terms and retrieve the information needed by a user. Relevance feedback, thesaurus-based expansion, and a new feedback strategy based on the extraction of domain keywords to expand the user's query are introduced and evaluated. We evaluated the effectiveness of the proposed combined method on a French-English information retrieval application.
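
As a rough illustration of the pipeline the abstract outlines, the following minimal sketch combines dictionary-based translation, co-occurrence-based disambiguation, and a simple feedback-style expansion. It is not the authors' implementation: the toy dictionary, the toy corpus, the raw document co-occurrence count used as the disambiguation score, and the function names `translate_query` and `expand_query` are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Hypothetical toy bilingual dictionary (French term -> English candidates).
DICTIONARY = {
    "avocat": ["lawyer", "avocado"],
    "tribunal": ["court", "tribunal"],
}

# Hypothetical toy target-language corpus, one token list per document.
CORPUS = [
    ["lawyer", "court", "trial", "judge"],
    ["lawyer", "court", "appeal"],
    ["avocado", "salad", "recipe"],
    ["court", "judge", "verdict"],
]


def cooccurrence(a, b, corpus):
    """Count the documents that contain both terms."""
    return sum(1 for doc in corpus if a in doc and b in doc)


def translate_query(source_terms, dictionary, corpus):
    """Dictionary lookup followed by statistical disambiguation.

    Every combination of candidate translations is scored by the sum of
    pairwise co-occurrence counts; the best-scoring combination is kept.
    Terms missing from the dictionary are passed through unchanged.
    """
    candidate_lists = [dictionary.get(term, [term]) for term in source_terms]

    def score(combo):
        return sum(
            cooccurrence(a, b, corpus)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )

    return list(max(product(*candidate_lists), key=score))


def expand_query(query, corpus, top_docs=2, top_terms=2):
    """Feedback-style expansion (assumed variant: pseudo-relevance feedback).

    Documents are ranked by how many query terms they contain, and the most
    frequent non-query terms of the top-ranked documents are appended.
    """
    ranked = sorted(corpus, key=lambda doc: sum(t in doc for t in query), reverse=True)
    counts = Counter(t for doc in ranked[:top_docs] for t in doc if t not in query)
    return query + [term for term, _ in counts.most_common(top_terms)]


translated = translate_query(["avocat", "tribunal"], DICTIONARY, CORPUS)
expanded = expand_query(translated, CORPUS)
print(translated)
print(expanded)
```

In this toy setting, "avocat" resolves to "lawyer" rather than "avocado" because only "lawyer" co-occurs with a translation of "tribunal" in the corpus. A real system would replace the raw count with a proper association measure and the toy ranking with an actual retrieval model.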

The present research study is supported in part by the Ministry of Education, Culture, Sports, Science and Technology of Japan, under grants 11480088, 12680417 and 12208032, and by the CREST program of the JST Corporation (Japan Science and Technology).

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sadat, F., Yoshikawa, M., Uemura, S. (2002). Exploiting Thesauri and Hierarchical Categories in Cross-Language Information Retrieval. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2002. Lecture Notes in Computer Science, vol 2448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46154-X_18

  • DOI: https://doi.org/10.1007/3-540-46154-X_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44129-8

  • Online ISBN: 978-3-540-46154-8

  • eBook Packages: Springer Book Archive
