Applying EuroWordNet to Cross-Language Text Retrieval

Computers and the Humanities

Abstract

We discuss ways in which EuroWordNet (EWN) can be used in multilingual information retrieval activities, focusing on two approaches to Cross-Language Text Retrieval that use the EWN database as a large-scale multilingual semantic resource. The first approach indexes documents and queries in terms of the EuroWordNet Inter-Lingual-Index, thus turning term weighting and query/document matching into language-independent tasks. The second describes how the information in the EWN database could be integrated with a corpus-based technique, thus allowing retrieval of domain-specific terms that may not be present in our multilingual database. Our objective is to show the potential of EuroWordNet as a promising alternative to existing approaches to Cross-Language Text Retrieval.
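The first approach can be illustrated with a minimal sketch. It assumes a toy lexicon that maps words of each language to Inter-Lingual-Index records; the identifiers, the lexicon contents, and the names `TOY_ILI`, `to_ili`, and `rank` are all hypothetical and are not taken from the EWN database or from the article. The weighting shown is a generic TF-IDF with cosine matching, chosen only to show that weighting and matching no longer depend on the language once both sides are expressed as ILI records.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: (language, word) -> ILI record identifiers.
# The identifiers are invented for illustration; they are not real EWN records.
TOY_ILI = {
    ("en", "car"): ["ili-0001"],
    ("en", "engine"): ["ili-0002"],
    ("en", "bank"): ["ili-0003", "ili-0004"],
    ("es", "coche"): ["ili-0001"],
    ("es", "motor"): ["ili-0002"],
    ("es", "banco"): ["ili-0003", "ili-0005"],
}


def to_ili(lang, text):
    """Map a text to a bag of ILI ids. No disambiguation: all senses are kept."""
    bag = Counter()
    for word in text.lower().split():
        for ili in TOY_ILI.get((lang, word), []):
            bag[ili] += 1
    return bag


def weight(bag, doc_freq, n_docs):
    """Generic smoothed TF-IDF weights over ILI ids (an assumed scheme)."""
    return {
        ili: tf * (math.log((n_docs + 1) / (doc_freq.get(ili, 0) + 1)) + 1.0)
        for ili, tf in bag.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(ili, 0.0) for ili, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_lang, query, docs):
    """Rank documents of any language against a query of any language."""
    bags = {doc_id: to_ili(lang, text) for doc_id, (lang, text) in docs.items()}
    doc_freq = Counter(ili for bag in bags.values() for ili in bag)
    n_docs = len(bags)
    q_vec = weight(to_ili(query_lang, query), doc_freq, n_docs)
    scored = [
        (doc_id, cosine(q_vec, weight(bag, doc_freq, n_docs)))
        for doc_id, bag in bags.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical two-document collection in two languages.
DOCS = {
    "d_en": ("en", "car engine"),
    "d_es": ("es", "banco"),
}

if __name__ == "__main__":
    # A Spanish query is matched against an English document via shared ILI ids.
    for doc_id, score in rank("es", "coche motor", DOCS):
        print(doc_id, round(score, 3))
```

In this sketch the Spanish query "coche motor" and the English document "car engine" map to the same ILI records, so they match without any translation step. Ambiguous words such as "bank" and "banco" contribute every candidate record, which is a simplification; a real system would need some form of sense disambiguation.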

Cite this article

Gonzalo, J., Verdejo, F., Peters, C. et al. Applying EuroWordNet to Cross-Language Text Retrieval. Computers and the Humanities 32, 185–207 (1998). https://doi.org/10.1023/A:1001129911019

  • DOI: https://doi.org/10.1023/A:1001129911019