DOI: 10.1145/355214.355218
Article

Query term disambiguation for Web cross-language information retrieval using a search engine

Published: 01 November 2000

ABSTRACT

With the worldwide growth of the Internet, research on Cross-Language Information Retrieval (CLIR) is receiving much attention. Existing CLIR approaches based on query translation require parallel or comparable corpora to disambiguate translated query terms. However, such natural language resources are not readily available. In this paper, we propose a disambiguation method for dictionary-based query translation that does not depend on such scarce language resources, yet achieves adequate retrieval effectiveness by using Web documents as a corpus and exploiting co-occurrence information between terms within that corpus. In our experiments, the method achieved 97% of the average precision obtained with manual translation.
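The general idea of co-occurrence-based disambiguation can be illustrated with a small sketch. The abstract does not say which co-occurrence statistic or search engine interface the authors used, so everything below is an assumption made for illustration: the score is pointwise mutual information estimated from document hit counts, the `hit_count` callable is a hypothetical stand-in for a search engine query, and the counts in `TOY_HITS` are invented toy numbers, not experimental data.

```python
import math
from itertools import combinations, product


def pmi(hits_x, hits_y, hits_xy, total_docs):
    """Pointwise mutual information estimated from document counts.

    Returns -inf when any count is zero (the pair never co-occurs).
    """
    if hits_x == 0 or hits_y == 0 or hits_xy == 0:
        return float("-inf")
    return math.log2(hits_xy * total_docs / (hits_x * hits_y))


def disambiguate(candidates, hit_count, total_docs):
    """Choose one translation per query term.

    candidates: one list of target-language candidates per source term.
    hit_count:  hypothetical callable; takes a tuple of terms and returns
                the number of documents containing all of them.
    The combination with the highest summed pairwise PMI is returned.
    """
    best_combo, best_score = None, float("-inf")
    for combo in product(*candidates):
        score = sum(
            pmi(hit_count((a,)), hit_count((b,)), hit_count((a, b)), total_docs)
            for a, b in combinations(combo, 2)
        )
        if best_combo is None or score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Invented toy counts standing in for search engine hit counts.
TOY_HITS = {
    frozenset(["bank"]): 1000,
    frozenset(["shore"]): 200,
    frozenset(["interest"]): 800,
    frozenset(["curiosity"]): 300,
    frozenset(["bank", "interest"]): 400,
    frozenset(["bank", "curiosity"]): 5,
    frozenset(["shore", "interest"]): 4,
    frozenset(["shore", "curiosity"]): 6,
}
TOTAL_DOCS = 100_000


def toy_hit_count(terms):
    """Look up the toy count for a set of terms; unknown sets count as 0."""
    return TOY_HITS.get(frozenset(terms), 0)


if __name__ == "__main__":
    # Two source terms, each with two dictionary translations.
    query_candidates = [["bank", "shore"], ["interest", "curiosity"]]
    combo, _ = disambiguate(query_candidates, toy_hit_count, TOTAL_DOCS)
    print(combo)  # ('bank', 'interest')
```

Enumerating every combination is exponential in the number of query terms, which is tolerable only for short queries; a real system would also need to handle rate limits and noisy hit counts from the search engine.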

        Published in

          IRAL '00: Proceedings of the fifth international workshop on Information retrieval with Asian languages
          November 2000
          220 pages
          ISBN: 1581133006
          DOI: 10.1145/355214
          Chairmen: Kam-Fai Wong, Dik L. Lee, Jong-Hyeok Lee

          Copyright © 2000 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States
