Term Similarity-Based Query Expansion for Cross-Language Information Retrieval

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1696)

Abstract

We propose a query expansion technique based on a statistical similarity measure among terms to improve the effectiveness of dictionary-based cross-language information retrieval (CLIR). We employ a term similarity-based sense disambiguation technique proposed in our earlier work to enhance the accuracy of dictionary-based query translation. The query expansion technique is then applied to the translated queries to further improve retrieval performance. We demonstrate the effectiveness of the two techniques combined using queries in three languages, namely German, Spanish, and Indonesian, to retrieve English documents from a standard TREC (Text Retrieval Conference) collection. The results of our experiments indicate that the term similarity-based techniques work better when there are more phrases in the queries. In addition, our results re-emphasize other researchers’ finding that phrase recognition and translation are critical to CLIR’s effectiveness.
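
The two steps named in the abstract, choosing among dictionary translations by term similarity and then expanding the translated query with similar terms, can be illustrated with a minimal Python sketch. The abstract does not say which statistical similarity measure is used, so the Dice co-occurrence coefficient below is an assumption, as are the function names (`dice_similarity`, `disambiguate`, `expand`), the scoring by summed similarity, and the toy corpus and dictionary entries. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def dice_similarity(docs):
    """Return a Dice co-occurrence similarity function and the vocabulary.

    Assumption: document-level co-occurrence with the Dice coefficient;
    the abstract does not name the measure actually used.
    """
    term_df = Counter()   # number of documents containing each term
    pair_df = Counter()   # number of documents containing each term pair
    for doc in docs:
        terms = set(doc)
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))

    def sim(a, b):
        if a == b:
            return 1.0
        denom = term_df[a] + term_df[b]
        return 2.0 * pair_df[frozenset((a, b))] / denom if denom else 0.0

    return sim, set(term_df)


def disambiguate(candidates_per_term, sim):
    """For each source term, keep the candidate translation that is most
    similar (by summed similarity) to the candidates of the other terms."""
    chosen = []
    for i, candidates in enumerate(candidates_per_term):
        others = [t for j, c in enumerate(candidates_per_term) if j != i for t in c]
        chosen.append(max(candidates, key=lambda t: sum(sim(t, o) for o in others)))
    return chosen


def expand(query_terms, vocabulary, sim, k=2):
    """Append the k vocabulary terms with the highest summed similarity
    to the query terms (ties broken alphabetically)."""
    scores = {
        t: sum(sim(t, q) for q in query_terms)
        for t in vocabulary
        if t not in query_terms
    }
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query_terms) + ranked[:k]


# Hypothetical toy corpus of tokenized English documents.
docs = [
    ["bank", "money", "loan", "interest"],
    ["bank", "loan", "credit"],
    ["bank", "loan", "credit"],
    ["river", "shore", "water"],
    ["money", "interest", "credit"],
]
sim, vocab = dice_similarity(docs)

# Hypothetical dictionary output: the first source term is ambiguous.
translated = disambiguate([["bank", "shore"], ["loan"]], sim)
expanded = expand(translated, vocab, sim, k=1)
print(translated, expanded)
```

In this toy setting the ambiguous term resolves to the candidate that co-occurs with the rest of the query, and expansion adds the corpus term most strongly associated with the translated query. A realistic system would compute similarities over a large collection and would also handle the phrase translation that the abstract identifies as critical.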

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Adriani, M., van Rijsbergen, C.J. (1999). Term Similarity-Based Query Expansion for Cross-Language Information Retrieval. In: Abiteboul, S., Vercoustre, AM. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1999. Lecture Notes in Computer Science, vol 1696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48155-9_20

  • DOI: https://doi.org/10.1007/3-540-48155-9_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66558-8

  • Online ISBN: 978-3-540-48155-3

  • eBook Packages: Springer Book Archive
