Abstract
We propose a query expansion technique based on a statistical similarity measure among terms to improve the effectiveness of dictionary-based cross-language information retrieval (CLIR). We employ a term similarity-based sense disambiguation technique, proposed in our earlier work, to enhance the accuracy of dictionary-based query translation. The query expansion technique is then applied to the translated queries to further improve their retrieval performance. We demonstrate the effectiveness of the two techniques combined, using queries in three languages (German, Spanish, and Indonesian) to retrieve English documents from a standard TREC (Text Retrieval Conference) collection. The results of our experiments indicate that the term similarity-based techniques work better when there are more phrases in the queries. Our results also re-emphasize other researchers’ finding that phrase recognition and translation are critical to the effectiveness of CLIR.
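To make the idea of term similarity-based query expansion concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the similarity measure, so the Dice coefficient over document-level co-occurrence is assumed here purely for illustration. The function names `dice_similarities` and `expand_query`, the parameter `k`, and the toy documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def dice_similarities(docs):
    """Return {(t1, t2): Dice coefficient} with t1 < t2.

    ASSUMPTION: Dice over document-level co-occurrence is an illustrative
    choice; the abstract does not say which measure the paper uses.
    """
    doc_freq = defaultdict(int)   # number of documents containing a term
    co_freq = defaultdict(int)    # number of documents containing both terms
    for doc in docs:
        terms = sorted(set(doc))
        for term in terms:
            doc_freq[term] += 1
        for a, b in combinations(terms, 2):
            co_freq[(a, b)] += 1
    return {
        (a, b): 2 * n / (doc_freq[a] + doc_freq[b])
        for (a, b), n in co_freq.items()
    }


def expand_query(query, docs, k=2):
    """Append the k terms with the highest summed similarity to the query terms."""
    sims = dice_similarities(docs)
    query_terms = set(query)
    scores = defaultdict(float)
    for (a, b), sim in sims.items():
        if a in query_terms and b not in query_terms:
            scores[b] += sim
        elif b in query_terms and a not in query_terms:
            scores[a] += sim
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query) + ranked[:k]


# Hypothetical toy collection of already-tokenized English documents.
docs = [
    ["bank", "money", "loan"],
    ["bank", "money"],
    ["river", "water"],
    ["bank", "river"],
]
expanded = expand_query(["bank"], docs, k=2)
```

In this toy collection, "money" co-occurs with "bank" more consistently than "loan" or "river" does, so it is ranked first among the expansion terms. In a CLIR setting the input query would be the translated (English) query rather than a hand-written one.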
References
Adriani, Mirna and Croft, W. Bruce. The Effectiveness of a Dictionary-Based Technique for Indonesian-English Cross-Language Text Retrieval. CIIR Technical Report IR-170, University of Massachusetts, Amherst, 1997.
Adriani, Mirna. Using Statistical Term Similarity for Sense Disambiguation in Cross-Language Information Retrieval. To appear in Information Retrieval.
Ballesteros, L., and Croft, W. Bruce. Resolving Ambiguity for Cross-language Retrieval. In Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 64–71, 1998.
Ballesteros, L., and Croft, W. Bruce. Phrasal Translation and Query Expansion Techniques for Cross Language Information Retrieval. In Proceedings of the 20th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 84–91, 1997.
Callan, J. P., Croft, W. B., and Harding, S. M. The Inquery Retrieval System. In Proceedings of Third International Conference on Database and Expert Systems Applications, 1992.
Carbonell, J., Yang, Y., Frederking, R., Brown, R.D., Geng, Y., and Lee, D. Translingual Information Retrieval: A Comparative Evaluation. In Proceedings of Fifteenth International Joint Conference on Artificial Intelligence (IJCAI), 1997.
Davis, M. and Dunning, T. E. A TREC Evaluation of Query-Translation Methods for Multi-Lingual Text Retrieval. In NIST Special Publication: The 4th Text Retrieval Conference (TREC-4), D.K. Harman, ed. Gaithersburg, MD: NIST, 1995.
Davis, Mark W. and Ogden, William C. Free Resources and Advanced Alignment for Cross-Language Text Retrieval. In NIST Special Publication: The 6th Text Retrieval Conference (TREC-6), D.K. Harman, ed. Gaithersburg, MD: NIST, 1997.
Harman, Donna. Overview of the Sixth Text Retrieval Conference. In Proceedings of the 6th Text Retrieval Conference (TREC-6), 1997.
Hull, D. A., and Grefenstette, G. Querying Across Languages: A Dictionary-Based Approach to Multilingual Information Retrieval. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 49–57, 1996.
Mendenhall, William, Scheaffer, Richard L., and Wackerly, Dennis D. Mathematical Statistics with Applications, Third ed., Boston: Duxbury Press, 1986.
Oard, Douglas W. and Hackett, Paul. Document Translation for Cross-Language Text Retrieval at the University of Maryland. In Proceedings of the Sixth Text Retrieval Conference (TREC-6), 1997.
Pevzner, B. Comparative Evaluation of the Operation of the Russian and English Variants of the Pusto-Nepusto-2 System. Automatic Documentation and Mathematical Linguistics, 6:71–74, 1972.
Pirkola, A. The Effects of Query Structure and Dictionary setups in Dictionary-Based Cross-language Information Retrieval. In Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 55–63, 1998.
Qiu, Y. and Frei, H. P. Concept Based Query Expansion. In Proceedings of ACM SIGIR International Conference on Research and Development in Information Retrieval, pages 160–169, 1993.
van Rijsbergen, C. J. Information Retrieval, Second ed., London: Butterworths, 1979.
Salton, G. Automatic Processing of Foreign Language Documents. Journal of the American Society for Information Science, 21: 187–194, 1970.
Salton, Gerard, and McGill, Michael J. Introduction to Modern Information Retrieval, New York: McGraw-Hill, 1983.
Sheridan, P., and Ballerini, J. P. Experiments in Multilingual Information Retrieval using the SPIDER System. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zürich, Switzerland, August 1996.
Sheridan, P., Braschler, M., and Schäuble, P. Cross-Language Information Retrieval in a Multilingual Legal Domain. In Research and Advanced Technology for Digital Libraries, First European Conference, ECDL’97, Pisa, Italy, September 1997.
Sparck Jones, K. Automatic Keyword Classification for Information Retrieval. London: Butterworths, 1971.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Adriani, M., van Rijsbergen, C.J. (1999). Term Similarity-Based Query Expansion for Cross-Language Information Retrieval. In: Abiteboul, S., Vercoustre, AM. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1999. Lecture Notes in Computer Science, vol 1696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48155-9_20
DOI: https://doi.org/10.1007/3-540-48155-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66558-8
Online ISBN: 978-3-540-48155-3
eBook Packages: Springer Book Archive