Abstract
We present two main approaches to cross-language information retrieval, both of which exploit multilingual corpora to derive cross-lingual term-term correspondences. The two approaches are evaluated within the multilingual-4 (ML4) task of CLEF 2003.
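To make the notion of a cross-lingual term-term correspondence concrete, the following is a minimal sketch of one common way to derive such correspondences from a sentence-aligned parallel corpus: score each source/target term pair by the Dice coefficient of their co-occurrence across aligned sentences. This is an illustrative assumption, not the method described in the chapter; the function names, the whitespace tokenisation, and the toy English-French corpus are all hypothetical.

```python
from collections import Counter
from itertools import product


def dice_correspondences(aligned_pairs):
    """Score (source term, target term) pairs by the Dice coefficient.

    aligned_pairs: iterable of (source_sentence, target_sentence) strings.
    Tokenisation is plain lowercased whitespace splitting (an assumption).
    """
    src_count = Counter()   # number of aligned pairs containing each source term
    tgt_count = Counter()   # number of aligned pairs containing each target term
    pair_count = Counter()  # number of aligned pairs containing both terms
    for src_sentence, tgt_sentence in aligned_pairs:
        src_terms = set(src_sentence.lower().split())
        tgt_terms = set(tgt_sentence.lower().split())
        src_count.update(src_terms)
        tgt_count.update(tgt_terms)
        pair_count.update(product(src_terms, tgt_terms))
    return {
        (s, t): 2.0 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def best_translation(scores, source_term):
    """Return the highest-scoring target term for source_term, or None."""
    candidates = {t: v for (s, t), v in scores.items() if s == source_term}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]
scores = dice_correspondences(toy_corpus)
print(best_translation(scores, "house"))
```

In a retrieval setting, scores of this kind could be used to translate or expand query terms before matching against documents in the target language. A realistic system would add proper tokenisation, lemmatisation, and frequency thresholds.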
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cancedda, N., Déjean, H., Gaussier, É., Renders, JM., Vinokourov, A. (2004). Report on CLEF-2003 Experiments: Two Ways of Extracting Multilingual Resources from Corpora. In: Peters, C., Gonzalo, J., Braschler, M., Kluck, M. (eds) Comparative Evaluation of Multilingual Information Access Systems. CLEF 2003. Lecture Notes in Computer Science, vol 3237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30222-3_9
DOI: https://doi.org/10.1007/978-3-540-30222-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24017-4
Online ISBN: 978-3-540-30222-3
eBook Packages: Springer Book Archive