Abstract
This work addresses SisHiTra, a hybrid machine translation system from Spanish into Catalan. In particular, we focus on its word translation disambiguation module, which must choose the correct translation of each ambiguous input word according to its context. We propose statistical pattern recognition techniques for this task, specifically multinomial Naive Bayes text classifiers. Extensive empirical results on the use of these classifiers are presented, in which the influence of the window (context) size and of parameter smoothing is carefully studied.
Work supported by the “Agència Valenciana de Ciència i Tecnologia” under grant GRUPOS03/031 and the Spanish project ITEFTE (TIC2003-08681-C02-02).
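The approach summarized in the abstract can be illustrated with a minimal sketch: a multinomial Naive Bayes classifier that picks a Catalan translation for an ambiguous Spanish word from the words in a context window around it. Everything below is an assumption made for illustration and is not taken from the paper: the names `context_window`, `make_sample` and `MultinomialNB`, the toy sentences, the example word "hoja" with the candidate translations "fulla" and "full", and the use of add-alpha (Laplace) smoothing, which may differ from the smoothing methods the authors actually evaluate.

```python
import math
from collections import Counter, defaultdict


def context_window(tokens, index, size):
    """Return up to `size` tokens on each side of tokens[index], excluding it."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


def make_sample(sentence, target, size):
    """Context of the first occurrence of `target` in a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return context_window(tokens, tokens.index(target), size)


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha smoothing (illustrative choice)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing strength, must be > 0
        self.class_counts = Counter()           # training samples per translation
        self.word_counts = defaultdict(Counter) # context word counts per translation
        self.vocab = set()

    def fit(self, samples):
        for context, label in samples:
            self.class_counts[label] += 1
            self.word_counts[label].update(context)
            self.vocab.update(context)
        return self

    def log_scores(self, context):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)         # log prior of the translation
            for word in context:
                if word in self.vocab:          # skip words never seen in training
                    score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, context):
        scores = self.log_scores(context)
        return max(scores, key=scores.get)


# Toy training data (invented): Spanish "hoja" as "fulla" (leaf) or "full" (sheet).
WINDOW = 3
training = [
    ("la hoja del árbol cae en otoño", "fulla"),
    ("una hoja verde del árbol", "fulla"),
    ("escribe en la hoja de papel", "full"),
    ("imprime una hoja de papel blanca", "full"),
]
model = MultinomialNB(alpha=1.0).fit(
    [(make_sample(s, "hoja", WINDOW), label) for s, label in training]
)

print(model.predict(make_sample("la hoja seca del árbol", "hoja", WINDOW)))
print(model.predict(make_sample("dame una hoja de papel", "hoja", WINDOW)))
```

In this sketch, `WINDOW` and `alpha` correspond to the two factors the abstract says are studied, window size and parameter smoothing; varying them on held-out data is the kind of experiment the abstract describes, although the concrete setup here is hypothetical.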
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Andrés, J., Navarro, J.R., Juan, A., Casacuberta, F. (2005). Word Translation Disambiguation Using Multinomial Classifiers. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_76
DOI: https://doi.org/10.1007/11492542_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26154-4
Online ISBN: 978-3-540-32238-2
eBook Packages: Computer Science (R0)