Abstract
This work proposes a technique for inferring finite-state transducers, based on the formal relations between finite-state transducers and regular grammars. The technique consists of three steps: 1) building a corpus of training strings from the corpus of training pairs; 2) inferring a regular grammar from these strings; and 3) transforming the grammar into a finite-state transducer.
The proposed method was assessed through a series of experiments within the framework of the EUTRANS project.
This work has been partially funded by the European Union under grant IT-LTR-OS-30268.
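The three steps named in the abstract can be illustrated with a small sketch. The abstract does not say how training pairs are turned into strings or which grammar inference method is used, so everything below is an assumption made for illustration: each pair is taken to be already segmented monotonically into (input word, output segment) tuples, the regular grammar is a plain bigram model over those tuples, and the function names and toy sentences are hypothetical rather than taken from the paper.

```python
from collections import defaultdict

START, END = "<s>", "</s>"

def build_strings(pairs):
    """Step 1: turn each training pair into one string of extended symbols.

    Assumption: each pair is already segmented monotonically, i.e. given as
    a list of (input_word, output_segment) tuples.
    """
    return [[(word, tuple(seg)) for word, seg in pair] for pair in pairs]

def infer_bigram_grammar(strings):
    """Step 2: infer a regular grammar; here simply a bigram model (assumed).

    Returns the successor relation: state -> set of symbols that may follow.
    """
    successors = defaultdict(set)
    for string in strings:
        prev = START
        for sym in string:
            successors[prev].add(sym)
            prev = sym
        successors[prev].add(END)
    return successors

def to_transducer(successors):
    """Step 3: split each extended symbol back into input and output labels.

    Transitions: state -> input_word -> list of (output_segment, next_state).
    """
    transitions = defaultdict(lambda: defaultdict(list))
    finals = set()
    for state, syms in successors.items():
        for sym in syms:
            if sym == END:
                finals.add(state)
            else:
                word, segment = sym
                transitions[state][word].append((segment, sym))
    return transitions, finals

def translate(transducer, sentence):
    """Return one output accepted for `sentence`, or None if it is rejected."""
    transitions, finals = transducer
    # Each hypothesis is (state, output so far); the model may be ambiguous.
    hyps = [(START, ())]
    for word in sentence:
        hyps = [(nxt, out + seg)
                for state, out in hyps
                for seg, nxt in transitions.get(state, {}).get(word, [])]
    for state, out in hyps:
        if state in finals:
            return list(out)
    return None

# Hypothetical toy corpus: the empty segment delays output until the
# adjective is known, which keeps the segmentation monotone.
toy_pairs = [
    [("una", ["a"]), ("habitación", []), ("doble", ["double", "room"])],
    [("una", ["a"]), ("habitación", []), ("individual", ["single", "room"])],
]
toy_transducer = to_transducer(infer_bigram_grammar(build_strings(toy_pairs)))
print(translate(toy_transducer, ["una", "habitación", "individual"]))
# → ['a', 'single', 'room']
```

The sketch is unweighted and returns the first accepted hypothesis. A probabilistic grammar with smoothing would replace the successor sets with estimated transition probabilities, but that goes beyond what the abstract states.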
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Casacuberta, F. (2000). Inference of Finite-State Transducers by Using Regular Grammars and Morphisms. In: Oliveira, A.L. (ed.) Grammatical Inference: Algorithms and Applications. ICGI 2000. Lecture Notes in Computer Science, vol 1891. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45257-7_1
DOI: https://doi.org/10.1007/978-3-540-45257-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41011-9
Online ISBN: 978-3-540-45257-7
eBook Packages: Springer Book Archive