Abstract
Allomorfessor extends Morfessor, an unsupervised morpheme segmentation method, to account for allomorphy, the linguistic phenomenon in which one morpheme has several different surface forms. The method discovers common base forms for allomorphs from an unannotated corpus by finding small modifications, called mutations, that relate the surface forms to their base form. Using maximum a posteriori (MAP) estimation, the model decides the number and types of mutations needed for a particular language. In the Morpho Challenge 2009 evaluations, the effect of the mutations was found to be rather small. Nevertheless, Allomorfessor performed well overall: it achieved the best results for English in the linguistic evaluation and placed in the top three in the application evaluations for all languages.
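To make the idea of a mutation concrete, the following is a minimal illustrative sketch, not the representation used in Allomorfessor itself. The abstract does not specify how mutations are encoded, so the `Mutation` class, its `delete` and `append` fields, and the `surface_form` helper are all hypothetical names. The sketch assumes a mutation edits only the end of a base form before a suffix is attached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical mutation: a small edit applied to the end of a base form."""

    delete: int = 0   # number of trailing characters to remove (assumed encoding)
    append: str = ""  # characters added after the deletion (assumed encoding)

    def apply(self, base: str) -> str:
        if self.delete > len(base):
            raise ValueError("mutation deletes more characters than the base form has")
        return base[: len(base) - self.delete] + self.append


def surface_form(base: str, mutation: Mutation, suffix: str) -> str:
    """Build a surface word form from a base form, a mutation, and a suffix."""
    return mutation.apply(base) + suffix


if __name__ == "__main__":
    # "pretty" + "er": the final "y" is replaced by "i" before the suffix.
    print(surface_form("pretty", Mutation(delete=1, append="i"), "er"))
    # "believe" + "ing": the final "e" is dropped before the suffix.
    print(surface_form("believe", Mutation(delete=1), "ing"))
    # "walk" + "ed": the empty mutation leaves the base form unchanged.
    print(surface_form("walk", Mutation(), "ed"))
```

Under this assumed encoding, the allomorphs "pretti" and "believ" share the base forms "pretty" and "believe", so an analysis can reuse one lexicon entry per morpheme instead of storing each surface variant separately.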
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Virpioja, S., Kohonen, O., Lagus, K. (2010). Unsupervised Morpheme Analysis with Allomorfessor. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_73
DOI: https://doi.org/10.1007/978-3-642-15754-7_73
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15753-0
Online ISBN: 978-3-642-15754-7
eBook Packages: Computer Science (R0)