Learning multilingual morphology with Clog

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1446)

Abstract

The paper presents the decision list learning system Clog and the results of using it to learn nominal inflections of English, Romanian, Czech, Slovene, and Estonian. The dataset used to induce rules for the synthesis and analysis of the inflectional paradigms of nouns and adjectives in these languages is the MULTEXT-East multilingual tagged corpus. The ILP system FOIDL is also applied to the same dataset, and the paper compares the induction methodology and results of the two systems. The experiment shows that the accuracy of the two systems is comparable when they are trained on the same training set. However, while FOIDL is, for efficiency reasons, severely limited in the size of the training set it can handle, Clog does not suffer from such limitations. With the larger training sets that Clog makes possible, it significantly outperforms FOIDL and learns highly accurate morphological rules.
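
To make the notion of a decision list for inflection concrete, the following is a minimal illustrative sketch in Python, not Clog's or FOIDL's actual rule format or learning algorithm. It only shows how an ordered list of suffix-rewrite rules can synthesize an inflected form, with the first matching rule winning. The rule set `PLURAL_RULES` and the function `synthesize` are hypothetical names and hand-written examples, not rules induced from the MULTEXT-East data.

```python
from typing import List, Optional, Tuple

# A rule is (lemma_suffix, form_suffix): if the lemma ends with
# lemma_suffix, that suffix is replaced by form_suffix.
Rule = Tuple[str, str]

# Hypothetical, hand-written rules for English noun plurals.
# Order matters: specific exceptions come before general rules,
# and the empty-suffix rule at the end acts as the default.
PLURAL_RULES: List[Rule] = [
    ("man", "men"),  # exception: woman -> women
    ("y", "ies"),    # city -> cities (would over-apply to "day")
    ("s", "ses"),    # bus -> buses
    ("x", "xes"),    # box -> boxes
    ("", "s"),       # default: dog -> dogs
]


def synthesize(lemma: str, rules: List[Rule]) -> Optional[str]:
    """Return the form given by the first rule whose suffix matches."""
    for old, new in rules:
        if lemma.endswith(old):
            # Strip the matched suffix, then append the new one.
            return lemma[: len(lemma) - len(old)] + new
    return None  # no rule applies (cannot happen with a default rule)


if __name__ == "__main__":
    for word in ["woman", "city", "box", "dog"]:
        print(word, "->", synthesize(word, PLURAL_RULES))
```

Because only the first matching rule fires, exceptions can sit ahead of the general cases they override. A learner of such lists has to induce both the rules and their order from example pairs, a step this sketch does not attempt.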

Editor information

David Page

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Manandhar, S., Džeroski, S., Erjavec, T. (1998). Learning multilingual morphology with Clog. In: Page, D. (eds) Inductive Logic Programming. ILP 1998. Lecture Notes in Computer Science, vol 1446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0027317

  • DOI: https://doi.org/10.1007/BFb0027317

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64738-6

  • Online ISBN: 978-3-540-69059-7

  • eBook Packages: Springer Book Archive
