An Estonian Morphological Analyser and the Impact of a Corpus on Its Development

Computers and the Humanities

Abstract

The paper describes a morphological analyser for Estonian and how using a text corpus influenced the process of creating it and the resulting program itself. The influence is not limited to the lexicon, but is also noticeable in the resulting algorithm and implementation. When work on the analyser began, there was no computational treatment of Estonian derivatives and compounds. After some cycles of development and testing on the corpus, we came up with an acceptable algorithm for their treatment. Both the morphological analyser and the speller based on it have been successfully marketed.

Cite this article

Kaalep, HJ. An Estonian Morphological Analyser and the Impact of a Corpus on Its Development. Computers and the Humanities 31, 115–133 (1997). https://doi.org/10.1023/A:1000668108369

  • DOI: https://doi.org/10.1023/A:1000668108369
