Edinburgh Research Archive

Methods for morphology learning in low(er)-resource scenarios

View/Open
Bergmanis2020_Redacted.pdf (1.491Mb)
Bergmanis2020.pdf (3.018Mb)
Date
25/06/2020
Author
Bergmanis, Toms
Abstract
A core issue that hampers the development and use of language technology for under-resourced and morphologically rich languages is data sparsity. In this work, we consider unsupervised morphological analysis and lemmatization, two linguistically motivated ways to combat problems with sparse data. Morphological analysis aims to represent words in terms of the smallest meaningful units of language, morphemes (e.g., acid +ify +ed), while lemmatization concerns individual relationships among words (e.g., walks, walking and walked are all forms of the lexeme walk). In this thesis, we focus on morphology learning in low-resource scenarios: we propose algorithms and methods that learn unsupervised morphological analysis and lemmatization with higher accuracy than previous work while having affordable training data requirements. Our unsupervised morphological analyzers have similar or better underlying morpheme accuracy than three strong baselines while inducing, on average, a 12.8% more compact representation of the data than the next-best system. Our lemmatizers reduce the training data requirements to raw character representations of wordforms in their immediate context, yet yield improvements (especially on unseen and ambiguous words) over systems that learn from complete morphologically annotated sentences.
URI
https://hdl.handle.net/1842/37115

http://dx.doi.org/10.7488/era/416
Collections
  • Informatics thesis and dissertation collection

Library & University Collections Home | University of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact