Deep Learning-Based Part-of-Speech Tagging of the Tigrinya Language

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1283)

Abstract

Deep neural networks have demonstrated great efficiency in many NLP tasks for various languages. Unfortunately, some resource-scarce languages, such as Tigrinya, still receive too little attention, so many NLP applications, such as part-of-speech tagging, are still in their early stages. Consequently, the main objective of this research is to offer effective part-of-speech tagging solutions for the Tigrinya language given a rather small training corpus.

In this paper, deep neural network (DNN) classifiers (i.e., Feed Forward Neural Network, Long Short-Term Memory (LSTM), Bidirectional LSTM, and Convolutional Neural Network) are investigated by applying them on top of trained distributional neural word2vec embeddings. In search of the most accurate solutions, the DNN models are optimized both manually and automatically. Although automatic hyper-parameter optimization performs well with the Convolutional Neural Network, the manually optimized Bidirectional Long Short-Term Memory method achieves the highest overall accuracy, equal to 0.91.
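
As an illustration of the evaluation measure mentioned in the abstract, the following is a minimal sketch of token-level tagging accuracy, i.e., the fraction of tokens whose predicted tag matches the gold tag. The abstract does not state how accuracy was computed, so the token-level definition is an assumption; the function name tagging_accuracy, the tag labels, and the toy sentences are hypothetical and not taken from the paper or its corpus.

```python
from typing import Sequence


def tagging_accuracy(
    gold: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_tags, pred_tags in zip(gold, predicted):
        if len(gold_tags) != len(pred_tags):
            raise ValueError("each sentence needs exactly one predicted tag per token")
        # Count the positions where the predicted tag agrees with the gold tag.
        correct += sum(g == p for g, p in zip(gold_tags, pred_tags))
        total += len(gold_tags)

    if total == 0:
        raise ValueError("cannot compute accuracy on an empty corpus")
    return correct / total


# Hypothetical toy data (assumption): two tagged sentences, one tagging error.
gold = [["N", "V", "ADJ"], ["PRO", "V"]]
predicted = [["N", "V", "N"], ["PRO", "V"]]

accuracy = tagging_accuracy(gold, predicted)
print(f"{accuracy:.2f}")  # 4 of the 5 tags match
```

Under this definition the result is a fraction between 0 and 1, so a score of 0.91 would mean that roughly 91 of every 100 tokens receive the correct tag.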

Notes

  1. Available at http://crubadan.org/languages/ti; word list compiled by Biniam Gebremichael's web crawler, available at http://www.cs.ru.nl/biniam/geez/crawl.php.

  2. Available at https://eng.jnlp.org/yemane/ntigcorpus.

  3. The plot_model function in Keras was used to represent this and subsequent models.

Author information

Corresponding author

Correspondence to Senait Gebremichael Tesfagergish.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tesfagergish, S.G., Kapociute-Dzikiene, J. (2020). Deep Learning-Based Part-of-Speech Tagging of the Tigrinya Language. In: Lopata, A., Butkienė, R., Gudonienė, D., Sukackė, V. (eds) Information and Software Technologies. ICIST 2020. Communications in Computer and Information Science, vol 1283. Springer, Cham. https://doi.org/10.1007/978-3-030-59506-7_29

  • DOI: https://doi.org/10.1007/978-3-030-59506-7_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59505-0

  • Online ISBN: 978-3-030-59506-7

  • eBook Packages: Computer Science, Computer Science (R0)
