Abstract
Parsing plays a significant role in many natural language processing (NLP) applications, whose efficiency depends on having an effective parser. This paper presents an Amharic sentence parser built on a base phrase chunker that groups syntactically correlated words at different levels. We use a hidden Markov model (HMM) to chunk base phrases, and incorrectly chunked phrases are pruned with rules. Parsing is then performed by taking the chunker's output as input: a bottom-up approach with a transformation algorithm is used to turn the chunker into a parser. A corpus collected from Amharic news outlets and books was used for training and testing, and the training and testing datasets were prepared using 10-fold cross-validation. Results on the test data showed an average parsing accuracy of 93.75%.
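The 10-fold cross-validation protocol mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say how sentences were assigned to folds, so the round-robin assignment here is an assumption, and the names `k_fold_splits` and `mean_accuracy` are hypothetical helpers introduced only for this example.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(items: Sequence[T], k: int = 10) -> List[Tuple[List[T], List[T]]]:
    """Return k (train, test) pairs; every item lands in exactly one test fold.

    Assumption: items are dealt to folds round-robin. The paper may have
    shuffled or grouped sentences differently.
    """
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    # Fold i receives items i, i + k, i + 2k, ...
    folds = [list(items[i::k]) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training data is everything outside the held-out fold.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, test))
    return splits


def mean_accuracy(per_fold_accuracy: Sequence[float]) -> float:
    """Average the accuracies measured on each held-out fold."""
    if not per_fold_accuracy:
        raise ValueError("need at least one fold accuracy")
    return sum(per_fold_accuracy) / len(per_fold_accuracy)
```

In use, a chunker and parser would be trained on each `train` list and scored on the matching `test` list, and the reported figure would be the mean of the ten per-fold scores.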
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ibrahim, A., Assabie, Y. (2014). Amharic Sentence Parsing Using Base Phrase Chunking. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_24
DOI: https://doi.org/10.1007/978-3-642-54906-9_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science, Computer Science (R0)