Abstract
Named Entity Recognition (NER) is an important task in Natural Language Processing (NLP) with a wide range of applications. Recently, word-embedding-based systems that do not rely on hand-crafted features have come to dominate the task, as is the case for many other sequence labeling tasks in NLP. However, we are also observing the emergence of hybrid models that make use of hand-crafted features through data augmentation to improve the performance of such NLP systems. Such hybrid systems are especially important for less-resourced languages such as Turkish, as deep learning models require a large dataset to achieve good performance. In this paper, we first give a detailed analysis of the effect of various syntactic, semantic and orthographic features on NER for Turkish. We also improve the performance of the best feature-based models for Turkish using additional features. We believe that our results will guide research in this area and help make use of the key features for data augmentation.
Acknowledgements
This work was partially supported by JST CREST Grant Number JPMJCR1402, JSPS KAKENHI Grant Numbers 17H01693, and 17K20023JST.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Akdemir, A., Güngör, T. (2019). A Detailed Analysis and Improvement of Feature-Based Named Entity Recognition for Turkish. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_2
DOI: https://doi.org/10.1007/978-3-030-26061-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3
eBook Packages: Computer Science, Computer Science (R0)