A Detailed Analysis and Improvement of Feature-Based Named Entity Recognition for Turkish

Akdemir, Arda; Güngör, Tunga

doi:10.1007/978-3-030-26061-3_2

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11658)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Named Entity Recognition (NER) is an important task in Natural Language Processing (NLP) with a wide range of applications. Recently, word-embedding-based systems that do not rely on hand-crafted features have come to dominate the task, as in many other sequence labeling tasks in NLP. However, we are also observing the emergence of hybrid models that make use of hand-crafted features through data augmentation to improve the performance of such NLP systems. Such hybrid systems are especially important for less-resourced languages such as Turkish, as deep learning models require a large dataset to achieve good performance. In this paper, we first give a detailed analysis of the effect of various syntactic, semantic and orthographic features on NER for Turkish. We also improve the performance of the best feature-based models for Turkish using additional features. We believe that our results will guide research in this area and help make use of the key features for data augmentation.
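
To make the notion of orthographic features concrete, the following is a minimal sketch of how surface-form cues might be extracted per token before being passed to a sequence labeler. The abstract does not list the paper's actual feature set, so the function names (`orthographic_features`, `sentence_features`) and the specific features chosen here are illustrative assumptions, not the authors' method.

```python
from typing import Dict, List


def orthographic_features(token: str, position: int) -> Dict[str, object]:
    """Return simple surface-form features for one token (illustrative only)."""
    # In Turkish orthography, suffixes attached to proper nouns are set off
    # with an apostrophe (e.g. "Ankara'ya"), so the part before it is a
    # rough proxy for the bare name.
    stem = token.split("'", 1)[0]
    return {
        "is_capitalized": token[:1].isupper(),
        "is_all_caps": token.isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "has_apostrophe": "'" in token,
        "is_sentence_initial": position == 0,
        "stem_length": len(stem),
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Build one feature dictionary per token in a tokenized sentence."""
    return [orthographic_features(tok, i) for i, tok in enumerate(tokens)]
```

Calling `sentence_features(["Dün", "Ankara'ya", "gittim"])` yields one dictionary per token. A feature-based model would typically combine such cues with syntactic and semantic features, which are not sketched here.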

Acknowledgements

This work was partially supported by JST CREST Grant Number JPMJCR1402, JSPS KAKENHI Grant Numbers 17H01693, and 17K20023JST.

Author information

Authors and Affiliations

University of Tokyo, Tokyo, Japan
Arda Akdemir
Bogazici University, Istanbul, Turkey
Tunga Güngör

Corresponding author

Correspondence to Arda Akdemir.

Editor information

Editors and Affiliations

Utrecht University, Utrecht, The Netherlands
Albert Ali Salah
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

Cite this paper

Akdemir, A., Güngör, T. (2019). A Detailed Analysis and Improvement of Feature-Based Named Entity Recognition for Turkish. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_2

DOI: https://doi.org/10.1007/978-3-030-26061-3_2
Published: 24 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26060-6
Online ISBN: 978-3-030-26061-3
eBook Packages: Computer Science (R0)
