Abstract
In this paper, we present a Named Entity Recognition system for tweets and evaluate baseline classifiers. Tweets are informal, noisy texts containing emoticons and abbreviations, which significantly degrade classifier performance. We describe the dataset format and the feature set, and we evaluate and test each classifier on different combinations of features to identify the most representative feature set. Our experimental results show that the presented system reaches 72% precision, 69% recall and 69% F1 (micro average).
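To make the reported metrics concrete, the following is a minimal sketch of how micro-averaged precision, recall and F1 can be computed over token labels. It is an illustration, not the authors' evaluation code: the function name `micro_prf`, the label names, the choice to score at token level and the choice to exclude the non-entity tag `"O"` are all assumptions.

```python
def micro_prf(gold, pred, ignore=("O",)):
    """Micro-averaged precision, recall and F1 over token labels.

    True positives, false positives and false negatives are pooled across
    all entity classes before the ratios are taken. Labels listed in
    `ignore` (assumed here to be the non-entity tag "O") do not count as
    entity predictions.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p not in ignore:
            if p == g:
                tp += 1  # predicted an entity label and it matches
            else:
                fp += 1  # predicted an entity label that is wrong
        if g not in ignore and p != g:
            fn += 1      # a gold entity label was missed or mislabeled
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical labels for a six-token tweet.
gold = ["PER", "O", "LOC", "O", "ORG", "O"]
pred = ["PER", "LOC", "O", "O", "ORG", "ORG"]
precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Because the counts are pooled before dividing, frequent entity classes weigh more heavily in the result than rare ones, which is the defining property of micro averaging.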
Acknowledgements
The authors gratefully acknowledge the support of the Galatasaray University scientific research support program under grant #18.401.002.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Büyüktopaç, O., Acarman, T. (2020). Evaluation of a Feature Set with Word Embeddings to Improve Named Entity Recognition on Tweets. In: Burduk, R., Kurzynski, M., Wozniak, M. (eds) Progress in Computer Recognition Systems. CORES 2019. Advances in Intelligent Systems and Computing, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-19738-4_26
DOI: https://doi.org/10.1007/978-3-030-19738-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19737-7
Online ISBN: 978-3-030-19738-4
eBook Packages: Intelligent Technologies and Robotics (R0)