Abstract
This paper presents a named entity recognition method that finds predetermined entities in unstructured text. The method maps those entities onto words in the text using word similarities based on typical word transformations (lemmatization and stemming), word embeddings, and character-level similarity. The approach itself is language independent, although the components used for lemmatization, stemming and word embedding are language dependent, and it works on any given set of entities. Special attention is given to entities that are organized hierarchically through the hypernymy-hyponymy relation. The proposed method has the following advantages: it finds the normalized form of the recognized entity name; it is easy to adjust to a new domain; it respects the hierarchical organization of entities; and, because of its modular design, it can be improved continually simply by updating the components for lemmatization, stemming or word embedding. The method was evaluated on a test set of tourist queries and hierarchical entities collected from the Slovenia.info tourist portal.
Partially supported by Joint cooperation programme V-A Interreg Slovenia-Austria, project AS-IT-IC.
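The matching idea outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gazetteer contents, the `toy_stem` helper, the fixed score of 0.95 for a stem match, and the 0.8 threshold are all hypothetical choices made for illustration. The word-embedding component is omitted, and character-level similarity is approximated with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical toy gazetteer: entity -> parent (hypernym), None for roots.
GAZETTEER = {
    "accommodation": None,
    "hotel": "accommodation",
    "camping": "accommodation",
    "attraction": None,
    "castle": "attraction",
    "cave": "attraction",
}


def toy_stem(word):
    """Crude plural stripper standing in for a real stemmer or lemmatizer."""
    word = word.lower()
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def similarity(word, entity):
    """Combine exact match, stem match and character-level similarity."""
    w, e = word.lower(), entity.lower()
    if w == e:
        return 1.0
    if toy_stem(w) == toy_stem(e):
        return 0.95  # assumed score for a match after stemming
    return SequenceMatcher(None, w, e).ratio()


def hypernym_chain(entity):
    """Return the ancestors of an entity, nearest parent first."""
    chain = []
    parent = GAZETTEER[entity]
    while parent is not None:
        chain.append(parent)
        parent = GAZETTEER[parent]
    return chain


def recognize(text, threshold=0.8):
    """Map each token to its most similar gazetteer entity, if close enough.

    Returns (token, normalized entity, hypernym chain) triples.
    """
    results = []
    for raw in text.split():
        token = raw.strip(".,;:!?")
        if not token:
            continue
        best = max(GAZETTEER, key=lambda entity: similarity(token, entity))
        if similarity(token, best) >= threshold:
            results.append((token, best, hypernym_chain(best)))
    return results


if __name__ == "__main__":
    for match in recognize("Any castles or hotell near the caves?"):
        print(match)
```

Each recognized token is returned together with the normalized entity name and its hypernyms, so a misspelled "hotell" resolves to "hotel" under "accommodation". In a fuller version, the similarity function would be the natural place to plug in a real lemmatizer, stemmer or embedding model.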
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Štravs, M., Zupančič, J. (2019). Named Entity Recognition Using Gazetteer of Hierarchical Entities. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_65
DOI: https://doi.org/10.1007/978-3-030-22999-3_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22998-6
Online ISBN: 978-3-030-22999-3
eBook Packages: Computer Science, Computer Science (R0)