Abstract
This paper addresses the problem of identifying first names and last names in a news collection. The approach is based on corpus investigation and is language independent. At the core of the system is a name classifier that relies on the values of several parameters. In its most general form, name category identification is not an easy task: the hardest cases are ambiguous tokens, which can be either a first or a last name, and tokens that occur only once. Nevertheless, the system predicts the name category with high accuracy. The experiments were run on an Italian newspaper, and the evaluation was carried out on I-CAB.
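The abstract does not say which parameters the classifier uses. As an illustration only, the sketch below assumes a single hypothetical parameter: how often a token appears in the first versus the last position of multi-token person mentions in a corpus. The function name `classify_name_tokens`, the thresholds `min_count` and `threshold`, and the sample mentions are all assumptions made for this example and are not taken from the paper. The sketch shows why the two hard cases arise: tokens seen in both positions and tokens seen only once.

```python
from collections import Counter


def classify_name_tokens(mentions, min_count=2, threshold=0.8):
    """Label each token as 'first', 'last', 'ambiguous', or 'rare'.

    Hypothetical positional heuristic, not the paper's classifier:
    a token is 'first' (or 'last') when at least `threshold` of its
    occurrences fall in the first (or last) position of a mention.
    """
    first_pos = Counter()
    last_pos = Counter()
    for mention in mentions:
        tokens = mention.split()
        if len(tokens) < 2:
            continue  # a lone token gives no positional evidence
        # Only the outermost tokens are counted; middle tokens are ignored.
        first_pos[tokens[0]] += 1
        last_pos[tokens[-1]] += 1

    labels = {}
    for token in set(first_pos) | set(last_pos):
        total = first_pos[token] + last_pos[token]
        if total < min_count:
            labels[token] = "rare"  # too few occurrences to decide
        elif first_pos[token] / total >= threshold:
            labels[token] = "first"
        elif last_pos[token] / total >= threshold:
            labels[token] = "last"
        else:
            labels[token] = "ambiguous"  # seen in both positions
    return labels


# Invented sample mentions, for illustration only.
sample = ["Paolo Rossi", "Maria Rossi", "Maria Bianchi",
          "Franco Paolo", "Luca Franco"]
labels = classify_name_tokens(sample)
```

With these invented mentions, a token such as "Paolo" appears once in each position and is left as ambiguous, while a token seen only once, such as "Luca", is left undecided. A realistic system would need additional evidence for both cases.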
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Popescu, O., Magnini, B. (2008). Language Independent First and Last Name Identification in Person Names. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_27
DOI: https://doi.org/10.1007/978-3-540-78135-6_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78134-9
Online ISBN: 978-3-540-78135-6
eBook Packages: Computer Science, Computer Science (R0)