Abstract
This article describes a finite-state cascade for extracting person names from French texts. We extract these proper names in order to categorize and cluster texts. After finite-state pre-processing (sentence segmentation, dictionary-based tagging, etc.), a series of finite-state transducers is applied to the text one after the other, locating the left and right contexts that indicate the presence of a person name. An evaluation of the extraction results is presented.
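The idea of a cascade that tags names from their left and right contexts can be illustrated with a minimal sketch. It is not the authors' implementation: regular expressions stand in for the finite-state transducers, and the trigger list `TITLES`, the pattern `CAP`, the `<person>` tag, and the function names `run_cascade` and `extract_persons` are all hypothetical choices made for this example.

```python
import re

# Hypothetical left-context triggers; the paper's actual dictionaries are not shown here.
TITLES = r"(?:M\.|Mme|Mlle|Dr)"
# A capitalised word, used as a rough stand-in for a name token (assumption).
CAP = r"[A-ZÀ-Ý][\w\-]+"

# Each stage is (pattern, replacement). Stages run in order, and each one
# reads the output of the previous one, which is what makes this a cascade.
CASCADE = [
    # Stage 1, left context: a title followed by one or more capitalised words.
    (re.compile(rf"\b({TITLES}) ((?:{CAP})(?: {CAP})*)"),
     r"\1 <person>\2</person>"),
    # Stage 2, right context: capitalised words followed by ", <age> ans".
    (re.compile(rf"\b((?:{CAP})(?: {CAP})*)(?=, \d+ ans)"),
     r"<person>\1</person>"),
]


def run_cascade(text: str) -> str:
    """Apply every stage in turn and return the tagged text."""
    for pattern, replacement in CASCADE:
        text = pattern.sub(replacement, text)
    return text


def extract_persons(tagged: str) -> list[str]:
    """Collect the spans that the cascade marked as person names."""
    return re.findall(r"<person>(.*?)</person>", tagged)


if __name__ == "__main__":
    sentence = "M. Jacques Chirac a rencontré Lionel Jospin, 64 ans, à Paris."
    print(extract_persons(run_cascade(sentence)))
    # prints ['Jacques Chirac', 'Lionel Jospin']
```

In this sketch the first name is found through its left context (the title "M.") and the second through its right context (the age apposition), while "Paris" is left untagged because neither context is present.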
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Friburger, N., Maurel, D. (2002). Finite-State Transducer Cascade to Extract Proper Names in Texts. In: Watson, B.W., Wood, D. (eds) Implementation and Application of Automata. CIAA 2001. Lecture Notes in Computer Science, vol 2494. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36390-4_10
DOI: https://doi.org/10.1007/3-540-36390-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00400-4
Online ISBN: 978-3-540-36390-3