Abstract
This paper describes and evaluates a linguistically based named entity recognition (NER) system for Portuguese that draws on lexico-semantic information, pattern matching, and morphosyntactic, context-driven Constraint Grammar rules. Preliminary F-scores on cross-domain news texts, distinguishing six name types, were 91.85 (raw) and 93.6 (subtyping of ready-chunked proper nouns).
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bick, E. (2003). Multi-level NER for Portuguese in a CG Framework. In: Mamede, N.J., Trancoso, I., Baptista, J., das Graças Volpe Nunes, M. (eds) Computational Processing of the Portuguese Language. PROPOR 2003. Lecture Notes in Computer Science, vol 2721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45011-4_18
DOI: https://doi.org/10.1007/3-540-45011-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40436-1
Online ISBN: 978-3-540-45011-5
eBook Packages: Springer Book Archive