Abstract
This paper describes an approach to extracting proper nouns from very large text corpora without using a lexicon or cue-word dictionary. First, we learn patterns for extracting proper nouns by applying an initial set of proper names to an unannotated corpus, that is, one that does not yet have any tags. We then repeatedly apply the pattern templates to the corpus to extract new proper nouns until a certain period is reached.
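The seed-and-iterate loop described above can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the form of the pattern templates or the stopping criterion, so the sketch assumes a pattern is simply the pair of tokens immediately to the left and right of a known name, and it models "a certain period" as a fixed round limit. The names `context`, `learn_patterns`, `apply_patterns`, `bootstrap`, `min_count`, and `max_rounds` are all hypothetical, as is the toy corpus.

```python
from collections import Counter

def context(tokens, i):
    """Return the (left, right) neighbours of tokens[i], padded at sentence edges."""
    left = tokens[i - 1] if i > 0 else "<s>"
    right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return left, right

def learn_patterns(sentences, names, min_count=2):
    """Assumed pattern form: contexts seen around known names at least min_count times."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in names:
                counts[context(tokens, i)] += 1
    return {pattern for pattern, n in counts.items() if n >= min_count}

def apply_patterns(sentences, patterns, names):
    """Collect tokens that are not yet known names but occur in a learned context."""
    found = set()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in names and context(tokens, i) in patterns:
                found.add(tok)
    return found

def bootstrap(sentences, seeds, max_rounds=5, min_count=2):
    """Alternate pattern learning and extraction; stop early when nothing new is found."""
    names = set(seeds)
    for _ in range(max_rounds):  # assumed stand-in for the paper's stopping period
        patterns = learn_patterns(sentences, names, min_count)
        new_names = apply_patterns(sentences, patterns, names)
        if not new_names:
            break
        names |= new_names
    return names

# Toy, untagged corpus (hypothetical data, already tokenized).
corpus = [
    "mr kim said hello".split(),
    "mr lee said goodbye".split(),
    "mr park said nothing".split(),
    "mr choi said yes".split(),
    "the dog said woof".split(),
]
print(sorted(bootstrap(corpus, {"kim", "lee"})))  # ['choi', 'kim', 'lee', 'park']
```

In this toy run the two seed names yield the context ("mr", "said"), which then picks up "park" and "choi" but not "dog". A realistic system would need some filtering or scoring of patterns to limit the drift that accumulates when wrong names are fed back into later rounds.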
This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Advanced Information Technology Research Center (AITrc).
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kang, SS., Woo, CW. (2003). Unsupervised Learning of Pattern Templates from Unannotated Corpora for Proper Noun Extraction. In: Wang, G., Liu, Q., Yao, Y., Skowron, A. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2003. Lecture Notes in Computer Science, vol 2639. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39205-X_103
DOI: https://doi.org/10.1007/3-540-39205-X_103
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-14040-5
Online ISBN: 978-3-540-39205-7
eBook Packages: Springer Book Archive