Named Entity Recognition Using Acyclic Weighted Digraphs: A Semi-supervised Statistical Method

Kim, Kono; Yoon, Yeohoon; Kim, Harksoo; Seo, Jungyun

doi:10.1007/978-3-540-71701-0_60

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

We propose an NE (named entity) recognition system that uses a semi-supervised statistical method. At training time, the system builds error-prone training data using only a conventional POS (part-of-speech) tagger and a semi-automatically constructed NE dictionary. It then generates a co-occurrence similarity matrix from this error-prone training corpus. At run time, the system constructs AWDs (acyclic weighted digraphs) based on the co-occurrence similarity matrix, then detects NE candidates and assigns categories to them using Viterbi search on the AWDs. In preliminary experiments on PLO (person, location, and organization) recognition, the proposed system achieved an average F1-measure of 81.32%.
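The Viterbi search step can be illustrated with a minimal sketch of best-path search over an acyclic weighted digraph. The abstract does not say how nodes and edge weights are derived from the co-occurrence similarity matrix, so the graph below is a hypothetical toy example, and the function name `best_path`, the integer node numbering, and the additive scoring are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def best_path(num_nodes, edges, source, sink):
    """Return (score, path) for the highest-scoring source-to-sink path.

    Assumption: nodes are integers 0..num_nodes-1 already in topological
    order, so every edge (u, v, weight) satisfies u < v.
    """
    incoming = defaultdict(list)
    for u, v, w in edges:
        if not u < v:
            raise ValueError("edges must go from a lower to a higher node index")
        incoming[v].append((u, w))

    neg_inf = float("-inf")
    score = [neg_inf] * num_nodes  # best score found for reaching each node
    back = [None] * num_nodes      # predecessor on the best path to each node
    score[source] = 0.0

    # Dynamic programming in topological order: each node keeps only the
    # best-scoring way of reaching it, as in Viterbi decoding.
    for v in range(num_nodes):
        for u, w in incoming[v]:
            if score[u] != neg_inf and score[u] + w > score[v]:
                score[v] = score[u] + w
                back[v] = u

    if score[sink] == neg_inf:
        return neg_inf, []  # sink is unreachable from source

    # Follow back-pointers from the sink to recover the path.
    path = [sink]
    while path[-1] != source:
        path.append(back[path[-1]])
    path.reverse()
    return score[sink], path


# Hypothetical toy graph: node 0 is the start, node 4 is the end.
toy_edges = [
    (0, 1, 1.0),
    (0, 2, 2.0),
    (1, 3, 5.0),
    (2, 3, 1.0),
    (3, 4, 0.5),
]
toy_score, toy_path = best_path(5, toy_edges, source=0, sink=4)
```

In an NE setting, each node might stand for a candidate (token span, category) pair and each weight for a similarity-derived score, but that mapping is a guess based on the abstract alone. Because the graph is acyclic, a single pass in topological order suffices, and the running time is linear in the number of edges.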

Author information

Authors and Affiliations

Natural Language Processing Laboratory, Department of Computer Science, Sogang University, 1 Sinsu-dong, Mapo-gu, Seoul, 121-742, Korea
Kono Kim
NHN Corporation, Venture Town Bldg., 25-2 Jeongja-dong, Bundang-gu, Seongnam-City, Gyeonggi-do, 463-844, Korea
Yeohoon Yoon
Program of Computer and Communications Engineering, College of Information Technology, Kangwon National University, 192-1, Hyoja 2(i)-dong, Chuncheon-si, Gangwon-do, 200-701, Korea
Harksoo Kim
Department of Computer Science and Interdisciplinary Program of Integrated Biotechnology, Sogang University, 1 Sinsu-dong, Mapo-gu, Seoul, 121-742, Korea
Jungyun Seo

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

About this paper

Cite this paper

Kim, K., Yoon, Y., Kim, H., Seo, J. (2007). Named Entity Recognition Using Acyclic Weighted Digraphs: A Semi-supervised Statistical Method. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_60

DOI: https://doi.org/10.1007/978-3-540-71701-0_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)
