Named Entity Recognition Using Acyclic Weighted Digraphs: A Semi-supervised Statistical Method

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

We propose a named entity (NE) recognition system based on a semi-supervised statistical method. At training time, the system builds error-prone training data using only a conventional part-of-speech (POS) tagger and a semi-automatically constructed NE dictionary. It then generates a co-occurrence similarity matrix from this error-prone training corpus. At run time, the system constructs acyclic weighted digraphs (AWDs) based on the co-occurrence similarity matrix, then detects NE candidates and assigns categories to them by Viterbi search on the AWDs. In preliminary experiments on person, location, and organization (PLO) recognition, the proposed system achieved an average F1-measure of 81.32%.
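The run-time step can be illustrated with a minimal sketch. The abstract does not say how the AWDs are built or weighted, so everything below is an assumption made for illustration: the layered graph with one node per (token, category) pair, the `best_path` function, the `SIM` and `TRANS` tables, and all numeric scores are hypothetical stand-ins for the co-occurrence similarity matrix, not the authors' method.

```python
def best_path(num_tokens, labels, node_score, edge_score):
    """Viterbi search over a layered acyclic weighted digraph.

    Layer i holds one node per label for token i. Edges run only from
    layer i to layer i + 1, so the graph contains no cycles.
    Returns the label sequence with the highest total weight.
    """
    # best[i][y]: highest total weight of any path ending at node (i, y)
    best = [{y: node_score(0, y) for y in labels}]
    # back[i][y]: label at layer i - 1 on that best path
    back = [{}]
    for i in range(1, num_tokens):
        best.append({})
        back.append({})
        for y in labels:
            prev, score = max(
                ((p, best[i - 1][p] + edge_score(p, y)) for p in labels),
                key=lambda pair: pair[1],
            )
            best[i][y] = score + node_score(i, y)
            back[i][y] = prev
    # Trace back from the best-scoring node in the last layer.
    path = [max(labels, key=lambda y: best[-1][y])]
    for i in range(num_tokens - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Hypothetical toy data: "O" marks a token outside any named entity.
TOKENS = ["Kim", "visited", "Seoul"]
LABELS = ["PER", "LOC", "ORG", "O"]
# Assumed token-to-category similarity scores (node weights).
SIM = {("Kim", "PER"): 0.9, ("Kim", "O"): 0.1, ("visited", "O"): 0.8,
       ("Seoul", "LOC"): 0.7, ("Seoul", "ORG"): 0.2}
# Assumed category-to-category scores (edge weights).
TRANS = {("PER", "O"): 0.5, ("O", "LOC"): 0.4}


def node_score(i, label):
    return SIM.get((TOKENS[i], label), 0.0)


def edge_score(prev_label, label):
    return TRANS.get((prev_label, label), 0.0)


tags = best_path(len(TOKENS), LABELS, node_score, edge_score)
print(list(zip(TOKENS, tags)))
```

Because every edge points forward from one token to the next, a single left-to-right pass visits each node once, which is what makes Viterbi search applicable to an acyclic graph of this kind.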

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Kim, K., Yoon, Y., Kim, H., Seo, J. (2007). Named Entity Recognition Using Acyclic Weighted Digraphs: A Semi-supervised Statistical Method. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_60

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)
