Bootstrapping and Rule-Based Model for Recognizing Vietnamese Named Entity

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2014)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8398))

Abstract

This paper addresses the problem of Vietnamese named entity recognition and classification (VNER) using a bootstrapping algorithm and a rule-based model. The rule-based model relies on contextual rules that provide evidence that a Vietnamese named entity (VNE) belongs to a category. These rules, which exploit the linguistic constraints of a category, are constructed by the bootstrapping algorithm. The algorithm starts with a handful of seed VNEs of a given category and accumulates all contextual rules found around these seeds in a large corpus. The rules are then ranked and used to find new VNEs.
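The loop described above (seeds, contextual rules collected around them, ranking, then extraction of new entities) can be illustrated with a minimal sketch. This is not the authors' implementation: the rule form (the single token to the left of an entity), the scoring by count of distinct known entities, and the names `bootstrap`, `min_support`, and `top_rules` are all assumptions made for illustration, and the toy corpus is invented and pre-segmented with underscores.

```python
from collections import defaultdict


def bootstrap(sentences, seeds, iterations=2, top_rules=2, min_support=2):
    """Grow an entity lexicon for one category from a few seed entities.

    Assumption: a "rule" is just the token immediately to the left of an
    entity, a deliberately simplified stand-in for contextual rules.
    """
    entities = set(seeds)
    rules = set()
    for _ in range(iterations):
        # 1. Accumulate contexts found around the currently known entities.
        support = defaultdict(set)  # rule -> distinct known entities it precedes
        for tokens in sentences:
            for left, word in zip(tokens, tokens[1:]):
                if word in entities:
                    support[left].add(word)
        # 2. Rank rules by how many distinct known entities they precede,
        #    and keep only the best ones that have enough support.
        ranked = sorted(support, key=lambda r: (-len(support[r]), r))
        rules.update(r for r in ranked[:top_rules]
                     if len(support[r]) >= min_support)
        # 3. Apply the accepted rules to the corpus to find new entities.
        for tokens in sentences:
            for left, word in zip(tokens, tokens[1:]):
                if left in rules:
                    entities.add(word)
    return entities, rules


# Hypothetical toy corpus for a "city" category.
sentences = [
    ["thành_phố", "Hà_Nội", "rất", "đẹp"],
    ["thành_phố", "Đà_Nẵng", "có", "biển"],
    ["tỉnh", "Nghệ_An", "ở", "miền", "Trung"],
    ["thành_phố", "Huế", "cổ_kính"],
    ["ông", "Nam", "sống", "ở", "Huế"],
]
seeds = {sentences[0][1], sentences[1][1]}  # the two seed city names

entities, rules = bootstrap(sentences, seeds)
print(sorted(entities), sorted(rules))
```

In this sketch the `min_support` threshold plays the role of rule ranking: a context seen with only one known entity (such as the preposition in the last sentence) is not trusted, which limits drift toward words that are not entities of the category.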

Our experimental corpus is built from about 250,034 online news articles and over 9,000 literary works. Our VNER system covers 27 categories, and more than 300,000 VNEs have been recognized and categorized. The accuracy of the recognition and classification algorithm is about 95%.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Le Trung, H., Le Anh, V., Le Trung, K. (2014). Bootstrapping and Rule-Based Model for Recognizing Vietnamese Named Entity. In: Nguyen, N.T., Attachoo, B., Trawiński, B., Somboonviwat, K. (eds) Intelligent Information and Database Systems. ACIIDS 2014. Lecture Notes in Computer Science, vol 8398. Springer, Cham. https://doi.org/10.1007/978-3-319-05458-2_18

  • DOI: https://doi.org/10.1007/978-3-319-05458-2_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05457-5

  • Online ISBN: 978-3-319-05458-2

  • eBook Packages: Computer Science, Computer Science (R0)
