On active annotation for named entity recognition

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

A major constraint on applying machine learning techniques to information extraction problems is the availability of a sufficient number of training examples, which are costly and laborious to prepare. Active learning techniques select informative instances from the unlabeled data and add them to the training set in such a way that overall classification performance improves. In the random sampling approach, unlabeled data are selected for annotation at random, and thus the approach does not reliably yield the desired results. In contrast, active learning selects the useful data from a large pool of unlabeled documents. The strategies used often assign instances to incorrect classes. The classifier is confused between two classes when the test instance lies near the margin. We propose two methods for active learning and show that they improve performance. The first approach is based on a support vector machine (SVM), whereas the second is based on ensemble learning that exploits the classification capabilities of two well-known classifiers, namely SVM and conditional random field (CRF). The motivation for using these classifiers is that they are orthogonal in nature, so a combination of them can produce better results. To show the efficacy of the proposed approaches, we choose a crucial problem, namely named entity recognition (NER), in three languages: Bengali, Hindi and English. The approaches are also evaluated for NER in the biomedical domain. Evaluation results show that the proposed techniques yield considerable performance improvements.
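The idea that a classifier is confused when an instance lies near the margin can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the selection rule, so the sketch assumes that each token comes with one confidence score per class and that uncertainty is measured as the gap between the two highest scores. The names `margin`, `average_scores` and `select_uncertain`, the score values, and the averaging step for combining two classifiers are all hypothetical.

```python
from typing import Dict, List

# One token's confidence scores, keyed by class label (assumed input format).
Scores = Dict[str, float]


def margin(scores: Scores) -> float:
    """Gap between the two highest class scores; a small gap means confusion."""
    if len(scores) < 2:
        raise ValueError("need scores for at least two classes")
    top, second = sorted(scores.values(), reverse=True)[:2]
    return top - second


def average_scores(a: Scores, b: Scores) -> Scores:
    """Hypothetical ensemble step: average the per-class scores of two
    classifiers (for example an SVM and a CRF). Both inputs must share labels."""
    return {label: (a[label] + b[label]) / 2.0 for label in a}


def select_uncertain(pool: List[Scores], k: int) -> List[int]:
    """Return the indices of the k pool items with the smallest margin."""
    ranked = sorted(range(len(pool)), key=lambda i: margin(pool[i]))
    return ranked[:k]


# Made-up scores for three unlabeled tokens.
pool = [
    {"PER": 0.90, "LOC": 0.05, "O": 0.05},  # confident
    {"PER": 0.45, "LOC": 0.40, "O": 0.15},  # PER and LOC nearly tied
    {"PER": 0.10, "LOC": 0.30, "O": 0.60},  # moderately confident
]
picked = select_uncertain(pool, k=1)
```

Here the second token is picked for annotation, because its two best classes are nearly tied. In an active learning loop, the picked items would be labeled by an annotator, added to the training set, and the classifier retrained before the next round of selection.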

Notes

  1. Here by extraction we mean both recognition and classification.

  2. http://crfpp.sourceforge.net.

  3. http://chasen.org/~taku/software/yamcha/.

  4. http://cl.aist-nara.ac.jp/taku-ku/software/TinySVM.

  5. http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger.

  6. http://ltrc.iiit.ac.in/ner-ssea-08.

  7. http://research.nii.ac.jp/collier/workshops/JNLPBA04st.htm.

  8. http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/ERtask/report.html.

  9. We iterate the algorithm for more than 10 iterations as we observed performance improvement even in the 10th iteration.

Author information

Correspondence to Asif Ekbal or Sriparna Saha.

About this article

Cite this article

Ekbal, A., Saha, S. & Sikdar, U.K. On active annotation for named entity recognition. Int. J. Mach. Learn. & Cyber. 7, 623–640 (2016). https://doi.org/10.1007/s13042-014-0275-8

  • DOI: https://doi.org/10.1007/s13042-014-0275-8
