Biomedical named entity recognition using generalized expectation criteria

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Machine learning is difficult to apply in domains that lack labeled training data. Biomedical named entity recognition (NER) is one such domain, and it remains challenging because of its extraordinarily complex nomenclature. In this paper, we propose a semi-supervised method that trains conditional random field (CRF) models using generalized expectation (GE) criteria to address the biomedical NER problem. Instead of labeling instances, the method labels features to obtain training data, which saves considerable labeling time. A Latent Dirichlet Allocation (LDA) model is used to choose the features to label. Experimental results show that the proposed method can dramatically improve biomedical NER performance by incorporating unlabeled data through feature labeling.
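
As an illustration of the feature-labeling idea (a minimal sketch, not the authors' implementation), one common form of a generalized expectation term is the KL divergence between a target label distribution attached to a labeled feature and the model's average predicted label distribution over the tokens where that feature fires. The function name `ge_penalty`, the label names, and all probabilities below are hypothetical and chosen only for illustration.

```python
import math


def ge_penalty(target, token_marginals, eps=1e-12):
    """KL(target || model expectation) for a single labeled feature.

    target: dict mapping label -> target probability for this feature
            (assumed to be supplied by a human annotator).
    token_marginals: list of dicts mapping label -> model probability,
            one dict per token where the feature fires.
    """
    n = len(token_marginals)
    if n == 0:
        # The feature never fires in the unlabeled data: no constraint applies.
        return 0.0
    # Model expectation: average predicted label distribution over those tokens.
    expected = {
        label: sum(m.get(label, 0.0) for m in token_marginals) / n
        for label in target
    }
    # KL divergence; terms with zero target probability contribute nothing.
    return sum(
        p * math.log(p / max(expected[label], eps))
        for label, p in target.items()
        if p > 0
    )


# Hypothetical labeled feature: "token ends in -ase" suggests PROTEIN.
target = {"PROTEIN": 0.9, "O": 0.1}
uncertain_model = [{"PROTEIN": 0.5, "O": 0.5}, {"PROTEIN": 0.5, "O": 0.5}]
agreeing_model = [{"PROTEIN": 0.9, "O": 0.1}]

print(ge_penalty(target, uncertain_model))  # positive: model disagrees with target
print(ge_penalty(target, agreeing_model))   # zero: model matches target
```

In a semi-supervised setup of this kind, a term like this would be summed over all labeled features and subtracted from the training objective, so that the model is pushed toward the annotated feature-label distributions on unlabeled text.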

Acknowledgment

This work is supported by the National Natural Science Foundation of China (60973076, 61073127), the Research Fund for the Doctoral Program of Higher Education of China (20102302120053), and the Fundamental Research Funds for the Central Universities (Grant No. HIT.NSRIF.2010045).

Author information

Corresponding author

Correspondence to Lin Yao.

About this article

Cite this article

Yao, L., Sun, C., Wu, Y. et al. Biomedical named entity recognition using generalized expectation criteria. Int. J. Mach. Learn. & Cyber. 2, 235–243 (2011). https://doi.org/10.1007/s13042-011-0022-3

  • DOI: https://doi.org/10.1007/s13042-011-0022-3
