Simultaneous feature and parameter selection using multiobjective optimization: application to named entity recognition

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

In this paper, we propose an efficient algorithm based on multiobjective optimization (MOO) for performing feature selection and parameter optimization for any machine learning technique. Feature and parameter combinations have a significant effect on the accuracy of a classifier. We perform feature selection and parameter optimization for four different classifiers, namely conditional random field, support vector machine, memory-based learner and maximum entropy. The proposed algorithms are evaluated on named entity recognition, an important component of many text processing applications. We currently experiment with four languages, namely Bengali, Hindi, Telugu and English. First, the proposed MOO-based technique is used to determine the appropriate features and parameters. For each classifier, the algorithm produces a set of solutions on the final Pareto optimal front. Each solution represents a classifier with a particular feature and parameter combination. All these solutions are then combined using a MOO-based classifier ensemble technique. Evaluation results show that the proposed approach attains F-measure (harmonic mean of recall and precision) values of 90.48, 90.44, 78.71 and 88.68 % for Bengali, Hindi, Telugu and English, respectively. We also show that, in all experimental settings, the proposed feature and parameter optimization technique performs reasonably better than the baseline systems developed with random feature subsets. Comparisons with existing work also show the efficacy of the proposed algorithm.
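
As a minimal illustration of two notions used in the abstract, the sketch below computes the F-measure as the harmonic mean of recall and precision and filters a set of candidate solutions down to its Pareto optimal (non-dominated) front. It assumes, for illustration only, that each solution is scored by two objectives to be maximized, recall and precision; the function names and the candidate scores are hypothetical and are not taken from the article, and this is not the authors' implementation.

```python
from typing import List, Tuple

# Assumption: each candidate is scored as (recall, precision), both maximized.
Solution = Tuple[float, float]


def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical scores for four feature/parameter combinations.
candidates = [(0.80, 0.90), (0.85, 0.85), (0.70, 0.80), (0.90, 0.70)]
front = pareto_front(candidates)
```

Here the candidate (0.70, 0.80) is dropped because (0.80, 0.90) is better on both objectives; the remaining three trade recall against precision, so none of them dominates another.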

Notes

  1. http://ltrc.iiit.ac.in/ner-ssea-08.

  2. http://crfpp.sourceforge.net.

  3. http://chasen.org/~taku/software/yamcha/.

  4. http://maxent.sourceforge.net/.

  5. http://ltrc.iiit.ac.in/ner-ssea-08.

  6. http://www.eci.gov.in/DevForum/Fullname.asp.

Author information

Corresponding author

Correspondence to Asif Ekbal.

About this article

Cite this article

Ekbal, A., Saha, S. Simultaneous feature and parameter selection using multiobjective optimization: application to named entity recognition. Int. J. Mach. Learn. & Cyber. 7, 597–611 (2016). https://doi.org/10.1007/s13042-014-0268-7

  • DOI: https://doi.org/10.1007/s13042-014-0268-7